| Paper | Links | 
|---|---|
| 1) Toolformer: Language Models Can Teach Themselves to Use Tools - Toolformer - introduces language models that teach themselves to use external tools via simple API calls. | Paper, Tweet | 
| 2) Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents - Describe, Explain, Plan, and Select - proposes using language models for open-world game playing. | Paper, Tweet | 
| 3) A Categorical Archive of ChatGPT Failures - A Categorical Archive of ChatGPT Failures - a comprehensive analysis of ChatGPT failures for categories like reasoning, factual errors, maths, and coding. | Paper, Tweet | 
| 4) Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery - Hard Prompts Made Easy - optimizing hard text prompts through efficient gradient-based optimization. | Paper, Tweet | 
| 5) Data Selection for Language Models via Importance Resampling - Data Selection for LMs - proposes a cheap and scalable data selection framework based on an importance resampling algorithm to improve the downstream performance of LMs. | Paper, Tweet | 
| 6) Structure and Content-Guided Video Synthesis with Diffusion Models - Gen-1 - proposes an approach for structure and content-guided video synthesis with diffusion models. | Paper, Project, Tweet | 
| 7) A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity - Multitask, Multilingual, Multimodal Evaluation of ChatGPT - performs a more rigorous evaluation of ChatGPT on reasoning, hallucination, and interactivity. | Paper, Tweet | 
| 8) Noise2Music: Text-conditioned Music Generation with Diffusion Models - Noise2Music - proposes diffusion models to generate high-quality 30-second music clips via text prompts. | Paper, Project, Tweet | 
| 9) Offsite-Tuning: Transfer Learning without Full Model - Offsite-Tuning - introduces an efficient, privacy-preserving transfer learning framework to adapt foundational models to downstream data without access to the full model. | Paper, Project, Tweet | 
| 10) Zero-shot Image-to-Image Translation - pix2pix-zero - proposes a model for zero-shot image-to-image translation. | Paper, Project, Tweet | 
| Paper | Links | 
|---|---|
| 1) REPLUG: Retrieval-Augmented Black-Box Language Models - REPLUG - a retrieval-augmented LM framework that adapts a retriever to a large-scale, black-box LM like GPT-3. | Paper, Tweet | 
| 2) Extracting Training Data from Diffusion Models - Extracting Training Data from Diffusion Models - shows that diffusion-based generative models can memorize images from the training data and emit them at generation time. | Paper, Tweet | 
| 3) The Flan Collection: Designing Data and Methods for Effective Instruction Tuning - The Flan Collection - releases a more extensive, publicly available collection of tasks, templates, and methods to advance instruction-tuned models. | Paper, Tweet | 
| 4) Multimodal Chain-of-Thought Reasoning in Language Models - Multimodal Chain-of-Thought Reasoning - incorporates vision features to elicit chain-of-thought reasoning in multimodality, enabling the model to generate effective rationales that contribute to answer inference. | Paper, Code, Tweet | 
| 5) Dreamix: Video Diffusion Models are General Video Editors - Dreamix - a diffusion model that performs text-based motion and appearance editing of general videos. | Paper, Project, Tweet | 
| 6) Benchmarking Large Language Models for News Summarization - Benchmarking LLMs for news summarization. | Paper, Tweet | 
| 7) Mathematical Capabilities of ChatGPT - Mathematical Capabilities of ChatGPT - investigates the mathematical capabilities of ChatGPT on a new holistic benchmark called GHOSTS. | Paper, Tweet | 
| 8) Emergence of Maps in the Memories of Blind Navigation Agents - Training ‘Blind’ Agents - trains an AI agent to navigate purely by feeling its way around; no use of vision, audio, or any other sensing (as in animals). | Paper, Project, Tweet | 
| 9) SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections - SceneDreamer - a generative model that synthesizes large-scale 3D landscapes from random noise. | Paper, Tweet | 
| 10) Large Language Models Can Be Easily Distracted by Irrelevant Context - LLMs and irrelevant context - finds that many prompting techniques fail when presented with irrelevant context for arithmetic reasoning. | Paper, Tweet | 
| Paper | Links | 
|---|---|
| 1) MusicLM: Generating Music From Text - MusicLM - a generative model for generating high-fidelity music from text descriptions. | Paper, Tweet | 
| 2) Hungry Hungry Hippos: Towards Language Modeling with State Space Models - H3 - an approach to reduce the gap, in terms of performance and hardware utilization, between state space models and attention for language modeling. | Paper, Tweet | 
| 3) A Watermark for Large Language Models - A Watermark for LLMs - a watermarking framework for proprietary language models. | Paper, Tweet | 
| 4) Text-To-4D Dynamic Scene Generation - Make-A-Video3D - a new text-to-4D model for dynamic scene generation from input text. | Paper, Github, Tweet | 
| 5) ClimaX: A foundation model for weather and climate - ClimaX - a foundation model for weather and climate, including many capabilities for atmospheric science tasks. | Paper, Tweet, Blog | 
| 6) Open Problems in Applied Deep Learning - If you're looking for interesting open problems in DL, this is a good reference. Not sure if intentional but it also looks useful to get a general picture of current trends in deep learning with ~300 references. | Paper, Tweet | 
| 7) DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature - DetectGPT - an approach for zero-shot machine-generated text detection. Uses raw log probabilities from the LLM to determine if the passage was sampled from it. | Paper, Tweet | 
| 8) StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis - StyleGAN-T - a new model that aims to regain the competitiveness of GANs for fast large-scale text-to-image synthesis. | Paper, Project, Code, Tweet | 
| 9) Large language models generate functional protein sequences across diverse families - ProGen - an LLM that can generate protein sequences with a predictable function across large protein families. | Paper, Tweet | 
| 10) The Impossibility of Parallelizing Boosting - The Impossibility of Parallelizing Boosting - investigates the possibility of parallelizing boosting. | Paper, Tweet | 
We ❤️ reading ML papers so we've created this repo to highlight the top ML papers of every week.
📣 You can follow us on Twitter or subscribe to get the list of top ML papers in your inbox.
| Paper | Links | 
|---|---|
| 1) Google AI Research Recap (2022 Edition) - an excellent summary of some notable research Google AI did in 2022. | Blog, Tweet | 
| 2) Dissociating language and thought in large language models: a cognitive perspective - a review paper on the capabilities of LLMs from a cognitive science perspective. | Paper, Tweet | 
| 3) Human-Timescale Adaptation in an Open-Ended Task Space - an agent trained at scale that leads to a general in-context learning algorithm able to adapt to open-ended embodied 3D problems. | Paper, Tweet | 
| 4) AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation - an approach to help provide explanations of generative transformer models through memory-efficient attention manipulation. | Paper, Tweet | 
| 5) Everything is Connected: Graph Neural Networks - short overview of key concepts in graph representation learning. | Paper, Tweet | 
| 6) GLIGEN: Open-Set Grounded Text-to-Image Generation - an approach that extends the functionality of existing pre-trained text-to-image diffusion models by enabling conditioning on grounding inputs. | Paper, Tweet, Project | 
| 7) InstructPix2Pix: Learning to Follow Image Editing Instructions - proposes a method with the capability of editing images from human instructions. | Paper, Tweet | 
| 8) Dataset Distillation: A Comprehensive Review | Paper, Tweet | 
| 9) Learning-Rate-Free Learning by D-Adaptation - a new method for automatically adjusting the learning rate during training, applicable to more than a dozen diverse ML problems. | Paper, Tweet | 
| 10) RecolorNeRF: Layer Decomposed Radiance Field for Efficient Color Editing of 3D Scenes - a user-friendly color editing approach for the neural radiance field to achieve a more efficient view-consistent recoloring. | Paper, Tweet | 
| Paper | Links | 
|---|---|
| 1) Mastering Diverse Domains through World Models - a general algorithm to collect diamonds in Minecraft from scratch without human data or curricula, a long-standing challenge in AI. | Paper, Tweet | 
| 2) Tracr: Compiled Transformers as a Laboratory for Interpretability - a compiler for converting RASP programs into transformer weights. This way of constructing NNs weights enables the development and evaluation of new interpretability tools. | Paper, Tweet, Code | 
| 3) Multimodal Deep Learning - a new book on multimodal deep learning, published on arXiv. | Book, Tweet | 
| 4) Forecasting Potential Misuses of Language Models for Disinformation Campaigns—and How to Reduce Risk - new work analyzing how generative LMs could potentially be misused for disinformation and how to mitigate these types of risks. | Paper, Tweet | 
| 5) Why do Nearest Neighbor Language Models Work? - empirically identifies reasons why retrieval-augmented LMs (specifically k-nearest neighbor LMs) perform better than standard parametric LMs. | Paper, Code, Tweet | 
| 6) Memory Augmented Large Language Models are Computationally Universal - investigates the use of existing LMs (e.g., Flan-U-PaLM 540B) combined with associative read-write memory to simulate the execution of a universal Turing machine. | Paper, Tweet | 
| 7) A Survey on Transformers in Reinforcement Learning - transformers for RL will be a fascinating research area to track. The same is true for the reverse direction (RL for Transformers)... a notable example: using RLHF to improve LLMs (e.g., ChatGPT). | Paper, Tweet | 
| 8) Scaling Laws for Generative Mixed-Modal Language Models - introduces scaling laws for generative mixed-modal language models. | Paper, Tweet | 
| 9) DeepMatcher: A Deep Transformer-based Network for Robust and Accurate Local Feature Matching - a transformer-based network showing robust local feature matching, outperforming the state-of-the-art methods on several benchmarks. | Paper, Tweet | 
| 10) Generative Time Series Forecasting with Diffusion, Denoise, and Disentanglement - addresses the time series forecasting problem with generative modeling; involves a bidirectional VAE backbone equipped with diffusion, denoising for prediction accuracy, and disentanglement for model interpretability. | Paper, Tweet | 
| Paper | Links | 
|---|---|
| 1) Muse: Text-To-Image Generation via Masked Generative Transformers - introduces Muse, a new text-to-image generation model based on masked generative transformers; significantly more efficient than diffusion models like Imagen and DALL-E 2. | Paper, Project, Code, Tweet | 
| 2) VALL-E Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers - introduces VALL-E, a text-to-audio model that achieves state-of-the-art zero-shot performance; the text-to-speech synthesis task is treated as a conditional language modeling task. | Project, Tweet | 
| 3) Rethinking with Retrieval: Faithful Large Language Model Inference - shows the potential of enhancing LLMs by retrieving relevant external knowledge based on decomposed reasoning steps obtained through chain-of-thought prompting. | Paper, Tweet | 
| 4) SparseGPT: Massive Language Models Can Be Accurately Pruned In One-Shot - presents a technique for compressing large language models while not sacrificing performance; "pruned to at least 50% sparsity in one-shot, without any retraining." | Paper, Tweet | 
| 5) ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders - a performant model based on a fully convolutional masked autoencoder framework and other architectural improvements. CNNs are striking back! | Paper, Code, Tweet | 
| 6) Large Language Models as Corporate Lobbyists - with more capabilities, we are starting to see a wider range of applications with LLMs. This paper utilized large language models for conducting corporate lobbying activities. | Paper, Code, Tweet | 
| 7) Superposition, Memorization, and Double Descent - aims to better understand how deep learning models overfit or memorize examples; interesting phenomena observed; important work toward a mechanistic theory of memorization. | Paper, Tweet | 
| 8) StitchNet: Composing Neural Networks from Pre-Trained Fragments - new idea to create new coherent neural networks by reusing pretrained fragments of existing NNs. Not straightforward but there is potential in terms of efficiently reusing learned knowledge in pre-trained networks for complex tasks. | Paper, Tweet | 
| 9) Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes - proposes iterated decomposition, an approach to improve Science Q&A through a human-in-the-loop workflow for refining compositional LM programs. | Paper, Code, Tweet | 
| 10) A Succinct Summary of Reinforcement Learning - a nice overview of some important ideas in RL. | Paper, Tweet | 
We use a combination of AI-powered tools, analytics, and human curation to build the lists of papers.
Subscribe to our NLP Newsletter to stay on top of ML research and trends.
Join our Discord.