ML Papers of The Week

We ❤️ reading ML papers so we've created this repo to highlight the top ML papers of every week.

📣 You can follow us on Twitter or subscribe to get the list of top ML papers in your inbox.

Top ML Papers of the Week (Feb 13 - 19)

Paper Links
1) Symbolic Discovery of Optimization Algorithms - Lion (EvoLved Sign Momentum) - a simple and effective optimization algorithm that’s more memory-efficient than Adam. Paper, Tweet
2) Transformer models: an introduction and catalog - Transformer models: an introduction and catalog. Paper, Tweet
3) 3D-aware Conditional Image Synthesis - pix2pix3D - a 3D-aware conditional generative model extended with neural radiance fields for controllable photorealistic image synthesis. Paper, Project, Tweet
4) The Capacity for Moral Self-Correction in Large Language Models - Moral Self-Correction in Large Language Models - finds strong evidence that language models trained with RLHF have the capacity for moral self-correction. The capability emerges at 22B model parameters and typically improves with scale. Paper, Tweet
6) xxxx - Language Quantized AutoEncoders (LQAE) - an unsupervised method for text-image alignment that leverages pretrained language models; it enables few-shot image classification with LLMs. Paper, Project, Code, Tweet
7) Augmented Language Models: a Survey - Augmented Language Models - a survey of language models that are augmented with reasoning skills and the capability to use tools. Paper, Tweet
8) Geometric Clifford Algebra Networks - Geometric Clifford Algebra Networks (GCANs) - an approach to incorporate geometry-guided transformations into neural networks using geometric algebra. Paper, Tweet
9) Auditing large language models: a three-layered approach - Auditing large language models - proposes a policy framework for auditing LLMs. Paper, Tweet
10) Energy Transformer - Energy Transformer - a transformer architecture that replaces the sequence of feedforward transformer blocks with a single large Associative Memory model; this follows the popularity that Hopfield Networks have gained in the field of ML. Paper, Tweet

Top ML Papers of the Week (Feb 6 - 12)

Paper Links
1) Toolformer: Language Models Can Teach Themselves to Use Tools - Toolformer - introduces language models that teach themselves to use external tools via simple API calls. Paper, Tweet
2) Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents - Describe, Explain, Plan, and Select - proposes using language models for open-world game playing. Paper, Tweet
3) A Categorical Archive of ChatGPT Failures - A Categorical Archive of ChatGPT Failures - a comprehensive analysis of ChatGPT failures for categories like reasoning, factual errors, maths, and coding. Paper, Tweet
4) Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery - Hard Prompts Made Easy - optimizing hard text prompts through efficient gradient-based optimization. Paper, Tweet
5) Data Selection for Language Models via Importance Resampling - Data Selection for LMs - proposes a cheap and scalable data selection framework based on an importance resampling algorithm to improve the downstream performance of LMs. Paper, Tweet
6) Structure and Content-Guided Video Synthesis with Diffusion Models - Gen-1 - proposes an approach for structure and content-guided video synthesis with diffusion models. Paper, Project, Tweet
7) A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity - Multitask, Multilingual, Multimodal Evaluation of ChatGPT - performs a more rigorous evaluation of ChatGPT on reasoning, hallucination, and interactivity. Paper, Tweet
8) Noise2Music: Text-conditioned Music Generation with Diffusion Models - Noise2Music - proposes diffusion models to generate high-quality 30-second music clips via text prompts. Paper, Project, Tweet
9) Offsite-Tuning: Transfer Learning without Full Model - Offsite-Tuning - introduces an efficient, privacy-preserving transfer learning framework to adapt foundational models to downstream data without access to the full model. Paper, Project, Tweet
10) Zero-shot Image-to-Image Translation - pix2pix-zero - proposes a model for zero-shot image-to-image translation. Paper, Project, Tweet

Top ML Papers of the Week (Jan 30-Feb 5)

Paper Links
1) REPLUG: Retrieval-Augmented Black-Box Language Models - REPLUG - a retrieval-augmented LM framework that adapts a retriever to a large-scale, black-box LM like GPT-3. Paper, Tweet
2) Extracting Training Data from Diffusion Models - Extracting Training Data from Diffusion Models - shows that diffusion-based generative models can memorize images from the training data and emit them at generation time. Paper, Tweet
3) The Flan Collection: Designing Data and Methods for Effective Instruction Tuning - The FLAN Collection - releases a more extensive publicly available collection of tasks, templates, and methods to advance instruction-tuned models. Paper, Tweet
4) Multimodal Chain-of-Thought Reasoning in Language Models - Multimodal Chain-of-Thought Reasoning - incorporates vision features to elicit chain-of-thought reasoning in multimodality, enabling the model to generate effective rationales that contribute to answer inference. Paper, Code, Tweet
5) Dreamix: Video Diffusion Models are General Video Editors - Dreamix - a diffusion model that performs text-based motion and appearance editing of general videos. Paper, Project, Tweet
6) Benchmarking Large Language Models for News Summarization - Benchmarking LLMs for news summarization. Paper, Tweet
7) Mathematical Capabilities of ChatGPT - Mathematical Capabilities of ChatGPT - investigates the mathematical capabilities of ChatGPT on a new holistic benchmark called GHOSTS. Paper, Tweet
8) Emergence of Maps in the Memories of Blind Navigation Agents - Training ‘Blind’ Agents - trains an AI agent to navigate purely by feeling its way around; no use of vision, audio, or any other sensing (as in animals). Paper, Project, Tweet
9) SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections - SceneDreamer - a generative model that synthesizes large-scale 3D landscapes from random noise. Paper, Tweet
10) Large Language Models Can Be Easily Distracted by Irrelevant Context - LLMs and irrelevant context - finds that many prompting techniques fail when presented with irrelevant context for arithmetic reasoning. Paper, Tweet

Top ML Papers of the Week (Jan 23-29)

Paper Links
1) MusicLM: Generating Music From Text - MusicLM - a generative model for generating high-fidelity music from text descriptions. Paper, Tweet
2) Hungry Hungry Hippos: Towards Language Modeling with State Space Models - H3 - an approach to reduce the gap, in terms of performance and hardware utilization, between state space models and attention for language modeling. Paper, Tweet
3) A Watermark for Large Language Models - A Watermark for LLMs - a watermarking framework for proprietary language models. Paper, Tweet
4) Text-To-4D Dynamic Scene Generation - Make-A-Video3D - a new text-to-4D model for dynamic scene generation from input text. Paper, Github, Tweet
5) ClimaX: A foundation model for weather and climate - ClimaX - a foundation model for weather and climate, including many capabilities for atmospheric science tasks. Paper, Tweet, Blog
6) Open Problems in Applied Deep Learning - If you're looking for interesting open problems in DL, this is a good reference. Not sure if intentional but it also looks useful to get a general picture of current trends in deep learning with ~300 references. Paper, Tweet
7) DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature - DetectGPT - an approach for zero-shot machine-generated text detection. Uses raw log probabilities from the LLM to determine if the passage was sampled from it. Paper, Tweet
8) StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis - StyleGAN-T - a new model that aims to regain the competitiveness of GANs for fast large-scale text-to-image synthesis. Paper, Project, Code, Tweet
9) Large language models generate functional protein sequences across diverse families - ProGen - an LLM that can generate protein sequences with a predictable function across large protein families. Paper, Tweet
10) The Impossibility of Parallelizing Boosting - The Impossibility of Parallelizing Boosting - investigates the possibility of parallelizing boosting. Paper, Tweet

Top ML Papers of the Week (Jan 16-22)

Paper Links
1) Google AI Research Recap (2022 Edition) - an excellent summary of some notable research Google AI did in 2022. Blog, Tweet
2) Dissociating language and thought in large language models: a cognitive perspective - a review paper on the capabilities of LLMs from a cognitive science perspective. Paper, Tweet
3) Human-Timescale Adaptation in an Open-Ended Task Space - an agent trained at scale that leads to a general in-context learning algorithm able to adapt to open-ended embodied 3D problems. Paper, Tweet
4) AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation - an approach to help provide explanations of generative transformer models through memory-efficient attention manipulation. Paper, Tweet
5) Everything is Connected: Graph Neural Networks - short overview of key concepts in graph representation learning. Paper, Tweet
6) GLIGEN: Open-Set Grounded Text-to-Image Generation - an approach that extends the functionality of existing pre-trained text-to-image diffusion models by enabling conditioning on grounding inputs. Paper, Tweet, Project
7) InstructPix2Pix: Learning to Follow Image Editing Instructions - proposes a method with the capability of editing images from human instructions. Paper, Tweet
8) Dataset Distillation: A Comprehensive Review - a comprehensive review of dataset distillation. Paper, Tweet
9) Learning-Rate-Free Learning by D-Adaptation - a new method for automatically adjusting the learning rate during training, applicable to more than a dozen diverse ML problems. Paper, Tweet
10) RecolorNeRF: Layer Decomposed Radiance Field for Efficient Color Editing of 3D Scenes - a user-friendly color editing approach for the neural radiance field to achieve a more efficient view-consistent recoloring. Paper, Tweet

Top ML Papers of the Week (Jan 9-15)

Paper Links
1) Mastering Diverse Domains through World Models - a general algorithm to collect diamonds in Minecraft from scratch without human data or curricula, a long-standing challenge in AI. Paper, Tweet
2) Tracr: Compiled Transformers as a Laboratory for Interpretability - a compiler for converting RASP programs into transformer weights. This way of constructing NN weights enables the development and evaluation of new interpretability tools. Paper, Tweet, Code
3) Multimodal Deep Learning - multimodal deep learning is a new book published on arXiv. Book, Tweet
4) Forecasting Potential Misuses of Language Models for Disinformation Campaigns—and How to Reduce Risk - new work analyzing how generative LMs could potentially be misused for disinformation and how to mitigate these types of risks. Paper, Tweet
5) Why do Nearest Neighbor Language Models Work? - empirically identifies reasons why retrieval-augmented LMs (specifically k-nearest neighbor LMs) perform better than standard parametric LMs. Paper, Code, Tweet
6) Memory Augmented Large Language Models are Computationally Universal - investigates the use of existing LMs (e.g., Flan-U-PaLM 540B) combined with associative read-write memory to simulate the execution of a universal Turing machine. Paper, Tweet
7) A Survey on Transformers in Reinforcement Learning - transformers for RL will be a fascinating research area to track. The same is true for the reverse direction (RL for Transformers)... a notable example: using RLHF to improve LLMs (e.g., ChatGPT). Paper, Tweet
8) Scaling Laws for Generative Mixed-Modal Language Models - introduces scaling laws for generative mixed-modal language models. Paper, Tweet
9) DeepMatcher: A Deep Transformer-based Network for Robust and Accurate Local Feature Matching - a transformer-based network showing robust local feature matching, outperforming the state-of-the-art methods on several benchmarks. Paper, Tweet
10) Generative Time Series Forecasting with Diffusion, Denoise, and Disentanglement - addresses the time series forecasting problem with generative modeling; involves a bidirectional VAE backbone equipped with diffusion, denoising for prediction accuracy, and disentanglement for model interpretability. Paper, Tweet

Top ML Papers of the Week (Jan 1-8)

Paper Links
1) Muse: Text-To-Image Generation via Masked Generative Transformers - introduces Muse, a new text-to-image generation model based on masked generative transformers; significantly more efficient than diffusion models like Imagen and DALLE-2. Paper, Project, Code, Tweet
2) VALL-E Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers - introduces VALL-E, a text-to-audio model that achieves state-of-the-art zero-shot performance; the text-to-speech synthesis task is treated as a conditional language modeling task. Project, Tweet
3) Rethinking with Retrieval: Faithful Large Language Model Inference - shows the potential of enhancing LLMs by retrieving relevant external knowledge based on decomposed reasoning steps obtained through chain-of-thought prompting. Paper, Tweet
4) SparseGPT: Massive Language Models Can Be Accurately Pruned In One-Shot - presents a technique for compressing large language models while not sacrificing performance; "pruned to at least 50% sparsity in one-shot, without any retraining." Paper, Tweet
5) ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders - a performant model based on a fully convolutional masked autoencoder framework and other architectural improvements. CNNs are striking back! Paper, Code, Tweet
6) Large Language Models as Corporate Lobbyists - with more capabilities, we are starting to see a wider range of applications with LLMs. This paper utilized large language models for conducting corporate lobbying activities. Paper, Code, Tweet
7) Superposition, Memorization, and Double Descent - aims to better understand how deep learning models overfit or memorize examples; interesting phenomena observed; important work toward a mechanistic theory of memorization. Paper, Tweet
8) StitchNet: Composing Neural Networks from Pre-Trained Fragments - new idea to create new coherent neural networks by reusing pretrained fragments of existing NNs. Not straightforward but there is potential in terms of efficiently reusing learned knowledge in pre-trained networks for complex tasks. Paper, Tweet
9) Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes - proposes iterated decomposition, an approach to improve Science Q&A through a human-in-the-loop workflow for refining compositional LM programs. Paper, Code, Tweet
10) A Succinct Summary of Reinforcement Learning - a nice overview of some important ideas in RL. Paper, Tweet

We use a combination of AI-powered tools, analytics, and human curation to build the lists of papers.

Subscribe to our NLP Newsletter to stay on top of ML research and trends.

Join our Discord.