We ❤️ reading ML papers, so we created this repo to highlight the top ML papers of every week.
Paper | Links |
---|---|
1) Mastering Diverse Domains through World Models -- DreamerV3 is a general reinforcement learning algorithm that learns to collect diamonds in Minecraft from scratch, without human data or curricula, a long-standing challenge in AI. | Paper, Tweet |
2) Tracr: Compiled Transformers as a Laboratory for Interpretability -- DeepMind proposes Tracr, a compiler for converting RASP programs into transformer weights. This way of constructing NN weights enables the development and evaluation of new interpretability tools. | Paper, Tweet, Code |
3) Multimodal Deep Learning -- A new book on multimodal deep learning, published on arXiv. | Book, Tweet |
4) Forecasting Potential Misuses of Language Models for Disinformation Campaigns—and How to Reduce Risk -- OpenAI publishes new work analyzing how generative LMs could potentially be misused for disinformation and how to mitigate these types of risks. | Paper, Tweet |
5) Why do Nearest Neighbor Language Models Work? -- Empirically identifies reasons why retrieval-augmented LMs (specifically k-nearest neighbor LMs) perform better than standard parametric LMs; a minimal sketch of the kNN-LM interpolation appears after this table. | Paper, Code, Tweet |
6) Memory Augmented Large Language Models are Computationally Universal -- Investigates the use of existing LMs (e.g., Flan-U-PaLM 540B) combined with associative read-write memory to simulate the execution of a universal Turing machine; see the sketch after this table. | Paper, Tweet |
7) A Survey on Transformers in Reinforcement Learning -- Transformers for RL will be a fascinating research area to track. The same is true for the reverse direction (RL for Transformers)... a notable example: using RLHF to improve LLMs (e.g., ChatGPT). | Paper, Tweet |
8) Scaling Laws for Generative Mixed-Modal Language Models -- Introduces scaling laws for generative mixed-modal language models. | Paper, Tweet |
9) DeepMatcher: A Deep Transformer-based Network for Robust and Accurate Local Feature Matching -- DeepMatcher is a transformer-based network showing robust local feature matching, outperforming state-of-the-art methods on several benchmarks. | Paper, Tweet |
10) Generative Time Series Forecasting with Diffusion, Denoise, and Disentanglement -- Addresses time series forecasting with generative modeling: a bidirectional VAE backbone equipped with diffusion and denoising for prediction accuracy, and disentanglement for model interpretability. | Paper, Tweet |
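
To make item 5 above concrete, here is a minimal, hypothetical sketch of the kNN-LM interpolation the paper analyzes: the parametric LM's next-token distribution is mixed with a distribution built from the nearest neighbors of the current context representation in a datastore of (context vector, next token) pairs. All names and parameters below (`p_lm`, `keys`, `values`, `lam`, etc.) are illustrative assumptions; a real system would use a trained LM and an approximate-nearest-neighbor index such as FAISS.

```python
import numpy as np

def knn_lm_next_token_probs(p_lm, context_vec, keys, values,
                            vocab_size, k=4, lam=0.25, temp=1.0):
    """Mix the parametric LM distribution with a k-nearest-neighbor
    distribution built from (context vector -> next token) pairs."""
    # Distances from the current context representation to all datastore keys.
    dists = np.linalg.norm(keys - context_vec, axis=1)
    nearest = np.argsort(dists)[:k]
    # Turn (negative) distances of the retrieved neighbors into a distribution.
    weights = np.exp(-dists[nearest] / temp)
    weights /= weights.sum()
    # Scatter neighbor weights onto the next tokens recorded in the datastore.
    p_knn = np.zeros(vocab_size)
    for w, token_id in zip(weights, values[nearest]):
        p_knn[token_id] += w
    # Interpolate: lambda * kNN distribution + (1 - lambda) * parametric LM.
    return lam * p_knn + (1.0 - lam) * p_lm

# Toy usage: 3-token vocab, 5-entry datastore with random context vectors.
rng = np.random.default_rng(0)
keys = rng.normal(size=(5, 8))
values = np.array([0, 2, 1, 2, 0])
p = knn_lm_next_token_probs(np.full(3, 1 / 3), keys[1], keys, values, vocab_size=3)
print(p, p.sum())  # a valid distribution summing to 1
```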
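And for item 6, a rough, assumption-laden sketch of the construction's outer loop: the LM is repeatedly prompted with the current memory contents and replies with read/write commands, while an external dict provides the unbounded associative memory that the universality argument relies on. `scripted_lm` and the command vocabulary are stand-ins invented for illustration, not the paper's actual prompt protocol.

```python
# Stand-in for a real model such as Flan-U-PaLM: replay a fixed script
# of commands instead of calling an LM.
SCRIPT = iter(["WRITE head 1", "WRITE tape_0 A", "HALT"])

def scripted_lm(prompt):
    # A real system would send `prompt` to the LM; here we just replay.
    return next(SCRIPT)

def run(max_steps=16):
    memory = {}  # unbounded associative read-write memory
    for _ in range(max_steps):
        command = scripted_lm(f"memory: {memory}")
        op, *args = command.split()
        if op == "WRITE":        # store a value at a named address
            memory[args[0]] = args[1]
        elif op == "HALT":       # the LM decides when to stop
            break
    return memory

print(run())  # -> {'head': '1', 'tape_0': 'A'}
```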
Paper | Links |
---|---|
1) Muse: Text-To-Image Generation via Masked Generative Transformers -- GoogleAI introduces Muse, a new text-to-image generation model based on masked generative transformers; significantly more efficient than diffusion models such as Imagen and DALL-E 2. | Paper, Project, Code, Tweet |
2) VALL-E: Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers -- Microsoft introduces VALL-E, a model that achieves state-of-the-art zero-shot text-to-speech performance; the synthesis task is treated as conditional language modeling over neural codec tokens. | Project, Tweet |
3) Rethinking with Retrieval: Faithful Large Language Model Inference -- A new paper shows the potential of enhancing LLMs by retrieving relevant external knowledge based on decomposed reasoning steps obtained through chain-of-thought prompting. | Paper, Tweet |
4) SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot -- Presents a technique for compressing large language models without sacrificing performance; models are "pruned to at least 50% sparsity in one-shot, without any retraining" (a simplified sketch of one-shot pruning follows this table). | Paper, Tweet |
5) ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders -- ConvNeXt V2 is a performant model based on a fully convolutional masked autoencoder framework and other architectural improvements. CNNs are striking back! | Paper, Code, Tweet |
6) Large Language Models as Corporate Lobbyists -- With more capabilities, we are starting to see a wider range of applications of LLMs. This paper uses large language models to conduct corporate lobbying activities. | Paper, Code, Tweet |
7) Superposition, Memorization, and Double Descent -- Aims to better understand how deep learning models overfit or memorize examples; observes interesting phenomena and is an important step toward a mechanistic theory of memorization. | Paper, Tweet |
8) StitchNet: Composing Neural Networks from Pre-Trained Fragments -- An interesting idea for creating new coherent neural networks by reusing pretrained fragments of existing NNs. Not straightforward, but there is potential for efficiently reusing learned knowledge in pre-trained networks for complex tasks. | Paper, Tweet |
9) Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes -- Proposes iterated decomposition, an approach to improve Science Q&A through a human-in-the-loop workflow for refining compositional LM programs. | Paper, Code, Tweet |
10) A Succinct Summary of Reinforcement Learning -- A nice little overview of some important ideas in RL. | Paper, Tweet |
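
As a rough illustration of the one-shot setting in item 4, the sketch below zeroes 50% of each linear layer's weights by magnitude in a single pass, with no retraining. Note this is plain magnitude pruning, an assumption made for brevity; SparseGPT itself uses a more sophisticated Hessian-based weight reconstruction to preserve accuracy at this sparsity on 100B+ parameter models.

```python
import torch

@torch.no_grad()
def prune_one_shot(model, sparsity=0.5):
    """Zero the smallest-magnitude weights of every Linear layer, once."""
    for module in model.modules():
        if isinstance(module, torch.nn.Linear):
            w = module.weight
            k = int(w.numel() * sparsity)
            if k == 0:
                continue
            # k-th smallest absolute value = pruning threshold for this layer.
            threshold = w.abs().flatten().kthvalue(k).values
            # Keep only weights strictly above the threshold; no retraining.
            w.mul_((w.abs() > threshold).to(w.dtype))

# Usage: prune in place, then evaluate directly, without any fine-tuning.
model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.Linear(16, 4))
prune_one_shot(model)
print((model[0].weight == 0).float().mean())  # ~0.5 sparsity
```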
Subscribe to our newsletter to stay on top of ML research and trends.
We use a combination of AI-powered tools, analytics, and human curation to build the lists of papers.