@@ -28,14 +28,13 @@ Highlighting top ML papers of the week.
| **Paper** | **Link** |
| ------------- | :---: |
-| 1. GoogleAI introduces Muse, a new text-to-image generation model based on masked generative transformers; significantly more efficient than other diffusion models like Imagen and DALLE-2. | [Paper](https://arxiv.org/abs/2301.00704) |
-| 2. Microsoft introduces VALL-E, a text-to-audio model that performs state-of-the-art zero-shot performance; the text-to-speech synthesis task is treated as a conditional language modeling task: | https://valle-demo.github.io/ |
-| 3. A new paper shows the potential of enhancing LLMs by retrieving relevant external knowledge based on decomposed reasoning steps obtained through chain-of-thought prompting. | https://arxiv.org/abs/2301.00303 |
-| 4. Presents a technique for compressing large language models while not sacrificing performance; "pruned to at least 50% sparsity in one-shot, without any retraining." | https://arxiv.org/pdf/2301.00774.pdf |
-| Content Cell | Content Cell |
-| Content Cell | Content Cell |
-| Content Cell | Content Cell |
-| Content Cell | Content Cell |
-| Content Cell | Content Cell |
-| Content Cell | Content Cell |
-| Content Cell | Content Cell |
+| 1. Google AI introduces Muse, a new text-to-image generation model based on masked generative transformers; significantly more efficient than diffusion models like Imagen and DALL-E 2. | [Paper](https://arxiv.org/abs/2301.00704) [Project](https://muse-model.github.io/) |
+| 2. Microsoft introduces VALL-E, a text-to-audio model that achieves state-of-the-art zero-shot performance; text-to-speech synthesis is treated as a conditional language modeling task. | [Project](https://valle-demo.github.io/) |
+| 3. A new paper shows the potential of enhancing LLMs by retrieving relevant external knowledge based on decomposed reasoning steps obtained through chain-of-thought prompting. | [Paper](https://arxiv.org/abs/2301.00303) |
+| 4. Presents a technique for compressing large language models while not sacrificing performance; "pruned to at least 50% sparsity in one-shot, without any retraining." | [Paper](https://arxiv.org/pdf/2301.00774.pdf) |
+| 5. ConvNeXt V2 is a performant model based on a fully convolutional masked autoencoder framework and other architectural improvements. CNNs are striking back! | [Paper](https://arxiv.org/abs/2301.00808) |
+| 6. As LLMs become more capable, we are starting to see a wider range of applications. This paper uses large language models to conduct corporate lobbying activities. | [Paper](https://arxiv.org/abs/2301.01181) |
+| 7. This work aims to better understand how deep learning models overfit or memorize examples; it reports several interesting phenomena and is important work toward a mechanistic theory of memorization. | [Paper](https://transformer-circuits.pub/2023/toy-double-descent/index.html) |
+| 8. StitchNet: Interesting idea to create new coherent neural networks by reusing pretrained fragments of existing NNs. Not straightforward but there is potential in terms of efficiently reusing learned knowledge in pre-trained networks for complex tasks. | [Paper](https://arxiv.org/abs/2301.01947) |
+| 9. Proposes iterated decomposition, an approach to improve Science Q&A through a human-in-the-loop workflow for refining compositional LM programs. | [Paper](https://arxiv.org/abs/2301.01751) |
+| 10. A Succinct Summary of Reinforcement Learning. A nice little overview of some important ideas in RL. | [Paper](https://arxiv.org/abs/2301.01379) |