@@ -62,7 +62,7 @@
 | 2022-01 | Megatron-Turing NLG | Microsoft&NVIDIA | [Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model](https://arxiv.org/pdf/2201.11990.pdf) |
 | 2022-03 | InstructGPT | OpenAI | [Training language models to follow instructions with human feedback](https://arxiv.org/pdf/2203.02155.pdf) |
 | 2022-04 | PaLM | Google | [PaLM: Scaling Language Modeling with Pathways](https://arxiv.org/pdf/2204.02311.pdf) |
-| 2022-04 | Chinchilla | DeepMind | [An empirical analysis of compute-optimal large language model training](https://arxiv.org/abs/2408.00724) |
+| 2022-04 | Chinchilla | DeepMind | [Training Compute-Optimal Large Language Models](https://arxiv.org/abs/2203.15556) |
 | 2022-05 | OPT | Meta | [OPT: Open Pre-trained Transformer Language Models](https://arxiv.org/pdf/2205.01068.pdf) |
 | 2022-05 | UL2 | Google | [Unifying Language Learning Paradigms](https://arxiv.org/abs/2205.05131v1) |
 | 2022-06 | Emergent Abilities | Google | [Emergent Abilities of Large Language Models](https://openreview.net/pdf?id=yzkSU5zdwD) |