🔥 Large Language Models (LLMs) have taken the NLP community, and indeed the whole world, by storm. Here is a comprehensive list of papers about large language models, especially those relating to ChatGPT. It also covers code, courses, and related websites, as listed below:
Is ChatGPT a General-Purpose Natural Language Processing Task Solver? Link
Is ChatGPT A Good Translator? A Preliminary Study Link
DeepSpeed is an easy-to-use deep learning optimization software suite that enables unprecedented scale and speed for DL training and inference. Visit deepspeed.ai or the project's GitHub repo.
Colossal-AI provides a collection of parallel components. It aims to let you write distributed deep learning models the same way you write a model on your laptop, and it provides user-friendly tools to kickstart distributed training and inference in a few lines. You can visit it here.
[ICML 2022] Welcome to the "Big Model" Era: Techniques and Systems to Train and Serve Bigger Models Link
[NeurIPS 2022] Foundational Robustness of Foundation Models Link
[Andrej Karpathy] Let's build GPT: from scratch, in code, spelled out. Video|Code
[Stanford] CS224N-Lecture 11: Prompting, Instruction Finetuning, and RLHF Slides
[Stanford] CS324-Large Language Models Homepage
[Stanford] CS25-Transformers United V2 Homepage
[Mu Li (李沐)] HELM: A Comprehensive Evaluation of Language Models Bilibili
This is an active repository and your contributions are always welcome!
I will keep some pull requests open if I'm not sure whether they are awesome for LLMs; you can vote for them by adding a 👍.
If you have any questions about this opinionated list, do not hesitate to contact me at chengxin1998@stu.pku.edu.cn.