
* Add new readmes
* Move prompt-engineering notebook to folder
* Delete unnecessary samsum file

Suraj Subramanian · 11 months ago · commit b273a75a97

+ 3 - 3
recipes/inference/local_inference/README.md

@@ -61,7 +61,7 @@ python inference.py --model_name <training_config.output_dir> --peft_model <trai
 
 
 ```
 
 
-## Loading back FSDP checkpoints
+## Inference with FSDP checkpoints
 
 
 In case you have fine-tuned your model with pure FSDP and saved the checkpoints with "SHARDED_STATE_DICT" as shown [here](../../../src/llama_recipes/configs/fsdp.py), you can use this converter script to convert the FSDP Sharded checkpoints into HuggingFace checkpoints. This enables you to use the inference script normally as mentioned above.
 **To convert the checkpoint, use the following command**:
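
A minimal sketch of what this conversion step might look like, assuming a converter module named `checkpoint_converter_fsdp_hf` under `llama_recipes.inference` (an assumption based on the package layout, not confirmed by this diff):

```bash
# Hypothetical invocation; verify the module path in your checkout.
python -m llama_recipes.inference.checkpoint_converter_fsdp_hf \
    --fsdp_checkpoint_path PATH/to/FSDP/sharded/checkpoints \
    --consolidated_model_path PATH/to/save/HF/checkpoints \
    --HF_model_path_or_name PATH/or/HF/model_name
```

The consolidated output can then be passed as `--model_name` to `inference.py` as shown above.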
@@ -82,6 +82,6 @@ By default, training parameters are saved in `train_params.yaml` in the path wher
 Then run inference using:
 
 
 ```bash
-python inference.py --model_name <training_config.output_dir> --prompt_file <test_prompt_file> 
+python inference.py --model_name <training_config.output_dir> --prompt_file <test_prompt_file>
 
 
-```
+```

+ 28 - 0
recipes/quickstart/README.md

@@ -0,0 +1,28 @@
+## Llama-Recipes Quickstart
+
+If you are new to developing with Meta Llama models, this is where you should start. This folder contains introductory-level notebooks across different techniques relating to Meta Llama.
+
+* The [Running Llama 3 Anywhere](./Running_Llama3_Anywhere/) notebooks demonstrate how to run Llama inference across Linux, Mac and Windows platforms using the appropriate tooling.
+* The [Prompt Engineering with Llama 3](./prompt_engineering/Prompt_Engineering_with_Llama_3.ipynb) notebook showcases the various ways to elicit appropriate outputs from Llama. Take this notebook for a spin to get a feel for how Llama responds to different inputs and generation parameters.
+* The [inference](./inference/) folder contains scripts to deploy Llama for inference on server and mobile. See also [3p_integrations/vllm](../3p_integrations/vllm/) and [3p_integrations/tgi](../3p_integrations/tgi/) for hosting Llama on open-source model servers.
+* The [RAG](./RAG/) folder contains a simple Retrieval-Augmented Generation application using Llama 3.
+* The [finetuning](./finetuning/) folder contains resources to help you finetune Llama 3 on your custom datasets, for both single- and multi-GPU setups. The scripts use the native llama-recipes finetuning code found in [finetuning.py](../../src/llama_recipes/finetuning.py), which supports the features listed below (a sketch of a typical invocation follows the table).
+
+| Feature                                        | Supported |
+| ---------------------------------------------- | --------- |
+| HF support for finetuning                      | ✅ |
+| Deferred initialization (meta init)            | ✅ |
+| HF support for inference                       | ✅ |
+| Low CPU mode for multi-GPU                     | ✅ |
+| Mixed precision                                | ✅ |
+| Single-node quantization                       | ✅ |
+| Flash attention                                | ✅ |
+| PEFT                                           | ✅ |
+| Activation checkpointing (FSDP)                | ✅ |
+| Hybrid Sharded Data Parallel (HSDP)            | ✅ |
+| Dataset packing & padding                      | ✅ |
+| BF16 optimizer (pure BF16)                     | ✅ |
+| Gradient accumulation                          | ✅ |
+| CPU offloading                                 | ✅ |
+| FSDP checkpoint conversion to HF for inference | ✅ |
+| W&B experiment tracker                         | ✅ |
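
As a rough sketch of a run that exercises several of these features (the flag names and model id are assumptions modeled on common llama-recipes usage, not taken from this diff), a single-GPU PEFT finetuning invocation might look like:

```bash
# A sketch, not the canonical command; flags, model name, and paths are assumptions.
python -m llama_recipes.finetuning \
    --use_peft --peft_method lora --quantization \
    --model_name meta-llama/Meta-Llama-3-8B \
    --output_dir PATH/to/save/PEFT/model
```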

+ 7 - 0
recipes/quickstart/inference/README.md

@@ -0,0 +1,7 @@
+## Quickstart > Inference
+
+This folder contains scripts to get you started with inference on Meta Llama models.
+
+* [code_llama](./code_llama/) contains scripts for tasks relating to code generation using CodeLlama
+* [local_inference](./local_inference/) contains scripts for memory-efficient inference on servers and local machines
+* [mobile_inference](./mobile_inference/) has scripts using MLC to serve Llama on Android (h/t to OctoAI for the contribution!)

recipes/quickstart/Prompt_Engineering_with_Llama_3.ipynb → recipes/quickstart/prompt_engineering/Prompt_Engineering_with_Llama_3.ipynb