
Fixing broken file paths

Pia Papanna, 9 months ago
parent commit 8dc0578702
35 files changed, with 25 additions and 36 deletions
  1. README.md (+1, -6)
  2. recipes/3p_integration/README.md (+0, -0)
  3. recipes/3p_integration/hf_text_generation_inference/README.md (+0, -0)
  4. recipes/3p_integration/hf_text_generation_inference/merge_lora_weights.py (+0, -0)
  5. recipes/3p_integration/lamini/text2sql_memory_tuning/README.md (+0, -0)
  6. recipes/3p_integration/lamini/text2sql_memory_tuning/assets/manual_filtering.png (+0, -0)
  7. recipes/3p_integration/lamini/text2sql_memory_tuning/assets/website.png (+0, -0)
  8. recipes/3p_integration/lamini/text2sql_memory_tuning/data/gold-test-set-v2.jsonl (+0, -0)
  9. recipes/3p_integration/lamini/text2sql_memory_tuning/data/gold-test-set.jsonl (+0, -0)
  10. recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/archive/generated_queries_large_filtered_cleaned.jsonl (+0, -0)
  11. recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/archive/generated_queries_v2_large_filtered_cleaned.jsonl (+0, -0)
  12. recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/generated_queries.jsonl (+0, -0)
  13. recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/generated_queries_large.jsonl (+0, -0)
  14. recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/generated_queries_large_filtered.jsonl (+0, -0)
  15. recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/generated_queries_v2.jsonl (+0, -0)
  16. recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/generated_queries_v2_large.jsonl (+0, -0)
  17. recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/generated_queries_v2_large_filtered.jsonl (+0, -0)
  18. recipes/3p_integration/lamini/text2sql_memory_tuning/meta-lamini.ipynb (+0, -0)
  19. recipes/3p_integration/lamini/text2sql_memory_tuning/nba_roster.db (+0, -0)
  20. recipes/3p_integration/lamini/text2sql_memory_tuning/util/get_default_finetune_args.py (+0, -0)
  21. recipes/3p_integration/lamini/text2sql_memory_tuning/util/get_rubric.py (+0, -0)
  22. recipes/3p_integration/lamini/text2sql_memory_tuning/util/get_schema.py (+0, -0)
  23. recipes/3p_integration/lamini/text2sql_memory_tuning/util/load_dataset.py (+0, -0)
  24. recipes/3p_integration/lamini/text2sql_memory_tuning/util/make_llama_3_prompt.py (+0, -0)
  25. recipes/3p_integration/lamini/text2sql_memory_tuning/util/parse_arguments.py (+0, -0)
  26. recipes/3p_integration/lamini/text2sql_memory_tuning/util/setup_logging.py (+0, -0)
  27. recipes/3p_integration/llama-on-prem.md (+0, -0)
  28. recipes/3p_integration/vllm/inference.py (+0, -0)
  29. recipes/README.md (+1, -6)
  30. recipes/quickstart/finetuning/README.md (+7, -7)
  31. recipes/quickstart/finetuning/datasets/README.md (+4, -4)
  32. recipes/quickstart/finetuning/multigpu_finetuning.md (+4, -5)
  33. recipes/quickstart/finetuning/singlegpu_finetuning.md (+5, -5)
  34. recipes/quickstart/inference/local_inference/README.md (+1, -1)
  35. recipes/quickstart/inference/mobile_inference/android_inference/README.md (+2, -2)
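
In summary (see the diffs below), the `recipes/3p_integrations` tree is renamed to `recipes/3p_integration`, and relative links in READMEs that now sit one directory deeper (for example under `recipes/quickstart/finetuning/`) gain an extra `../` so they still reach `src/llama_recipes/` and `docs/`. Purely as an illustration, and not part of this commit, a small script along the following lines could flag relative Markdown links that no longer resolve after such a move; the link regex and the idea of scanning every `*.md` file are assumptions for the sketch:

```python
import re
from pathlib import Path

# Illustrative sketch only -- not part of this commit. It lists relative
# Markdown links whose targets do not exist, which is how links such as
# ../../src/llama_recipes/finetuning.py break once a file moves deeper.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]+)\)")  # captures the link target

def broken_relative_links(repo_root="."):
    for md_file in Path(repo_root).rglob("*.md"):
        text = md_file.read_text(encoding="utf-8", errors="ignore")
        for target in LINK_RE.findall(text):
            if target.startswith(("http://", "https://", "mailto:")):
                continue  # only in-repo relative links are of interest
            if not (md_file.parent / target).exists():
                yield md_file, target

if __name__ == "__main__":
    for md_file, target in broken_relative_links():
        print(f"{md_file}: unresolved link -> {target}")
```

Run from the repository root, such a check would surface exactly the `../../` vs `../../../` mismatches this commit corrects.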

README.md (+1, -6)

@@ -136,14 +136,9 @@ Contains examples are organized in folders by topic:
 | Subfolder | Description |
 |---|---|
 [quickstart](./recipes/quickstart) | The "Hello World" of using Llama, start here if you are new to using Llama.
-[finetuning](./recipes/finetuning)|Scripts to finetune Llama on single-GPU and multi-GPU setups
-[inference](./recipes/inference)|Scripts to deploy Llama for inference locally and using model servers
 [use_cases](./recipes/use_cases)|Scripts showing common applications of Meta Llama3
+[3p_integration](./recipes/3p_integration)|Partner owned folder showing common applications of Meta Llama3
 [responsible_ai](./recipes/responsible_ai)|Scripts to use PurpleLlama for safeguarding model outputs
-[llama_api_providers](./recipes/llama_api_providers)|Scripts to run inference on Llama via hosted endpoints
-[benchmarks](./recipes/benchmarks)|Scripts to benchmark Llama models inference on various backends
-[code_llama](./recipes/code_llama)|Scripts to run inference with the Code Llama models
-[evaluation](./recipes/evaluation)|Scripts to evaluate fine-tuned Llama models using `lm-evaluation-harness` from `EleutherAI`
 
 ### `src/`
 

recipes/3p_integrations/README.md → recipes/3p_integration/README.md


recipes/3p_integrations/hf_text_generation_inference/README.md → recipes/3p_integration/hf_text_generation_inference/README.md


recipes/3p_integrations/hf_text_generation_inference/merge_lora_weights.py → recipes/3p_integration/hf_text_generation_inference/merge_lora_weights.py


recipes/3p_integrations/lamini/text2sql_memory_tuning/README.md → recipes/3p_integration/lamini/text2sql_memory_tuning/README.md


recipes/3p_integrations/lamini/text2sql_memory_tuning/assets/manual_filtering.png → recipes/3p_integration/lamini/text2sql_memory_tuning/assets/manual_filtering.png


recipes/3p_integrations/lamini/text2sql_memory_tuning/assets/website.png → recipes/3p_integration/lamini/text2sql_memory_tuning/assets/website.png


recipes/3p_integrations/lamini/text2sql_memory_tuning/data/gold-test-set-v2.jsonl → recipes/3p_integration/lamini/text2sql_memory_tuning/data/gold-test-set-v2.jsonl


recipes/3p_integrations/lamini/text2sql_memory_tuning/data/gold-test-set.jsonl → recipes/3p_integration/lamini/text2sql_memory_tuning/data/gold-test-set.jsonl


recipes/3p_integrations/lamini/text2sql_memory_tuning/data/training_data/archive/generated_queries_large_filtered_cleaned.jsonl → recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/archive/generated_queries_large_filtered_cleaned.jsonl


recipes/3p_integrations/lamini/text2sql_memory_tuning/data/training_data/archive/generated_queries_v2_large_filtered_cleaned.jsonl → recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/archive/generated_queries_v2_large_filtered_cleaned.jsonl


recipes/3p_integrations/lamini/text2sql_memory_tuning/data/training_data/generated_queries.jsonl → recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/generated_queries.jsonl


recipes/3p_integrations/lamini/text2sql_memory_tuning/data/training_data/generated_queries_large.jsonl → recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/generated_queries_large.jsonl


recipes/3p_integrations/lamini/text2sql_memory_tuning/data/training_data/generated_queries_large_filtered.jsonl → recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/generated_queries_large_filtered.jsonl


recipes/3p_integrations/lamini/text2sql_memory_tuning/data/training_data/generated_queries_v2.jsonl → recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/generated_queries_v2.jsonl


recipes/3p_integrations/lamini/text2sql_memory_tuning/data/training_data/generated_queries_v2_large.jsonl → recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/generated_queries_v2_large.jsonl


recipes/3p_integrations/lamini/text2sql_memory_tuning/data/training_data/generated_queries_v2_large_filtered.jsonl → recipes/3p_integration/lamini/text2sql_memory_tuning/data/training_data/generated_queries_v2_large_filtered.jsonl


recipes/3p_integrations/lamini/text2sql_memory_tuning/meta-lamini.ipynb → recipes/3p_integration/lamini/text2sql_memory_tuning/meta-lamini.ipynb


recipes/3p_integrations/lamini/text2sql_memory_tuning/nba_roster.db → recipes/3p_integration/lamini/text2sql_memory_tuning/nba_roster.db


recipes/3p_integrations/lamini/text2sql_memory_tuning/util/get_default_finetune_args.py → recipes/3p_integration/lamini/text2sql_memory_tuning/util/get_default_finetune_args.py


recipes/3p_integrations/lamini/text2sql_memory_tuning/util/get_rubric.py → recipes/3p_integration/lamini/text2sql_memory_tuning/util/get_rubric.py


recipes/3p_integrations/lamini/text2sql_memory_tuning/util/get_schema.py → recipes/3p_integration/lamini/text2sql_memory_tuning/util/get_schema.py


recipes/3p_integrations/lamini/text2sql_memory_tuning/util/load_dataset.py → recipes/3p_integration/lamini/text2sql_memory_tuning/util/load_dataset.py


recipes/3p_integrations/lamini/text2sql_memory_tuning/util/make_llama_3_prompt.py → recipes/3p_integration/lamini/text2sql_memory_tuning/util/make_llama_3_prompt.py


recipes/3p_integrations/lamini/text2sql_memory_tuning/util/parse_arguments.py → recipes/3p_integration/lamini/text2sql_memory_tuning/util/parse_arguments.py


recipes/3p_integrations/lamini/text2sql_memory_tuning/util/setup_logging.py → recipes/3p_integration/lamini/text2sql_memory_tuning/util/setup_logging.py


recipes/3p_integrations/llama-on-prem.md → recipes/3p_integration/llama-on-prem.md


recipes/3p_integrations/vllm/inference.py → recipes/3p_integration/vllm/inference.py


recipes/README.md (+1, -6)

@@ -3,11 +3,6 @@ This folder contains examples organized by topic:
 | Subfolder | Description |
 |---|---|
 [quickstart](./quickstart)|The "Hello World" of using Llama 3, start here if you are new to using Llama 3
-[multilingual](./multilingual)|Scripts to add a new language to Llama
-[finetuning](./quickstart/finetuning)|Scripts to finetune Llama 3 on single-GPU and multi-GPU setups
-[inference](./quickstart/inference)|Scripts to deploy Llama 3 for inference [locally](./quickstart/inference/local_inference/), on mobile [Android](./quickstart/inference/mobile_inference/android_inference/) and using [model servers](./quickstart/inference/mobile_inference/)
 [use_cases](./use_cases)|Scripts showing common applications of Llama 3
+[3p_integration](./3p_integration)|Partner owned folder showing common applications of Meta Llama3
 [responsible_ai](./responsible_ai)|Scripts to use PurpleLlama for safeguarding model outputs
-[llama_api_providers](./llama_api_providers)|Scripts to run inference on Llama via hosted endpoints
-[benchmarks](./benchmarks)|Scripts to benchmark Llama 3 models inference on various backends
-[code_llama](./code_llama)|Scripts to run inference with the Code Llama models

recipes/quickstart/finetuning/README.md (+7, -7)

@@ -6,7 +6,7 @@ This folder contains instructions to fine-tune Meta Llama 3 on a
 * [single-GPU setup](./singlegpu_finetuning.md)
 * [multi-GPU setup](./multigpu_finetuning.md)
 
-using the canonical [finetuning script](../../src/llama_recipes/finetuning.py) in the llama-recipes package.
+using the canonical [finetuning script](../../../src/llama_recipes/finetuning.py) in the llama-recipes package.
 
 If you are new to fine-tuning techniques, check out an overview: [](./LLM_finetuning_overview.md)
 
@@ -17,10 +17,10 @@ If you are new to fine-tuning techniques, check out an overview: [](./LLM_finetu
 ## How to configure finetuning settings?
 
 > [!TIP]
-> All the setting defined in [config files](../../src/llama_recipes/configs/) can be passed as args through CLI when running the script, there is no need to change from config files directly.
+> All the setting defined in [config files](../../../src/llama_recipes/configs/) can be passed as args through CLI when running the script, there is no need to change from config files directly.
 
 
-* [Training config file](../../src/llama_recipes/configs/training.py) is the main config file that helps to specify the settings for our run and can be found in [configs folder](../../src/llama_recipes/configs/)
+* [Training config file](../../../src/llama_recipes/configs/training.py) is the main config file that helps to specify the settings for our run and can be found in [configs folder](../../../src/llama_recipes/configs/)
 
 It lets us specify the training settings for everything from `model_name` to `dataset_name`, `batch_size` and so on. Below is the list of supported settings:
 
@@ -70,11 +70,11 @@ It lets us specify the training settings for everything from `model_name` to `da
 
 ```
 
-* [Datasets config file](../../src/llama_recipes/configs/datasets.py) provides the available options for datasets.
+* [Datasets config file](../../../src/llama_recipes/configs/datasets.py) provides the available options for datasets.
 
-* [peft config file](../../src/llama_recipes/configs/peft.py) provides the supported PEFT methods and respective settings that can be modified. We currently support LoRA and Llama-Adapter. Please note that LoRA is the only technique which is supported in combination with FSDP.
+* [peft config file](../../../src/llama_recipes/configs/peft.py) provides the supported PEFT methods and respective settings that can be modified. We currently support LoRA and Llama-Adapter. Please note that LoRA is the only technique which is supported in combination with FSDP.
 
-* [FSDP config file](../../src/llama_recipes/configs/fsdp.py) provides FSDP settings such as:
+* [FSDP config file](../../../src/llama_recipes/configs/fsdp.py) provides FSDP settings such as:
 
     * `mixed_precision` boolean flag to specify using mixed precision, defaults to true.
 
@@ -105,7 +105,7 @@ python -m llama_recipes.finetuning --use_peft --peft_method lora --quantization
 ```
 You'll be able to access a dedicated project or run link on [wandb.ai](https://wandb.ai) and see your dashboard like the one below.
 <div style="display: flex;">
-    <img src="../../docs/images/wandb_screenshot.png" alt="wandb screenshot" width="500" />
+    <img src="../../../docs/img/wandb_screenshot.png" alt="wandb screenshot" width="500" />
 </div>
 
 ## FLOPS Counting and Pytorch Profiling

recipes/quickstart/finetuning/datasets/README.md (+4, -4)

@@ -48,17 +48,17 @@ python -m llama_recipes.finetuning --dataset "custom_dataset" --custom_dataset.f
 This will call the function `get_foo` instead of `get_custom_dataset` when retrieving the dataset.
 
 ### Adding new dataset
-Each dataset has a corresponding configuration (dataclass) in [configs/datasets.py](../../../src/llama_recipes/configs/datasets.py) which contains the dataset name, training/validation split names, as well as optional parameters like datafiles etc.
+Each dataset has a corresponding configuration (dataclass) in [configs/datasets.py](../../../../src/llama_recipes/configs/datasets.py) which contains the dataset name, training/validation split names, as well as optional parameters like datafiles etc.
 
-Additionally, there is a preprocessing function for each dataset in the [datasets](../../../src/llama_recipes/datasets) folder.
+Additionally, there is a preprocessing function for each dataset in the [datasets](../../../../src/llama_recipes/datasets) folder.
 The returned data of the dataset needs to be consumable by the forward method of the fine-tuned model by calling ```model(**data)```.
 For CausalLM models this usually means that the data needs to be in the form of a dictionary with "input_ids", "attention_mask" and "labels" fields.
 
 To add a custom dataset the following steps need to be performed.
 
-1. Create a dataset configuration after the schema described above. Examples can be found in [configs/datasets.py](../../../src/llama_recipes/configs/datasets.py).
+1. Create a dataset configuration after the schema described above. Examples can be found in [configs/datasets.py](../../../../src/llama_recipes/configs/datasets.py).
 2. Create a preprocessing routine which loads the data and returns a PyTorch style dataset. The signature for the preprocessing function needs to be (dataset_config, tokenizer, split_name) where split_name will be the string for train/validation split as defined in the dataclass.
-3. Register the dataset name and preprocessing function by inserting it as key and value into the DATASET_PREPROC dictionary in [utils/dataset_utils.py](../../../src/llama_recipes/utils/dataset_utils.py)
+3. Register the dataset name and preprocessing function by inserting it as key and value into the DATASET_PREPROC dictionary in [utils/dataset_utils.py](../../../../src/llama_recipes/utils/dataset_utils.py)
 4. Set dataset field in training config to dataset name or use --dataset option of the `llama_recipes.finetuning` module or examples/finetuning.py training script.
 
 ## Application

recipes/quickstart/finetuning/multigpu_finetuning.md (+4, -5)

The diff for this file has been suppressed because it is too large.


recipes/quickstart/finetuning/singlegpu_finetuning.md (+5, -5)

@@ -1,12 +1,12 @@
 # Fine-tuning with Single GPU
 This recipe steps you through how to finetune a Meta Llama 3 model on the text summarization task using the [samsum](https://huggingface.co/datasets/samsum) dataset on a single GPU.
 
-These are the instructions for using the canonical [finetuning script](../../src/llama_recipes/finetuning.py) in the llama-recipes package.
+These are the instructions for using the canonical [finetuning script](../../../src/llama_recipes/finetuning.py) in the llama-recipes package.
 
 
 ## Requirements
 
-Ensure that you have installed the llama-recipes package ([details](../../README.md#installing)).
+Ensure that you have installed the llama-recipes package ([details](../../../README.md#installing)).
 
 To run fine-tuning on a single GPU, we will make use of two packages:
 1. [PEFT](https://github.com/huggingface/peft) to use parameter-efficient finetuning.
@@ -30,15 +30,15 @@ The args used in the command above are:
 
 ### How to run with different datasets?
 
-Currently 3 open source datasets are supported that can be found in [Datasets config file](../../src/llama_recipes/configs/datasets.py). You can also use your custom dataset (more info [here](./datasets/README.md)).
+Currently 3 open source datasets are supported that can be found in [Datasets config file](../../../src/llama_recipes/configs/datasets.py). You can also use your custom dataset (more info [here](./datasets/README.md)).
 
-* `grammar_dataset` : use this [notebook](../../src/llama_recipes/datasets/grammar_dataset/grammar_dataset_process.ipynb) to pull and process the Jfleg and C4 200M datasets for grammar checking.
+* `grammar_dataset` : use this [notebook](../../../src/llama_recipes/datasets/grammar_dataset/grammar_dataset_process.ipynb) to pull and process the Jfleg and C4 200M datasets for grammar checking.
 
 * `alpaca_dataset` : to get this open source data please download the `alpaca.json` to `dataset` folder.
 
 
 ```bash
-wget -P ../../src/llama_recipes/datasets https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json
+wget -P ../../../src/llama_recipes/datasets https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json
 ```
 
 * `samsum_dataset`

recipes/quickstart/inference/local_inference/README.md (+1, -1)

@@ -63,7 +63,7 @@ python inference.py --model_name <training_config.output_dir> --peft_model <trai
 
 ## Loading back FSDP checkpoints
 
-In case you have fine-tuned your model with pure FSDP and saved the checkpoints with "SHARDED_STATE_DICT" as shown [here](../../../src/llama_recipes/configs/fsdp.py), you can use this converter script to convert the FSDP Sharded checkpoints into HuggingFace checkpoints. This enables you to use the inference script normally as mentioned above.
+In case you have fine-tuned your model with pure FSDP and saved the checkpoints with "SHARDED_STATE_DICT" as shown [here](../../../../src/llama_recipes/configs/fsdp.py), you can use this converter script to convert the FSDP Sharded checkpoints into HuggingFace checkpoints. This enables you to use the inference script normally as mentioned above.
 **To convert the checkpoint use the following command**:
 
 This is helpful if you have fine-tuned your model using FSDP only as follows:

recipes/quickstart/inference/mobile_inference/android_inference/README.md (+2, -2)

@@ -9,7 +9,7 @@ Machine Learning Compilation for Large Language Models (MLC LLM) is a high-perfo
 
 You can read more about MLC-LLM at the following [link](https://github.com/mlc-ai/mlc-llm).
 
-MLC-LLM is also what powers the Llama3 inference APIs provided by [OctoAI](https://octo.ai/). You can use OctoAI for your Llama3 cloud-based inference needs by trying out the examples under the [following path](../../../llama_api_providers/OctoAI_API_examples/).
+MLC-LLM is also what powers the Llama3 inference APIs provided by [OctoAI](https://octo.ai/). You can use OctoAI for your Llama3 cloud-based inference needs by trying out the examples under the [following path](../../../../llama_api_providers/OctoAI_API_examples/).
 
 This tutorial was tested with the following setup:
 * MacBook Pro 16 inch from 2021 with Apple M1 Max and 32GB of RAM running Sonoma 14.3.1
@@ -144,4 +144,4 @@ The MLCChat app will launch on your phone, now access your phone:
 
 Note that you can change the build settings to bundle the weights with the MLCChat app so you don't have to download the weights over wifi. To do so you can follow the instructions [here](https://llm.mlc.ai/docs/deploy/android.html#bundle-model-weights).
 
-Once the model weights are downloaded you can now interact with Llama 3 locally on your Android phone!
+Once the model weights are downloaded you can now interact with Llama 3 locally on your Android phone!