1 year ago · 70c2adc742
--- a/getting-started/finetuning/README.md
+++ b/getting-started/finetuning/README.md
@@ -6,7 +6,7 @@ This folder contains instructions to fine-tune Meta Llama 3 on a
 * [single-GPU setup](./singlegpu_finetuning.md)
 * [multi-GPU setup](./multigpu_finetuning.md)
 
-using the canonical [finetuning script](../../src/llama_recipes/finetuning.py) in the llama-recipes package.
+using the canonical [finetuning script](../../src/llama_cookbook/finetuning.py) in the llama-cookbook package.
 
 If you are new to fine-tuning techniques, check out [an overview](./LLM_finetuning_overview.md).
 
@@ -17,10 +17,10 @@ If you are new to fine-tuning techniques, check out [an overview](./LLM_finetuni
 ## How to configure finetuning settings?
 
 > [!TIP]
-> All the setting defined in [config files](../../src/llama_recipes/configs/) can be passed as args through CLI when running the script, there is no need to change from config files directly.
+> All the setting defined in [config files](../../src/llama_cookbook/configs/) can be passed as args through CLI when running the script, there is no need to change from config files directly.
 
 
-* [Training config file](../../src/llama_recipes/configs/training.py) is the main config file that helps to specify the settings for our run and can be found in [configs folder](../../src/llama_recipes/configs/)
+* [Training config file](../../src/llama_cookbook/configs/training.py) is the main config file that helps to specify the settings for our run and can be found in [configs folder](../../src/llama_cookbook/configs/)
 
 It lets us specify the training settings for everything from `model_name` to `dataset_name`, `batch_size` and so on. Below is the list of supported settings:
 
@@ -71,11 +71,11 @@ It lets us specify the training settings for everything from `model_name` to `da
 
 ```
 
-* [Datasets config file](../../src/llama_recipes/configs/datasets.py) provides the available options for datasets.
+* [Datasets config file](../../src/llama_cookbook/configs/datasets.py) provides the available options for datasets.
 
-* [peft config file](../../src/llama_recipes/configs/peft.py) provides the supported PEFT methods and respective settings that can be modified. We currently support LoRA and Llama-Adapter. Please note that LoRA is the only technique which is supported in combination with FSDP.
+* [peft config file](../../src/llama_cookbook/configs/peft.py) provides the supported PEFT methods and respective settings that can be modified. We currently support LoRA and Llama-Adapter. Please note that LoRA is the only technique which is supported in combination with FSDP.
 
-* [FSDP config file](../../src/llama_recipes/configs/fsdp.py) provides FSDP settings such as:
+* [FSDP config file](../../src/llama_cookbook/configs/fsdp.py) provides FSDP settings such as:
 
     * `mixed_precision` boolean flag to specify using mixed precision, defatults to true.
 
@@ -102,7 +102,7 @@ It lets us specify the training settings for everything from `model_name` to `da
 You can enable [W&B](https://wandb.ai/) experiment tracking by using `use_wandb` flag as below. You can change the project name, entity and other `wandb.init` arguments in `wandb_config`.
 
 ```bash
-python -m llama_recipes.finetuning --use_peft --peft_method lora --quantization 8bit --model_name /path_of_model_folder/8B --output_dir Path/to/save/PEFT/model --use_wandb
+python -m llama_cookbook.finetuning --use_peft --peft_method lora --quantization 8bit --model_name /path_of_model_folder/8B --output_dir Path/to/save/PEFT/model --use_wandb
 ```
 You'll be able to access a dedicated project or run link on [wandb.ai](https://wandb.ai) and see your dashboard like the one below.
 <div style="display: flex;">
--- a/getting-started/finetuning/datasets/README.md
+++ b/getting-started/finetuning/datasets/README.md
--- a/getting-started/finetuning/multigpu_finetuning.md
+++ b/getting-started/finetuning/multigpu_finetuning.md
@@ -3,14 +3,14 @@ This recipe steps you through how to finetune a Meta Llama 3 model on the text s
 
 
 ## Requirements
-Ensure that you have installed the llama-recipes package ([details](../../README.md#installing)).
+Ensure that you have installed the llama-cookbook package ([details](../../README.md#installing)).
 
 We will also need 2 packages:
 1. [PEFT](https://github.com/huggingface/peft) to use parameter-efficient finetuning.
 2. [FSDP](https://pytorch.org/tutorials/intermediate/FSDP_adavnced_tutorial.html) which helps us parallelize the training over multiple GPUs. [More details](./LLM_finetuning_overview.md#2-full-partial-parameter-finetuning).
 
 > [!NOTE]
-> The llama-recipes package will install PyTorch 2.0.1 version. In case you want to use FSDP with PEFT for multi GPU finetuning, please install the PyTorch nightlies ([details](../../README.md#pytorch-nightlies))
+> The llama-cookbook package will install PyTorch 2.0.1 version. In case you want to use FSDP with PEFT for multi GPU finetuning, please install the PyTorch nightlies ([details](../../README.md#pytorch-nightlies))
 >
 > INT8 quantization is not currently supported in FSDP
 
@@ -96,14 +96,14 @@ srun  torchrun --nproc_per_node 8 --rdzv_id $RANDOM --rdzv_backend c10d --rdzv_e
 Do not forget to adjust the number of nodes, ntasks and gpus-per-task in the top.
 
 ## Running with different datasets
-Currently 3 open source datasets are supported that can be found in [Datasets config file](../../src/llama_recipes/configs/datasets.py). You can also use your custom dataset (more info [here](./datasets/README.md)).
+Currently 3 open source datasets are supported that can be found in [Datasets config file](../../src/llama_cookbook/configs/datasets.py). You can also use your custom dataset (more info [here](./datasets/README.md)).
 
-* `grammar_dataset` : use this [notebook](../../src/llama_recipes/datasets/grammar_dataset/grammar_dataset_process.ipynb) to pull and process the Jfleg and C4 200M datasets for grammar checking.
+* `grammar_dataset` : use this [notebook](../../src/llama_cookbook/datasets/grammar_dataset/grammar_dataset_process.ipynb) to pull and process the Jfleg and C4 200M datasets for grammar checking.
 
 * `alpaca_dataset` : to get this open source data please download the `aplaca.json` to `dataset` folder.
 
 ```bash
-wget -P ../../src/llama_recipes/datasets https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json
+wget -P ../../src/llama_cookbook/datasets https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json
 ```
 
 * `samsum_dataset`
@@ -132,7 +132,7 @@ In case you are dealing with slower interconnect network between nodes, to reduc
 
 HSDP (Hybrid sharding Data Parallel) helps to define a hybrid sharding strategy where you can have FSDP within `sharding_group_size` which can be the minimum number of GPUs you can fit your model and DDP between the replicas of the model specified by `replica_group_size`.
 
-This will require to set the Sharding strategy in [fsdp config](../../src/llama_recipes/configs/fsdp.py) to `ShardingStrategy.HYBRID_SHARD` and specify two additional settings, `sharding_group_size` and `replica_group_size` where former specifies the sharding group size, number of GPUs that you model can fit into to form a replica of a model and latter specifies the replica group size, which is world_size/sharding_group_size.
+This will require to set the Sharding strategy in [fsdp config](../../src/llama_cookbook/configs/fsdp.py) to `ShardingStrategy.HYBRID_SHARD` and specify two additional settings, `sharding_group_size` and `replica_group_size` where former specifies the sharding group size, number of GPUs that you model can fit into to form a replica of a model and latter specifies the replica group size, which is world_size/sharding_group_size.
 
 ```bash
 
--- a/getting-started/finetuning/singlegpu_finetuning.md
+++ b/getting-started/finetuning/singlegpu_finetuning.md
@@ -1,12 +1,12 @@
 # Fine-tuning with Single GPU
 This recipe steps you through how to finetune a Meta Llama 3 model on the text summarization task using the [samsum](https://huggingface.co/datasets/samsum) dataset on a single GPU.
 
-These are the instructions for using the canonical [finetuning script](../../src/llama_recipes/finetuning.py) in the llama-recipes package.
+These are the instructions for using the canonical [finetuning script](../../src/llama_cookbook/finetuning.py) in the llama-cookbook package.
 
 
 ## Requirements
 
-Ensure that you have installed the llama-recipes package.
+Ensure that you have installed the llama-cookbook package.
 
 To run fine-tuning on a single GPU, we will make use of two packages:
 1. [PEFT](https://github.com/huggingface/peft) to use parameter-efficient finetuning.
@@ -33,15 +33,15 @@ The args used in the command above are:
 
 ### How to run with different datasets?
 
-Currently 3 open source datasets are supported that can be found in [Datasets config file](../../src/llama_recipes/configs/datasets.py). You can also use your custom dataset (more info [here](./datasets/README.md)).
+Currently 3 open source datasets are supported that can be found in [Datasets config file](../../src/llama_cookbook/configs/datasets.py). You can also use your custom dataset (more info [here](./datasets/README.md)).
 
-* `grammar_dataset` : use this [notebook](../../src/llama_recipes/datasets/grammar_dataset/grammar_dataset_process.ipynb) to pull and process the Jfleg and C4 200M datasets for grammar checking.
+* `grammar_dataset` : use this [notebook](../../src/llama_cookbook/datasets/grammar_dataset/grammar_dataset_process.ipynb) to pull and process the Jfleg and C4 200M datasets for grammar checking.
 
 * `alpaca_dataset` : to get this open source data please download the `alpaca.json` to `dataset` folder.
 
 
 ```bash
-wget -P ../../src/llama_recipes/datasets https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json
+wget -P ../../src/llama_cookbook/datasets https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json
 ```
 
 * `samsum_dataset`
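
Note on the HSDP hunk in `multigpu_finetuning.md`: the text there states that `replica_group_size` is `world_size/sharding_group_size`. The arithmetic can be sketched as below. This is only an illustration: the helper name `hsdp_group_sizes` is hypothetical and is not part of the llama-cookbook package, and the divisibility check is an assumption about what a sensible configuration looks like, not behavior confirmed by the source.

```python
from typing import Tuple


def hsdp_group_sizes(world_size: int, sharding_group_size: int) -> Tuple[int, int]:
    """Return (sharding_group_size, replica_group_size) for a hybrid-sharded run.

    Hypothetical helper for illustration only; not a llama-cookbook API.
    """
    if world_size <= 0 or sharding_group_size <= 0:
        raise ValueError("world_size and sharding_group_size must be positive")
    # Assumption: every replica uses the same number of GPUs, so the world
    # size has to split evenly into sharding groups.
    if world_size % sharding_group_size != 0:
        raise ValueError("world_size must be a multiple of sharding_group_size")
    # The relation stated in the doc: world_size / sharding_group_size.
    replica_group_size = world_size // sharding_group_size
    return sharding_group_size, replica_group_size


if __name__ == "__main__":
    # Example: 16 GPUs in total, with the model fitting on 4 GPUs.
    sharding, replicas = hsdp_group_sizes(world_size=16, sharding_group_size=4)
    print(f"sharding_group_size={sharding} replica_group_size={replicas}")
```

With 16 GPUs and a model that fits on 4, this gives 4 replicas: FSDP shards within each group of 4 GPUs and DDP runs between the groups.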