# torchtune recipe for Meta Llama 3 finetuning torchtune is a PyTorch-native library for easily authoring, fine-tuning and experimenting with LLMs. torchtune provides: - Native-PyTorch implementations of popular LLMs using composable and modular building blocks - Easy-to-use and hackable training recipes for popular fine-tuning techniques (LoRA, QLoRA) - no trainers, no frameworks, just PyTorch! - YAML configs for easily configuring training, evaluation, quantization or inference recipes - Built-in support for many popular dataset formats and prompt templates to help you quickly get started with training Here we will show the how to use torchtune to easily finetune meta Llama 3. For more usage details, please check the [torchtune repo](https://github.com/pytorch/torchtune/tree/main). ## Installation **Step 1:** [Install PyTorch](https://pytorch.org/get-started/locally/). torchtune is tested with the latest stable PyTorch release as well as the preview nightly version. **Step 2:** The latest stable version of torchtune is hosted on PyPI and can be downloaded with the following command: ```bash pip install torchtune ``` To confirm that the package is installed correctly, you can run the following command: ```bash tune --help ``` And should see the following output: ```bash usage: tune [-h] {ls,cp,download,run,validate} ... Welcome to the torchtune CLI! options: -h, --help show this help message and exit ... ``` You can also install the latest and greatest torchtune has to offer by [installing a nightly build](https://pytorch.org/torchtune/main/install.html). --- ## Meta Llama3 fine-tuning steps torchtune supports fine-tuning for the Llama3 8B and 70B size models. It currently support LoRA, QLoRA and full fine-tune on a single GPU as well as LoRA and full fine-tune on multiple devices for the 8B model, and LoRA on multiple devices for the 70B model. For all the details, take a look at the [tutorial](https://pytorch.org/torchtune/main/tutorials/llama3.html). **Note**: The Llama3 LoRA and QLoRA configs default to the instruct fine-tuned models. This is because not all special token embeddings are initialized in the base 8B and 70B models. In the initial experiments for Llama3-8B, QLoRA has a peak allocated memory of ``~9GB`` while LoRA on a single GPU has a peak allocated memory of ``~19GB``. To get started, you can use the default configs to kick off training. ### torchtune repo download We need to clone the torchtune repo to get the configs. ```bash git clone git@github.com:pytorch/torchtune.git cd torchtune/recipes/configs/ ``` ### Model download Select which model you want to download, in this example we want to use meta-llama/Meta-Llama-3-8B. ```bash tune download meta-llama/Meta-Llama-3-8B \ --output-dir /tmp/Meta-Llama-3-8B \ --hf-token \ ``` > Tip: Set your environment variable `HF_TOKEN` or pass in `--hf-token` to the command in order to validate your access. You can find your token at https://huggingface.co/settings/tokens ### Single GPU finetune LoRA 8B ```bash tune run lora_finetune_single_device --config llama3/8B_lora_single_device ``` QLoRA 8B ```bash tune run lora_finetune_single_device --config llama3/8B_qlora_single_device ``` Full 8B ```bash tune run full_finetune_single_device --config llama3/8B_full_single_device ``` ### Multi GPU finetune Full 8B ```bash tune run --nproc_per_node 4 full_finetune_distributed --config llama3/8B_full ``` LoRA 8B ```bash tune run --nproc_per_node 2 lora_finetune_distributed --config llama3/8B_lora ``` LoRA 70B Note that the download command for the Meta-Llama3 70B model slightly differs from download commands for the 8B models. This is because it use the HuggingFace [safetensor](https://huggingface.co/docs/safetensors/en/index) model format to load the model. To download the 70B model, run ```bash tune download meta-llama/Meta-Llama-3-70b --hf-token <> --output-dir /tmp/Meta-Llama-3-70b --ignore-patterns "original/consolidated*" ``` Then, a finetune can be kicked off: ```bash tune run --nproc_per_node 8 lora_finetune_distributed --config recipes/configs/llama3/70B_lora.yaml ``` You can find a full list of all the Llama3 configs [here](https://github.com/pytorch/torchtune/tree/main/recipes/configs/llama3).