Suraj Subramanian 5b3aaa038c Fix broken image link 10 months ago

Llama-Recipes Quickstart

If you are new to developing with Meta Llama models, this is where you should start. This folder contains introductory-level notebooks covering different techniques for working with Meta Llama.

  • The [Running_Llama3_Anywhere](./Running_Llama3_Anywhere/) notebooks demonstrate how to run Llama inference across Linux, Mac and Windows platforms using the appropriate tooling.
  • The [Prompt_Engineering_with_Llama_3](./Prompt_Engineering_with_Llama_3.ipynb) notebook showcases the various ways to elicit appropriate outputs from Llama. Take this notebook for a spin to get a feel for how Llama responds to different inputs and generation parameters.
  • The [inference](./inference/) folder contains scripts to deploy Llama for inference on server and mobile. See also [3p_integrations/vllm](../3p_integrations/vllm/) and [3p_integrations/tgi](../3p_integrations/tgi/) for hosting Llama on open-source model servers.
  • The [RAG](./RAG/) folder contains a simple Retrieval-Augmented Generation application using Llama 3.
  • The [finetuning](./finetuning/) folder contains resources to help you finetune Llama 3 on your custom datasets, for both single- and multi-GPU setups. The scripts use the native llama-recipes finetuning code found in [finetuning.py](../../src/llama_recipes/finetuning.py), which supports these features:

| Feature                                        | Supported |
| ---------------------------------------------- | --------- |
| HF support for finetuning                      | ✅ |
| Deferred initialization (meta init)            | ✅ |
| HF support for inference                       | ✅ |
| Low CPU mode for multi GPU                     | ✅ |
| Mixed precision                                | ✅ |
| Single node quantization                       | ✅ |
| Flash attention                                | ✅ |
| PEFT                                           | ✅ |
| Activation checkpointing FSDP                  | ✅ |
| Hybrid Sharded Data Parallel (HSDP)            | ✅ |
| Dataset packing & padding                      | ✅ |
| BF16 Optimizer (Pure BF16)                     | ✅ |
| Profiling & MFU tracking                       | ✅ |
| Gradient accumulation                          | ✅ |
| CPU offloading                                 | ✅ |
| FSDP checkpoint conversion to HF for inference | ✅ |
| W&B experiment tracker                         | ✅ |
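
To make the "Dataset packing & padding" entry more concrete, here is a minimal, library-free sketch of the two batching strategies. It is an illustration only, not the llama-recipes implementation: the function names `pack_sequences` and `pad_sequences`, the `chunk_size` and `pad_id` parameters, and the toy token IDs are all assumptions made for this example. A real pipeline would typically also add end-of-sequence tokens and attention masks.

```python
from typing import Iterable, List


def pack_sequences(tokenized: Iterable[List[int]], chunk_size: int) -> List[List[int]]:
    """Packing: concatenate samples into one stream and cut fixed-size chunks.

    Hypothetical helper for illustration. Leftover tokens that do not fill
    a whole chunk are dropped in this sketch.
    """
    buffer: List[int] = []
    chunks: List[List[int]] = []
    for tokens in tokenized:
        buffer.extend(tokens)
        # Emit as many full chunks as the buffer currently holds.
        while len(buffer) >= chunk_size:
            chunks.append(buffer[:chunk_size])
            buffer = buffer[chunk_size:]
    return chunks


def pad_sequences(tokenized: Iterable[List[int]], pad_id: int) -> List[List[int]]:
    """Padding: keep one sample per row and right-pad to the longest sample."""
    rows = [list(tokens) for tokens in tokenized]
    max_len = max((len(row) for row in rows), default=0)
    return [row + [pad_id] * (max_len - len(row)) for row in rows]


# Toy token IDs standing in for tokenizer output (assumed values).
samples = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

packed = pack_sequences(samples, chunk_size=4)
padded = pad_sequences(samples, pad_id=0)

print(packed)  # [[1, 2, 3, 4], [5, 6, 7, 8]]
print(padded)  # [[1, 2, 3, 0], [4, 5, 0, 0], [6, 7, 8, 9]]
```

Packing wastes no positions on pad tokens but lets samples share a row, while padding keeps sample boundaries intact at the cost of unused positions. Which one suits a given dataset depends on how much the sample lengths vary.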