@@ -12,7 +12,7 @@ This experiment built by the team at [Modal](https://modal.com), and is describe
[Beat GPT-4o at Python by searching with 100 small Llamas](https://modal.com/blog/llama-human-eval)
-The experiment has since been upgraded to use the [Llama 3.2 3B Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) model, and runnable end-to-end using the Modal serverless platform.
+The experiment has since been upgraded to use the [Llama 3.2 3B Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) model, and can be run end-to-end on the Modal serverless platform.
## Run it yourself
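For readers skimming the diff: "run end-to-end on the Modal serverless platform" roughly means the generation step is a Modal function fanned out over many sampled completions per problem. Below is a minimal, hypothetical sketch of that pattern, not the repo's actual entrypoint; the image contents, GPU type, and sampling parameters are assumptions.

```python
# Hypothetical sketch of running Llama 3.2 3B Instruct as a Modal function;
# the actual repo pins its own image, GPU, and inference stack.
import modal

image = modal.Image.debian_slim().pip_install("transformers", "torch", "accelerate")
app = modal.App("llama-humaneval-sketch", image=image)


@app.function(gpu="A10G", timeout=600)
def generate(prompt: str) -> str:
    from transformers import pipeline

    # Gated model: requires an accepted license and an HF token available in the container.
    pipe = pipeline(
        "text-generation",
        model="meta-llama/Llama-3.2-3B-Instruct",
        device_map="auto",
    )
    out = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.8)
    return out[0]["generated_text"]


@app.local_entrypoint()
def main():
    # Fan the same prompt out several times to mimic the "100 small Llamas" search.
    prompts = ['def add(a: int, b: int) -> int:\n    """Return a + b."""\n'] * 4
    for completion in generate.map(prompts):
        print(completion[:200])
```

Reloading the pipeline inside each call keeps the sketch short; a production setup would cache the model in a container lifecycle hook instead.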
@@ -55,7 +55,7 @@ This will execute:
5. Generating graphs of pass@k and fail@k
### Results
-
+<!-- markdown-link-check-disable -->
The resulting plots of the evals will be saved locally to:
- `/tmp/plot-pass-k.jpeg`
- `/tmp/plot-fail-k.jpeg`
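For context on the metrics in step 5 and in the plots above: pass@k here is presumably the standard unbiased estimator from the HumanEval paper, with fail@k = 1 - pass@k (the probability that all k sampled completions fail). A minimal sketch, assuming `n` total samples and `c` passing samples per problem:

```python
import numpy as np


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: 1 - C(n-c, k) / C(n, k), for n samples with c correct."""
    if n - c < k:
        return 1.0
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))


def fail_at_k(n: int, c: int, k: int) -> float:
    """Probability that every one of the k sampled completions fails."""
    return 1.0 - pass_at_k(n, c, k)


# Example: 100 generations for one problem, 12 of which pass the unit tests.
for k in (1, 10, 100):
    print(f"k={k}: pass@k={pass_at_k(100, 12, k):.3f}, fail@k={fail_at_k(100, 12, k):.3f}")
```

Plotting fail@k on a log scale, as the fail-k plot below does, makes the roughly power-law improvement with more samples easy to see.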
@@ -69,3 +69,4 @@ You'll see that at 100 generations, the Llama model is able to perform on-par wi
`/tmp/plot-fail-k.jpeg` shows fail@k on a log scale, demonstrating the smooth scaling of this method.

+<!-- markdown-link-check-enable -->