@@ -7,11 +7,11 @@
"source": [
"<a href=\"https://colab.research.google.com/github/meta-llama/llama-recipes/blob/main/recipes/quickstart/Prompt_Engineering_with_Llama_3.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
- "# Prompt Engineering with Llama 3\n",
+ "# Prompt Engineering with Llama 3.1\n",
"\n",
"Prompt engineering is using natural language to produce a desired response from a large language model (LLM).\n",
"\n",
- "This interactive guide covers prompt engineering & best practices with Llama 3."
+ "This interactive guide covers prompt engineering & best practices with Llama 3.1."
]
},
{
@@ -45,6 +45,15 @@
"\n",
"Llama models come in varying parameter sizes. The smaller models are cheaper to deploy and run; the larger models are more capable.\n",
"\n",
+ "#### Llama 3.1\n",
+ "1. `llama-3.1-8b` - base pretrained 8 billion parameter model\n",
+ "1. `llama-3.1-70b` - base pretrained 70 billion parameter model\n",
+ "1. `llama-3.1-405b` - base pretrained 405 billion parameter model\n",
+ "1. `llama-3.1-8b-instruct` - instruction fine-tuned 8 billion parameter model\n",
+ "1. `llama-3.1-70b-instruct` - instruction fine-tuned 70 billion parameter model\n",
+ "1. `llama-3.1-405b-instruct` - instruction fine-tuned 405 billion parameter model (flagship)\n",
+ "\n",
+ "\n",
"#### Llama 3\n",
"1. `llama-3-8b` - base pretrained 8 billion parameter model\n",
"1. `llama-3-70b` - base pretrained 70 billion parameter model\n",
@@ -133,7 +142,7 @@
"\n",
"Tokens matter most when you consider API pricing and internal behavior (ex. hyperparameters).\n",
"\n",
- "Each model has a maximum context length that your prompt cannot exceed. That's 8K tokens for Llama 3, 4K for Llama 2, and 100K for Code Llama. \n"
+ "Each model has a maximum context length that your prompt cannot exceed. That's 128k tokens for Llama 3.1, 4K for Llama 2, and 100K for Code Llama.\n"
]
},
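To make the context-window figure above concrete, here is a minimal, illustrative sketch of counting prompt tokens against the Llama 3.1 limit. It is not part of the notebook diff; it assumes `transformers` is installed and that you have access to the gated `meta-llama/Meta-Llama-3.1-8B-Instruct` tokenizer on Hugging Face.

```python
# Sketch only (not from the notebook): count tokens in a prompt and compare
# against the Llama 3.1 context window. Assumes access to the gated
# meta-llama/Meta-Llama-3.1-8B-Instruct repo on Hugging Face.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")

prompt = "Explain the difference between a token and a word in one sentence."
num_tokens = len(tokenizer.encode(prompt))

LLAMA_3_1_CONTEXT_WINDOW = 128_000  # tokens
print(f"Prompt uses {num_tokens} tokens; "
      f"{LLAMA_3_1_CONTEXT_WINDOW - num_tokens} remain for the response.")
```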
{
@@ -143,7 +152,7 @@
"source": [
"## Notebook Setup\n",
"\n",
- "The following APIs will be used to call LLMs throughout the guide. As an example, we'll call Llama 3 chat using [Grok](https://console.groq.com/playground?model=llama3-70b-8192).\n",
+ "The following APIs will be used to call LLMs throughout the guide. As an example, we'll call Llama 3.1 chat using [Grok](https://console.groq.com/playground?model=llama3-70b-8192).\n",
"\n",
"To install prerequisites run:"
]
@@ -171,8 +180,9 @@
"# Get a free API key from https://console.groq.com/keys\n",
"os.environ[\"GROQ_API_KEY\"] = \"YOUR_GROQ_API_KEY\"\n",
"\n",
- "LLAMA3_70B_INSTRUCT = \"llama3-70b-8192\"\n",
- "LLAMA3_8B_INSTRUCT = \"llama3-8b-8192\"\n",
+ "LLAMA3_405B_INSTRUCT = \"llama-3.1-405b-reasoning\" # Note: Groq currently only gives access here to paying customers for 405B model\n",
+ "LLAMA3_70B_INSTRUCT = \"llama-3.1-70b-versatile\"\n",
+ "LLAMA3_8B_INSTRUCT = \"llama3.1-8b-instant\"\n",
"\n",
"DEFAULT_MODEL = LLAMA3_70B_INSTRUCT\n",
"\n",
@@ -225,7 +235,7 @@
"source": [
"### Completion APIs\n",
"\n",
- "Let's try Llama 3!"
+ "Let's try Llama 3.1!"
]
},
{
@@ -488,7 +498,7 @@
"\n",
"Simply adding a phrase encouraging step-by-step thinking \"significantly improves the ability of large language models to perform complex reasoning\" ([Wei et al. (2022)](https://arxiv.org/abs/2201.11903)). This technique is called \"CoT\" or \"Chain-of-Thought\" prompting.\n",
"\n",
- "Llama 3 now reasons step-by-step naturally without the addition of the phrase. This section remains for completeness."
+ "Llama 3.1 now reasons step-by-step naturally without the addition of the phrase. This section remains for completeness."
]
},
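To illustrate the chain-of-thought phrasing described in this cell, the same question can be sent with and without the step-by-step nudge. This reuses the illustrative `chat_completion` sketch from earlier and is not code from the notebook; the arithmetic question is the standard example from Wei et al. (2022).

```python
# Illustrative only: compare a plain prompt with a chain-of-thought prompt.
question = (
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?"
)

print(chat_completion(question))                                 # direct answer
print(chat_completion(question + " Let's think step by step."))  # CoT-style answer
```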
{
@@ -704,7 +714,7 @@
"source": [
"### Limiting Extraneous Tokens\n",
"\n",
- "A common struggle with Llama 2 is getting output without extraneous tokens (ex. \"Sure! Here's more information on...\"), even if explicit instructions are given to Llama 2 to be concise and no preamble. Llama 3 can better follow instructions.\n",
+ "A common struggle with Llama 2 is getting output without extraneous tokens (ex. \"Sure! Here's more information on...\"), even if explicit instructions are given to Llama 2 to be concise and no preamble. Llama 3.x can better follow instructions.\n",
"\n",
"Check out this improvement that combines a role, rules and restrictions, explicit instructions, and an example:"
]
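The notebook's own improved prompt follows in a cell outside this hunk. As a rough sketch of the pattern it names (a role, rules and restrictions, explicit instructions, and an example), a system prompt along these lines could suppress preamble; the wording below is illustrative, not the notebook's, and reuses the Groq client from the earlier sketch.

```python
# Illustrative sketch (not the notebook's example): role + rules + explicit
# instructions + a one-shot example, aimed at suppressing preamble.
response = client.chat.completions.create(
    model="llama-3.1-70b-versatile",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a sentiment classifier. "
                "Rules: reply with exactly one word - positive, negative, or neutral. "
                "Do not explain your answer or add any preamble. "
                "Example: 'I loved it' -> positive"
            ),
        },
        {"role": "user", "content": "The battery died after an hour."},
    ],
)
print(response.choices[0].message.content)  # expected: negative
```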