push link fix

Sanyam Bhutani · 1 week ago · commit 76d0b6697d

1 changed file with 8 additions and 16 deletions:

getting-started/build_with_llama_4.ipynb (+8 -16)
@@ -29,14 +29,14 @@
     "\n",
     "\n",
     "This notebook will jump right in and show you what's the latest with our models, how to use get the best out of them.\n",
     "This notebook will jump right in and show you what's the latest with our models, how to use get the best out of them.\n",
     "\n",
     "\n",
-    "1. [Environment Setup](#env)\n",
-    "2. [Loading the model](#load)\n",
-    "3. [Long Context Demo](#longctx)\n",
-    "4. [Text Conversations](#text)\n",
-    "5. [Multilingual](#mling)\n",
-    "6. [Multimodal: Single Image Understanding](#mm)\n",
-    "7. [Multimodal: Multi Image Understanding](#mm2)\n",
-    "8. [Function Calling with Image Understanding](#fc)"
+    "1. Environment Setup\n",
+    "2. Loading the model\n",
+    "3. Long Context Demo\n",
+    "4. Text Conversations\n",
+    "5. Multilingual\n",
+    "6. Multimodal: Single Image Understanding\n",
+    "7. Multimodal: Multi Image Understanding\n",
+    "8. Function Calling with Image Understanding"
    ]
   },
   {
@@ -46,7 +46,6 @@
     "jp-MarkdownHeadingCollapsed": true
     "jp-MarkdownHeadingCollapsed": true
    },
    },
    "source": [
    "source": [
-    "<a id='env'></a>\n",
     "## Environment Setup:\n",
     "## Environment Setup:\n",
     "\n",
     "\n",
     "* You'll need at least 4 GPUs with >= 80GB each.\n",
     "* You'll need at least 4 GPUs with >= 80GB each.\n",
@@ -72,7 +71,6 @@
    "id": "2fcf2b8b-5274-4a85-bec9-03ef99b20ce9",
    "id": "2fcf2b8b-5274-4a85-bec9-03ef99b20ce9",
    "metadata": {},
    "metadata": {},
    "source": [
    "source": [
-    "<a id='longctx'></a>\n",
     "## Long Context Demo: Write a guide on SAM-2 based on the repo\n",
     "## Long Context Demo: Write a guide on SAM-2 based on the repo\n",
     "\n",
     "\n",
     "Scout supports upto 10M context. On 8xH100, in bf16 you can get upto 1.4M tokens. We recommend using `vllm` for fast inference. \n",
     "Scout supports upto 10M context. On 8xH100, in bf16 you can get upto 1.4M tokens. We recommend using `vllm` for fast inference. \n",
@@ -991,7 +989,6 @@
    "id": "17124706-e6b1-4e2a-b8a1-19e78243c5ac",
    "id": "17124706-e6b1-4e2a-b8a1-19e78243c5ac",
    "metadata": {},
    "metadata": {},
    "source": [
    "source": [
-    "<a id='text'></a>\n",
     "## Text Conversations\n",
     "## Text Conversations\n",
     "\n",
     "\n",
     "Llama 4 Scout continues to be a great conversationalist and can respond in various styles."
     "Llama 4 Scout continues to be a great conversationalist and can respond in various styles."
@@ -1074,7 +1071,6 @@
    "id": "9c16037c-ea39-421d-b13b-853fa1db3858",
    "id": "9c16037c-ea39-421d-b13b-853fa1db3858",
    "metadata": {},
    "metadata": {},
    "source": [
    "source": [
-    "<a id='mling'></a>\n",
     "## Multilingual\n",
     "## Multilingual\n",
     "\n",
     "\n",
     "Llama 4 Scout is fluent in 12 languages: \n",
     "Llama 4 Scout is fluent in 12 languages: \n",
@@ -1135,7 +1131,6 @@
    "id": "c4a5f841-aceb-43c6-9db7-b3f8e010a13b",
    "id": "c4a5f841-aceb-43c6-9db7-b3f8e010a13b",
    "metadata": {},
    "metadata": {},
    "source": [
    "source": [
-    "<a id='mm'></a>\n",
     "## Multimodal\n",
     "## Multimodal\n",
     "Llama 4 Scout excels at image understanding. Note that the Llama models officially support only English for image-understanding.\n",
     "Llama 4 Scout excels at image understanding. Note that the Llama models officially support only English for image-understanding.\n",
     "\n",
     "\n",
@@ -1187,7 +1182,6 @@
    "id": "6f058767-d415-4c8c-9019-387b0adacc8e",
    "id": "6f058767-d415-4c8c-9019-387b0adacc8e",
    "metadata": {},
    "metadata": {},
    "source": [
    "source": [
-    "<a id='mm1'></a>\n",
     "### Multimodal: Understanding a Single Image\n",
     "### Multimodal: Understanding a Single Image\n",
     "\n",
     "\n",
     "Here's an example with 1 image:"
     "Here's an example with 1 image:"
@@ -1262,7 +1256,6 @@
    "id": "df47b0d1-0cd9-4437-b8b2-7cefa1e189a7",
    "id": "df47b0d1-0cd9-4437-b8b2-7cefa1e189a7",
    "metadata": {},
    "metadata": {},
    "source": [
    "source": [
-    "<a id='mm2'></a>\n",
     "### Multimodal: Understanding Multiple Images\n",
     "### Multimodal: Understanding Multiple Images\n",
     "\n",
     "\n",
     "Llama 4 Scout can process information from multiple images - the number of images you can pass in a single request is only limited by the available memory. To prevent OOM errors, try downsizing the images before passing it to the model. "
     "Llama 4 Scout can process information from multiple images - the number of images you can pass in a single request is only limited by the available memory. To prevent OOM errors, try downsizing the images before passing it to the model. "
@@ -1350,7 +1343,6 @@
    "id": "b472898e-9ffa-429e-b64e-d31c0ebdd3a6",
    "id": "b472898e-9ffa-429e-b64e-d31c0ebdd3a6",
    "metadata": {},
    "metadata": {},
    "source": [
    "source": [
-    "<a id='fc'></a>\n",
     "## Function Calling with Image Understanding\n",
     "## Function Calling with Image Understanding\n",
     "\n",
     "\n",
     "Function calling now works natively with images, i.e. the model can understand the images and return the appropriate function-call. In this example, we ask Llama to book us tickets to the place shown in the photos."
     "Function calling now works natively with images, i.e. the model can understand the images and return the appropriate function-call. In this example, we ask Llama to book us tickets to the place shown in the photos."