Переглянути джерело

Add Tool Calling notebooks

Sanyam Bhutani 6 місяців тому
батько
коміт
4223d169ac

+ 700 - 0
recipes/quickstart/inference/tool_calling/Tool_Calling_101.ipynb

@@ -0,0 +1,700 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Tool Calling 101:\n",
+    "\n",
+    "This is part (1/2) in the tool calling series, this notebook will cover the basics of what tool calling is and how to perform it with `Llama 3.1 models`\n",
+    "\n",
+    "Here's what you will learn in this notebook:\n",
+    "\n",
+    "- Setup Groq to access Llama 3.1 70B model\n",
+    "- Avoid common mistakes when performing tool-calling with Llama\n",
+    "- Understand Prompt templates for Tool Calling\n",
+    "- Understand how the tool calls are handled under the hood\n",
+    "\n",
+    "In Part 2, we will learn how to build system that can get us comparision between 2 papers"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## What is Tool Calling?\n",
+    "\n",
+    "This approach was popularised by the [Gorilla](https://gorilla.cs.berkeley.edu) paper-which showed that Large Language Model(s) can be fine-tuned on API examples to teach them calling an external API. \n",
+    "\n",
+    "This is really cool because we can now use a LLM as a \"brain\" of a system and connect it to external systems to perform actions. \n",
+    "\n",
+    "In simpler words, \"Llama can order your pizza for you\" :) \n",
+    "\n",
+    "With the Llama 3.1 release, the models excel at tool calling and support out of box `brave_search`, `wolfram_api` and `code_interpreter`. \n",
+    "\n",
+    "However, first let's take a look at a common mistake"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Install and setup groq dependencies\n",
+    "\n",
+    "- Install `groq` api to access Llama model(s)\n",
+    "- Configure our client and authenticate with API Key(s)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#!pip3 install groq"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 82,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "from groq import Groq\n",
+    "# Create the Groq client\n",
+    "client = Groq(api_key=os.environ.get(\"GROQ_API_KEY\"), )"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Common Mistake of Tool-Calling: Incorrect Prompt Template\n",
+    "\n",
+    "While Llama 3.1 works with tool-calling out of box, a wrong prompt template can cause issues with unexpected behaviour. \n",
+    "\n",
+    "Sometimes, even superheroes need to be reminded of their powers. \n",
+    "\n",
+    "Let's first try \"forcing a prompt response from the model\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Note: Remember this is the WRONG template, please scroll to next section to see the right approach if you are in a rushed copy-pasta sprint\n",
+    "\n",
+    "This section will show you that the model will not use `brave_search` and `wolfram_api` out of the box unless the prompt template is set correctly. \n",
+    "Even if the model is asked to do so!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 83,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "SYSTEM_PROMPT = \"\"\"\n",
+    "Cutting Knowledge Date: December 2023\n",
+    "Today Date: 20 August 2024\n",
+    "\n",
+    "You are a helpful assistant\n",
+    "\"\"\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 84,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "system_prompt = {}\n",
+    "chat_history = []\n",
+    "\n",
+    "def model_chat(user_input: str, sys_prompt = SYSTEM_PROMPT, temperature: int = 0.7, max_tokens=2048):\n",
+    "    \n",
+    "    chat_history = [\n",
+    "        {\n",
+    "            \"role\": \"system\",\n",
+    "            \"content\": sys_prompt\n",
+    "        }\n",
+    "    ]\n",
+    "    \n",
+    "    chat_history.append({\"role\": \"user\", \"content\": user_input})\n",
+    "    \n",
+    "    #print(chat_history)\n",
+    "    \n",
+    "    #print(\"User: \", user_input)\n",
+    "    \n",
+    "    response = client.chat.completions.create(model=\"llama-3.1-70b-versatile\",\n",
+    "                                          messages=chat_history,\n",
+    "                                          max_tokens=max_tokens,\n",
+    "                                          temperature=temperature)\n",
+    "    \n",
+    "    chat_history.append({\n",
+    "    \"role\": \"assistant\",\n",
+    "    \"content\": response.choices[0].message.content\n",
+    "    })\n",
+    "    \n",
+    "    \n",
+    "    #print(\"Assistant:\", response.choices[0].message.content)\n",
+    "    \n",
+    "    return response.choices[0].message.content"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Asking the model about a recent news\n",
+    "\n",
+    "Since the prompt template is incorrect, it will answer using cutoff memory"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 85,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Assistant: Unfortunately, I don't have information on a specific release date for the next Elden Ring game. However, I can tell you that there have been rumors and speculations about a potential sequel or DLC (Downloadable Content) for Elden Ring.\n",
+      "\n",
+      "In June 2022, the game's director, Hidetaka Miyazaki, mentioned that FromSoftware, the developer of Elden Ring, was working on \"multiple\" new projects, but no official announcements have been made since then.\n",
+      "\n",
+      "It's also worth noting that FromSoftware has a history of taking their time to develop new games, and the studio is known for its attention to detail and commitment to quality. So, even if there is a new Elden Ring game in development, it's likely that we won't see it anytime soon.\n",
+      "\n",
+      "Keep an eye on official announcements from FromSoftware and Bandai Namco, the publisher of Elden Ring, for any updates on a potential sequel or new game in the series.\n"
+     ]
+    }
+   ],
+   "source": [
+    "user_input = \"\"\"\n",
+    "When is the next elden ring game coming out?\n",
+    "\"\"\"\n",
+    "\n",
+    "print(\"Assistant:\", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Asking the model about a Math problem\n",
+    "\n",
+    "Again, the model answer(s) based on memory and not tool-calling"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 86,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Assistant: To find the square root of 23131231, I'll calculate it for you.\n",
+      "\n",
+      "√23131231 ≈ 4813.61\n"
+     ]
+    }
+   ],
+   "source": [
+    "user_input = \"\"\"\n",
+    "When is the square root of 23131231?\n",
+    "\"\"\"\n",
+    "\n",
+    "print(\"Assistant:\", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Can we solve this using a reminder prompt?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 87,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Assistant: I can use a mathematical tool to solve the question.\n",
+      "\n",
+      "The square root of 23131231 is:\n",
+      "\n",
+      "√23131231 ≈ 4810.51\n"
+     ]
+    }
+   ],
+   "source": [
+    "user_input = \"\"\"\n",
+    "When is the square root of 23131231?\n",
+    "\n",
+    "Can you use a tool to solve the question?\n",
+    "\"\"\"\n",
+    "\n",
+    "print(\"Assistant:\", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Looks like we didn't get the wolfram_api call, let's try one more time with a stronger prompt:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 88,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Assistant: I can use Wolfram Alpha to calculate the square root of 23131231.\n",
+      "\n",
+      "According to Wolfram Alpha, the square root of 23131231 is:\n",
+      "\n",
+      "√23131231 ≈ 4809.07\n"
+     ]
+    }
+   ],
+   "source": [
+    "user_input = \"\"\"\n",
+    "When is the square root of 23131231?\n",
+    "\n",
+    "Can you use a tool to solve the question?\n",
+    "\n",
+    "Remember you have been trained on wolfram_alpha\n",
+    "\"\"\"\n",
+    "\n",
+    "print(\"Assistant:\", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Official Prompt Template \n",
+    "\n",
+    "As you can see, the model doesn't perform tool-calling in an expected fashion above. This is because we are not following the recommended prompting format.\n",
+    "\n",
+    "The Llama Stack is the go to approach to use the Llama model family and build applications. \n",
+    "\n",
+    "Let's first install the `llama_toolchain` Python package to have the Llama CLI available."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#!pip3 install llama-toolchain"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Now we can learn about the various prompt formats available \n",
+    "\n",
+    "When you run the cell below-you will see all the options for different template"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 90,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "+-----------+---------------------------------+\n",
+      "\u001b[1m\u001b[97m| Role      | Template Name                   |\u001b[0m\n",
+      "+-----------+---------------------------------+\n",
+      "| user      | user-default                    |\n",
+      "| assistant | assistant-builtin-tool-call     |\n",
+      "| assistant | assistant-custom-tool-call      |\n",
+      "| assistant | assistant-default               |\n",
+      "| system    | system-builtin-and-custom-tools |\n",
+      "| system    | system-builtin-tools-only       |\n",
+      "| system    | system-custom-tools-only        |\n",
+      "| system    | system-default                  |\n",
+      "| tool      | tool-success                    |\n",
+      "| tool      | tool-failure                    |\n",
+      "+-----------+---------------------------------+\n"
+     ]
+    }
+   ],
+   "source": [
+    "!llama model template"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Tool Calling: Using the correct Prompt Template\n",
+    "\n",
+    "With `llama-cli` we can learn the correct way of defining `System_prompt` and finally get the correct behaviour from the model"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 92,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "+----------+--------------------------------------------------------------+\n",
+      "| Name     | \u001b[1m\u001b[97msystem-builtin-tools-only\u001b[0m                                    |\n",
+      "+----------+--------------------------------------------------------------+\n",
+      "| Template | \u001b[1m\u001b[33m<|begin_of_text|>\u001b[0m\u001b[1m\u001b[33m<|start_header_id|>\u001b[0msystem\u001b[1m\u001b[33m<|end_header_id|>\u001b[0m↵ |\n",
+      "|          | ↵                                                            |\n",
+      "|          | Environment: ipython↵                                        |\n",
+      "|          | Tools: brave_search, wolfram_alpha↵                          |\n",
+      "|          | Cutting Knowledge Date: December 2023↵                       |\n",
+      "|          | Today Date: 15 September 2024↵                               |\n",
+      "|          | ↵                                                            |\n",
+      "|          | You are a helpful assistant.↵                                |\n",
+      "|          | \u001b[1m\u001b[33m<|eot_id|>\u001b[0m\u001b[1m\u001b[33m<|start_header_id|>\u001b[0massistant\u001b[1m\u001b[33m<|end_header_id|>\u001b[0m↵     |\n",
+      "|          | ↵                                                            |\n",
+      "|          |                                                              |\n",
+      "+----------+--------------------------------------------------------------+\n",
+      "| Notes    | ↵ represents newline                                         |\n",
+      "+----------+--------------------------------------------------------------+\n"
+     ]
+    }
+   ],
+   "source": [
+    "!llama model template --name system-builtin-tools-only"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If everything is setup correctly-the model should now wrap function calls  with the `|<python_tag>|` following the actualy function call. \n",
+    "\n",
+    "This can allow you to manage your function calling logic accordingly. \n",
+    "\n",
+    "Time to test the theory"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 94,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "+----------+----------------------------------------------------------------------------------+\n",
+      "| Name     | \u001b[1m\u001b[97massistant-builtin-tool-call\u001b[0m                                                      |\n",
+      "+----------+----------------------------------------------------------------------------------+\n",
+      "| Template | \u001b[1m\u001b[33m<|begin_of_text|>\u001b[0m\u001b[1m\u001b[33m<|start_header_id|>\u001b[0massistant\u001b[1m\u001b[33m<|end_header_id|>\u001b[0m↵                  |\n",
+      "|          | ↵                                                                                |\n",
+      "|          | \u001b[1m\u001b[33m<|python_tag|>\u001b[0mbrave_search.call(query=\"Who won NBA in                            |\n",
+      "|          | 2024?\")\u001b[1m\u001b[33m<|eom_id|>\u001b[0m\u001b[1m\u001b[33m<|start_header_id|>\u001b[0massistant\u001b[1m\u001b[33m<|end_header_id|>\u001b[0m↵                  |\n",
+      "|          | ↵                                                                                |\n",
+      "|          |                                                                                  |\n",
+      "+----------+----------------------------------------------------------------------------------+\n",
+      "| Notes    | ↵ represents newline                                                             |\n",
+      "|          | Notice <|python_tag|>                                                            |\n",
+      "+----------+----------------------------------------------------------------------------------+\n"
+     ]
+    }
+   ],
+   "source": [
+    "!llama model template --name assistant-builtin-tool-call"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 95,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Assistant: <|python_tag|>brave_search.call(query=\"Elden Ring sequel release date\")\n"
+     ]
+    }
+   ],
+   "source": [
+    "SYSTEM_PROMPT = \"\"\"\n",
+    "Environment: iPython\n",
+    "Tools: brave_search, wolfram_alpha\n",
+    "Cutting Knowledge Date: December 2023\n",
+    "Today Date: 15 September 2024\n",
+    "\"\"\"\n",
+    "\n",
+    "user_input = \"\"\"\n",
+    "When is the next Elden ring game coming out?\n",
+    "\"\"\"\n",
+    "\n",
+    "print(\"Assistant:\", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 96,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Assistant: <|python_tag|>wolfram_alpha.call(query=\"square root of 23131231\")\n"
+     ]
+    }
+   ],
+   "source": [
+    "user_input = \"\"\"\n",
+    "What is the square root of 23131231?\n",
+    "\"\"\"\n",
+    "\n",
+    "print(\"Assistant:\", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Using this knowledge in practise\n",
+    "\n",
+    "A common misconception about tool calling is: the model can handle the tool call and get your output. \n",
+    "\n",
+    "This is NOT TRUE, the actual tool call is something that you have to implement. With this knowledge, let's see how we can utilise brave search to answer our original question"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 97,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#!pip3 install brave-search"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 98,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Assistant: <|python_tag|>wolfram_alpha.call(query=\"square root of 23131231\")\n"
+     ]
+    }
+   ],
+   "source": [
+    "SYSTEM_PROMPT = \"\"\"\n",
+    "Environment: iPython\n",
+    "Tools: brave_search, wolfram_alpha\n",
+    "Cutting Knowledge Date: December 2023\n",
+    "Today Date: 15 September 2024\n",
+    "\"\"\"\n",
+    "\n",
+    "user_input = \"\"\"\n",
+    "What is the square root of 23131231?\n",
+    "\"\"\"\n",
+    "\n",
+    "print(\"Assistant:\", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 99,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "<|python_tag|>wolfram_alpha.call(query=\"square root of 23131231\")\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(model_chat(user_input, sys_prompt=SYSTEM_PROMPT))\n",
+    "\n",
+    "output = model_chat(user_input, sys_prompt=SYSTEM_PROMPT)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 102,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Function name: wolfram_alpha\n",
+      "Method: call\n",
+      "Args: \"square root of 23131231\"\n"
+     ]
+    }
+   ],
+   "source": [
+    "import re\n",
+    "\n",
+    "# Extract the function name\n",
+    "fn_name = re.search(r'<\\|python_tag\\|>(\\w+)\\.', output).group(1)\n",
+    "\n",
+    "# Extract the method\n",
+    "fn_call_method = re.search(r'\\.(\\w+)\\(', string).group(1)\n",
+    "\n",
+    "# Extract the arguments\n",
+    "fn_call_args = re.search(r'=\\s*([^)]+)', string).group(1)\n",
+    "\n",
+    "print(f\"Function name: {fn_name}\")\n",
+    "print(f\"Method: {fn_call_method}\")\n",
+    "print(f\"Args: {fn_call_args}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You can implement this in different ways but the idea is the same, the LLM gives an output with the `<|python_tag|>`, which should call a tool-calling mechanism. \n",
+    "\n",
+    "This logic gets handled in the program and then the output is passed back to the model to answer the user"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Code interpreter\n",
+    "\n",
+    "With the correct prompt template, Llama model can output Python (as well as code in any-language that the model has been trained on)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 54,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Assistant: <|python_tag|>import math\n",
+      "\n",
+      "# Define the variables\n",
+      "monthly_investment = 400\n",
+      "interest_rate = 0.05\n",
+      "target_amount = 100000\n",
+      "\n",
+      "# Calculate the number of months it would take to reach the target amount\n",
+      "months = 0\n",
+      "current_amount = 0\n",
+      "while current_amount < target_amount:\n",
+      "    current_amount += monthly_investment\n",
+      "    current_amount *= 1 + interest_rate / 12  # Compound interest\n",
+      "    months += 1\n",
+      "\n",
+      "# Print the result\n",
+      "print(f\"It would take {months} months, approximately {months / 12:.2f} years, to reach the target amount of ${target_amount:.2f}.\")\n"
+     ]
+    }
+   ],
+   "source": [
+    "user_input = \"\"\"\n",
+    "\n",
+    "If I can invest 400$ every month at 5% interest rate, how long would it take me to make a 100k$ in investments?\n",
+    "\"\"\"\n",
+    "\n",
+    "print(\"Assistant:\", model_chat(user_input, sys_prompt=SYSTEM_PROMPT))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's validate the output by running the output from the model:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 55,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "It would take 172 months, approximately 14.33 years, to reach the target amount of $100000.00.\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Define the variables\n",
+    "monthly_investment = 400\n",
+    "interest_rate = 0.05\n",
+    "target_amount = 100000\n",
+    "\n",
+    "# Calculate the number of months it would take to reach the target amount\n",
+    "months = 0\n",
+    "current_amount = 0\n",
+    "while current_amount < target_amount:\n",
+    "    current_amount += monthly_investment\n",
+    "    current_amount *= 1 + interest_rate / 12  # Compound interest\n",
+    "    months += 1\n",
+    "\n",
+    "# Print the result\n",
+    "print(f\"It would take {months} months, approximately {months / 12:.2f} years, to reach the target amount of ${target_amount:.2f}.\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 56,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#fin"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}

Різницю між файлами не показано, бо вона завелика
+ 768 - 0
recipes/quickstart/inference/tool_calling/Tool_Calling_201.ipynb