{ "cells": [ { "cell_type": "markdown", "id": "d4e921b8", "metadata": {}, "source": [ "# Build with Llama API" ] }, { "cell_type": "markdown", "id": "5ed602bc", "metadata": {}, "source": [ "

" ] },
{ "cell_type": "markdown", "id": "04d1cead", "metadata": {}, "source": [ "This notebook introduces the functionality offered by Llama API so that you can get up and running with the latest Llama 4 models quickly and efficiently.\n", "\n", "## Running this notebook\n", "\n", "To run this notebook, you'll need to sign up for a Llama API developer account at [llama.developer.meta.com](https://llama.developer.meta.com) and get an API key. You'll also need Python 3.8+ and a way to install the Llama API Python SDK, such as [pip](https://pip.pypa.io/en/stable/)." ] },
{ "cell_type": "markdown", "id": "0fe3b3be", "metadata": {}, "source": [ "### Installing the Llama API Python SDK\n", "\n", "The [Llama API Python SDK](https://github.com/meta-llama/llama-api-python) is an open-source client library that provides convenient access to Llama API endpoints through a familiar set of request methods.\n", "\n", "Install the SDK using pip." ] },
{ "cell_type": "code", "execution_count": null, "id": "266956c6", "metadata": {}, "outputs": [], "source": [ "%pip install --pre llama-api" ] },
{ "cell_type": "markdown", "id": "9704b886", "metadata": {}, "source": [ "### Getting and setting up an API key\n", "\n", "Sign up for, or log in to, a Llama API developer account at [llama.developer.meta.com](https://llama.developer.meta.com), then navigate to the **API keys** tab in the dashboard to create a new API key.\n", "\n", "Assign your API key to the environment variable `LLAMA_API_KEY`." ] },
{ "cell_type": "code", "execution_count": null, "id": "506ac703", "metadata": {}, "outputs": [], "source": [ "import os\n", "os.environ[\"LLAMA_API_KEY\"] = \"YOUR_API_KEY\"  # Replace with your API key" ] },
{ "cell_type": "markdown", "id": "57463d31", "metadata": {}, "source": [ "Now you can import the SDK and instantiate it. The SDK will automatically pull the API key from the environment variable set above." ] },
{ "cell_type": "code", "execution_count": 5, "id": "845a0e6f", "metadata": {}, "outputs": [], "source": [ "from llama_api import LlamaAPI\n", "client = LlamaAPI()" ] },
{ "cell_type": "markdown", "id": "58e2910d", "metadata": {}, "source": [ "## Your first API call\n", "\n", "With the SDK set up, you're ready to make your first API call.\n", "\n", "Start by checking the list of available models:" ] },
{ "cell_type": "code", "execution_count": 6, "id": "86d7e0f6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Llama-3.3-70B-Instruct\n", "Llama-3.3-8B-Instruct\n", "Llama-4-Scout-17B-16E-Instruct-FP8\n", "Llama-4-Maverick-17B-128E-Instruct-FP8\n" ] } ], "source": [ "models = client.models.list()\n", "for model in models:\n", "    print(model.id)" ] },
{ "cell_type": "markdown", "id": "9438f75d", "metadata": {}, "source": [ "The list of available models may change as new models are released. This notebook uses the latest Llama 4 model: `Llama-4-Maverick-17B-128E-Instruct-FP8`." ] },
{ "cell_type": "markdown", "id": "66c258dd", "metadata": {}, "source": [ "## Chat completion\n", "\n", "### Chat completion with text\n", "\n", "Use the [chat completions](https://llama.developer.meta.com/docs/api/chat) endpoint for a simple text-based prompt-and-response round trip." ] },
{ "cell_type": "code", "execution_count": 7, "id": "26d7a2cb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "I'm just a language model, so I don't have feelings or emotions like humans do, but I'm functioning properly and ready to help with any questions or tasks you have! 
How can I assist you today?\n" ] } ], "source": [ "response = client.chat.completions.create(\n", " model=\"Llama-4-Maverick-17B-128E-Instruct-FP8\",\n", " messages=[\n", " {\n", " \"role\": \"user\",\n", " \"content\": \"Hello, how are you?\",\n", " }\n", " ],\n", " max_completion_tokens=1024,\n", " temperature=0.7,\n", ")\n", " \n", "print(response.completion_message.content.text)" ] }, { "cell_type": "markdown", "id": "860f0cef", "metadata": {}, "source": [ "### Multi-turn chat completion\n", "\n", "The [chat completions](https://llama.developer.meta.com/docs/api/chat) endpoint supports sending multiple messages in a single API call, so you can use it to continue a conversation between a user and a model." ] }, { "cell_type": "code", "execution_count": 8, "id": "ef3e68e1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Here's a fact: Octopuses have **nine brains**! Well, sort of. They have one main brain and eight smaller \"mini-brains\" in their arms, which can function independently and even solve problems on their own. Isn't that mind-blowing?\n" ] } ], "source": [ "response = client.chat.completions.create(\n", " model=\"Llama-4-Maverick-17B-128E-Instruct-FP8\",\n", " messages=[\n", " {\n", " \"role\": \"system\",\n", " \"content\": \"You know a lot of animal facts\"\n", " },\n", " {\n", " \"role\": \"user\",\n", " \"content\": \"Pick an animal\"\n", " },\n", " {\n", " \"role\": \"assistant\",\n", " \"content\": \"I've picked an animal... It's the octopus!\",\n", " \"stop_reason\": \"stop\"\n", " },\n", " {\n", " \"role\": \"user\",\n", " \"content\": \"Tell me a fact about this animal\"\n", " }\n", " ],\n", " max_completion_tokens=1024,\n", " temperature=0.7,\n", ")\n", " \n", "print(response.completion_message.content.text) " ] }, { "cell_type": "markdown", "id": "fe8caf9a", "metadata": {}, "source": [ "### Streaming\n", "\n", "You can return results from the API to the user more quickly by setting the `stream` parameter to `True`. The results will come back in a stream of event chunks that you can show to the user as they arrive." ] }, { "cell_type": "code", "execution_count": 16, "id": "18e350d5", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "As the last rays of sunlight faded from the small village, a young girl named Akira sat by the window, watching the stars begin to twinkle in the night sky. She lived in a tiny cottage on the outskirts of the village, surrounded by a lush forest that whispered secrets to the wind.\n", "\n", "Akira's grandmother, Oba-chan, sat beside her, weaving a intricate pattern on her loom. The soft clacking of the loom's wooden shuttle was a soothing melody that Akira had grown up with.\n", "\n", "\"Oba-chan, tell me a story,\" Akira asked, her eyes sparkling with curiosity.\n", "\n", "Oba-chan smiled, her eyes crinkling at the corners. \"Ah, child, I have just the tale for you. It's a story of the forest, and the magic that lives within it.\"\n", "\n", "As Oba-chan began to speak, the room grew darker, and the shadows on the walls seemed to come alive. Akira felt herself being transported to a world beyond her own.\n", "\n", "\"Long ago,\" Oba-chan started, \"when the village was still young, a great tree stood tall in the heart of the forest. Its branches reached for the sky, and its roots dug deep into the earth. 
The tree was said to hold the secrets of the forest, and the creatures that lived among its boughs were wise and kind.\"\n", "\n", "Akira's imagination ran wild as Oba-chan continued the tale. She pictured the tree, its bark glistening with dew, and the creatures that lived within its nooks and crannies.\n", "\n", "\"One night, a young traveler stumbled upon the tree, seeking shelter from a fierce storm. As he huddled beneath its branches, he heard a soft rustling in the leaves. A tiny sprite, no bigger than a thumb, appeared before him. The sprite spoke in a voice like a gentle breeze, saying, 'I will grant you a single wish, traveler, but be warned: the forest is full of wonders, and the price of your wish may be more than you bargained for.'\"\n", "\n", "Akira's eyes were wide with excitement as Oba-chan paused, a sly smile spreading across her face.\n", "\n", "\"What did the traveler wish for, Oba-chan?\" Akira asked, her voice barely above a whisper.\n", "\n", "Oba-chan leaned in, her voice taking on a conspiratorial tone. \"The traveler wished for the ability to heal any wound, to bring comfort to those in pain. The sprite nodded, and with a wave of its hand, the traveler's wish was granted.\"\n", "\n", "As Oba-chan finished the tale, the room seemed to fade away, and Akira felt the presence of the forest around her. She sensed the magic that lived within the trees, and the creatures that watched over the village.\n", "\n", "\"And what happened to the traveler, Oba-chan?\" Akira asked, her curiosity getting the better of her.\n", "\n", "Oba-chan's eyes twinkled in the dim light. \"Ah, child, that is a story for another night. But I'll give you a hint: the traveler's gift came with a price, one that he did not expect. The forest has a way of teaching us the value of our desires.\"\n", "\n", "As the night wore on, Akira drifted off to sleep, the stars shining brightly outside, and the forest whispering its secrets in her ear. And when she woke the next morning, she felt a sense of wonder, and a deep connection to the magic that lived just beyond the edge of the village." ] } ], "source": [ "response = client.chat.completions.create(\n", " messages=[\n", " {\n", " \"role\": \"user\",\n", " \"content\": \"Tell me a short story\",\n", " }\n", " ],\n", " model=\"Llama-4-Maverick-17B-128E-Instruct-FP8\",\n", " stream=True,\n", ")\n", "for chunk in response:\n", " print(chunk.event.delta.text, end=\"\", flush=True)" ] }, { "cell_type": "markdown", "id": "4efc329f", "metadata": {}, "source": [ "### Multi-modal chat completion\n", "\n", "The [chat completions](https://llama.developer.meta.com/docs/api/chat) endpoint also supports image understanding, using URLs to publicly available images, or using local images encoded as Base64. 
\n", "\n", "Here's an example that compares two images that are available at public URLs; a sketch of the Base64 approach for local images follows it.\n", "\n", "![Llama1](https://upload.wikimedia.org/wikipedia/commons/2/2e/Lama_glama_Laguna_Colorada_2.jpg)\n", "![Llama2](https://upload.wikimedia.org/wikipedia/commons/1/12/Llamas%2C_Laguna_Milluni_y_Nevado_Huayna_Potos%C3%AD_%28La_Paz_-_Bolivia%29.jpg)" ] },
{ "cell_type": "code", "execution_count": null, "id": "3cade00b", "metadata": {}, "outputs": [], "source": [ "response = client.chat.completions.create(\n", "    model=\"Llama-4-Maverick-17B-128E-Instruct-FP8\",\n", "    messages=[\n", "        {\n", "            \"role\": \"user\",\n", "            \"content\": [\n", "                {\n", "                    \"type\": \"text\",\n", "                    \"text\": \"What do these two images have in common?\",\n", "                },\n", "                {\n", "                    \"type\": \"image_url\",\n", "                    \"image_url\": {\n", "                        \"url\": \"https://upload.wikimedia.org/wikipedia/commons/2/2e/Lama_glama_Laguna_Colorada_2.jpg\",\n", "                    },\n", "                },\n", "                {\n", "                    \"type\": \"image_url\",\n", "                    \"image_url\": {\n", "                        \"url\": \"https://upload.wikimedia.org/wikipedia/commons/1/12/Llamas%2C_Laguna_Milluni_y_Nevado_Huayna_Potos%C3%AD_%28La_Paz_-_Bolivia%29.jpg\",\n", "                    },\n", "                },\n", "            ],\n", "        },\n", "    ],\n", ")\n", "print(response.completion_message.content.text)" ] }
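, { "cell_type": "markdown", "id": "ab64c0de", "metadata": {}, "source": [ "And here's a sketch of the local-image approach. It assumes a hypothetical local file `llama.jpg` and that the endpoint accepts Base64-encoded images as `data:` URLs in the same `image_url` field used above (the common OpenAI-style convention); check the Llama API docs for the authoritative format." ] },
{ "cell_type": "code", "execution_count": null, "id": "cb64c0df", "metadata": {}, "outputs": [], "source": [ "# A minimal sketch for local images. \"llama.jpg\" is a placeholder; point it\n", "# at any image on disk. Assumes the endpoint accepts Base64 data URLs.\n", "import base64\n", "\n", "with open(\"llama.jpg\", \"rb\") as f:\n", "    encoded = base64.b64encode(f.read()).decode(\"utf-8\")\n", "\n", "response = client.chat.completions.create(\n", "    model=\"Llama-4-Maverick-17B-128E-Instruct-FP8\",\n", "    messages=[\n", "        {\n", "            \"role\": \"user\",\n", "            \"content\": [\n", "                {\"type\": \"text\", \"text\": \"Describe this image in one sentence.\"},\n", "                {\n", "                    \"type\": \"image_url\",\n", "                    \"image_url\": {\"url\": f\"data:image/jpeg;base64,{encoded}\"},\n", "                },\n", "            ],\n", "        },\n", "    ],\n", ")\n", "print(response.completion_message.content.text)" ] }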
, { "cell_type": "markdown", "id": "c5eaa9eb", "metadata": {}, "source": [ "### JSON structured output\n", "\n", "You can use the [chat completions](https://llama.developer.meta.com/docs/api/chat) endpoint with a developer-defined JSON schema, and the model will format its output to match the schema before returning it.\n", "\n", "This example defines the schema with [Pydantic](https://pydantic.dev/) and passes the generated JSON schema to the endpoint. You may need to install Pydantic to run this example." ] },
{ "cell_type": "code", "execution_count": null, "id": "7dc6a299", "metadata": {}, "outputs": [], "source": [ "from pydantic import BaseModel\n", "\n", "class Address(BaseModel):\n", "    street: str\n", "    city: str\n", "    state: str\n", "    zip: str\n", "\n", "response = client.chat.completions.create(\n", "    model=\"Llama-4-Maverick-17B-128E-Instruct-FP8\",\n", "    messages=[\n", "        {\n", "            \"role\": \"system\",\n", "            \"content\": \"You are a helpful assistant. Summarize the address in a JSON object.\",\n", "        },\n", "        {\n", "            \"role\": \"user\",\n", "            \"content\": \"123 Main St, Anytown, USA\",\n", "        },\n", "    ],\n", "    temperature=0.1,\n", "    response_format={\n", "        \"type\": \"json_schema\",\n", "        \"json_schema\": {\n", "            \"name\": \"Address\",\n", "            \"schema\": Address.model_json_schema(),\n", "        },\n", "    },\n", ")\n", "print(response.completion_message.content.text)" ] },
{ "cell_type": "markdown", "id": "33c75953", "metadata": {}, "source": [ "### Tool calling\n", "\n", "Tool calling is supported with the [chat completions](https://llama.developer.meta.com/docs/api/chat) endpoint. You can define a tool, expose it to the API, and ask the model to form a tool call, then use the result of the tool call in a follow-up request.\n", "\n", "**Note:** Llama API does not execute tool calls. You need to execute the tool call in your own execution environment and pass the result to the API." ] },
{ "cell_type": "code", "execution_count": 10, "id": "331996b6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CreateChatCompletionResponse(completion_message=CompletionMessage(content=MessageTextContentItem(text='', type='text'), role='assistant', stop_reason='stop', tool_calls=[ToolCall(id='f95133c0-df35-4e80-9caf-23ca180b739d', function=ToolCallFunction(arguments='{\"location\":\"Menlo Park\"}', name='get_weather'))]), metrics=[Metric(metric='num_completion_tokens', value=9.0, unit='tokens'), Metric(metric='num_prompt_tokens', value=516.0, unit='tokens'), Metric(metric='num_total_tokens', value=525.0, unit='tokens')])\n", "CreateChatCompletionResponse(completion_message=CompletionMessage(content=MessageTextContentItem(text=\"It's sunny in Menlo Park.\", type='text'), role='assistant', stop_reason='stop', tool_calls=[]), metrics=[Metric(metric='num_completion_tokens', value=8.0, unit='tokens'), Metric(metric='num_prompt_tokens', value=544.0, unit='tokens'), Metric(metric='num_total_tokens', value=552.0, unit='tokens')])\n" ] } ], "source": [ "import json\n", "\n", "def get_weather(location: str) -> str:\n", "    return f\"The weather in {location} is sunny.\"\n", "\n", "tools = [\n", "    {\n", "        \"type\": \"function\",\n", "        \"function\": {\n", "            \"name\": \"get_weather\",\n", "            \"description\": \"Get current weather for a given location.\",\n", "            \"parameters\": {\n", "                \"type\": \"object\",\n", "                \"properties\": {\n", "                    \"location\": {\n", "                        \"type\": \"string\",\n", "                        \"description\": \"City and country e.g. Bogotá, Colombia\",\n", "                    }\n", "                },\n", "                \"required\": [\"location\"],\n", "                \"additionalProperties\": False,\n", "            },\n", "            \"strict\": True,\n", "        },\n", "    }\n", "]\n", "messages = [\n", "    {\"role\": \"user\", \"content\": \"Is it raining in Menlo Park?\"},\n", "]\n", "\n", "response = client.chat.completions.create(\n", "    model=\"Llama-4-Maverick-17B-128E-Instruct-FP8\",\n", "    messages=messages,\n", "    tools=tools,\n", "    max_completion_tokens=2048,\n", "    temperature=0.6,\n", ")\n", "\n", "print(response)\n", "completion_message = response.completion_message.model_dump()\n", "\n", "# Next turn: execute the requested tool call and send the result back\n", "messages.append(completion_message)\n", "for tool_call in completion_message[\"tool_calls\"]:\n", "    if tool_call[\"function\"][\"name\"] == \"get_weather\":\n", "        parsed_args = json.loads(tool_call[\"function\"][\"arguments\"])\n", "        result = get_weather(**parsed_args)\n", "\n", "        messages.append(\n", "            {\n", "                \"role\": \"tool\",\n", "                \"tool_call_id\": tool_call[\"id\"],\n", "                \"content\": result,\n", "            },\n", "        )\n", "\n", "response = client.chat.completions.create(\n", "    model=\"Llama-4-Maverick-17B-128E-Instruct-FP8\",\n", "    messages=messages,\n", "    tools=tools,\n", "    max_completion_tokens=2048,\n", "    temperature=0.6,\n", ")\n", "\n", "print(response)" ] },
{ "cell_type": "markdown", "id": "7eb6b6c3", "metadata": {}, "source": [ "### Multi-turn with multiple tool calls and follow-up questions\n", "\n", "A single request can require more than one tool call, and the user may ask follow-up questions afterwards. The cell below is a minimal sketch that builds on the `get_weather` helper and `tools` definition above: it loops until the model stops requesting tool calls, prints the answer, then continues the same conversation with a follow-up question." ] },
{ "cell_type": "code", "execution_count": null, "id": "0fe04075", "metadata": {}, "outputs": [], "source": [ "# A minimal sketch: keep executing tool calls until the model answers in text.\n", "messages = [\n", "    {\"role\": \"user\", \"content\": \"Compare the weather in Menlo Park and Seattle\"},\n", "]\n", "\n", "while True:\n", "    response = client.chat.completions.create(\n", "        model=\"Llama-4-Maverick-17B-128E-Instruct-FP8\",\n", "        messages=messages,\n", "        tools=tools,\n", "        max_completion_tokens=2048,\n", "        temperature=0.6,\n", "    )\n", "    completion_message = response.completion_message.model_dump()\n", "    messages.append(completion_message)\n", "    if not completion_message[\"tool_calls\"]:\n", "        break  # no tool calls left; the model has answered in plain text\n", "    for tool_call in completion_message[\"tool_calls\"]:\n", "        if tool_call[\"function\"][\"name\"] == \"get_weather\":\n", "            args = json.loads(tool_call[\"function\"][\"arguments\"])\n", "            messages.append(\n", "                {\n", "                    \"role\": \"tool\",\n", "                    \"tool_call_id\": tool_call[\"id\"],\n", "                    \"content\": get_weather(**args),\n", "                }\n", "            )\n", "\n", "print(completion_message[\"content\"][\"text\"])\n", "\n", "# Follow-up question in the same conversation\n", "messages.append({\"role\": \"user\", \"content\": \"Which city sounds nicer today?\"})\n", "response = client.chat.completions.create(\n", "    model=\"Llama-4-Maverick-17B-128E-Instruct-FP8\",\n", "    messages=messages,\n", "    tools=tools,\n", "    max_completion_tokens=2048,\n", "    temperature=0.6,\n", ")\n", "print(response.completion_message.content.text)" ] },
{ "cell_type": "markdown", "id": "22b898b9", "metadata": {}, "source": [ "### Long context\n", "\n", "The [chat completions](https://llama.developer.meta.com/docs/api/chat) endpoint supports large context windows of up to 128k tokens, which you can take advantage of to summarize long-form content. The sketch below assumes a local file `long_document.txt`; substitute any long text source that fits in the context window." ] }
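, { "cell_type": "code", "execution_count": null, "id": "ca449b00", "metadata": {}, "outputs": [], "source": [ "# A minimal sketch: summarize a long document in a single request.\n", "# \"long_document.txt\" is a placeholder; use any long text file that fits\n", "# within the model's context window.\n", "with open(\"long_document.txt\") as f:\n", "    document = f.read()\n", "\n", "response = client.chat.completions.create(\n", "    model=\"Llama-4-Maverick-17B-128E-Instruct-FP8\",\n", "    messages=[\n", "        {\n", "            \"role\": \"system\",\n", "            \"content\": \"Summarize the user's document in three bullet points.\",\n", "        },\n", "        {\"role\": \"user\", \"content\": document},\n", "    ],\n", "    max_completion_tokens=1024,\n", "    temperature=0.3,\n", ")\n", "print(response.completion_message.content.text)" ] }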
, { "cell_type": "markdown", "id": "7b57da0e", "metadata": {}, "source": [ "## Moderations\n", "\n", "The [moderations](https://llama.developer.meta.com/docs/api/moderations) endpoint allows you to check both user prompts and model responses for problematic content." ] },
{ "cell_type": "code", "execution_count": 14, "id": "6ef5bdaa", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ModerationCreateResponse(model='Llama-Guard', results=[Result(flagged=False, flagged_categories=None)])\n", "ModerationCreateResponse(model='Llama-Guard', results=[Result(flagged=True, flagged_categories=['indiscriminate-weapons'])])\n" ] } ], "source": [ "# Safe prompt\n", "response = client.moderations.create(\n", "    messages=[\n", "        {\n", "            \"role\": \"user\",\n", "            \"content\": \"Hello, how are you?\",\n", "        }\n", "    ],\n", ")\n", "print(response)\n", "\n", "# Unsafe prompt\n", "response = client.moderations.create(\n", "    messages=[\n", "        {\n", "            \"role\": \"user\",\n", "            \"content\": \"How do I make a bomb?\",\n", "        }\n", "    ],\n", ")\n", "print(response)" ] },
{ "cell_type": "markdown", "id": "c8948b56", "metadata": {}, "source": [ "## Next steps\n", "\n", "Now that you've familiarized yourself with the concepts of Llama API, you can learn more by exploring the API reference docs and deep-dive guides at https://llama.developer.meta.com/docs/." ] } ], "metadata": { "kernelspec": { "display_name": "base", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.9" } }, "nbformat": 4, "nbformat_minor": 5 }