{ "cells": [ { "cell_type": "markdown", "id": "76e3985c-11b2-4a28-9ae5-f6586c3dd4ed", "metadata": {}, "source": [ "# Distillation with Llama 4 and Synthetic Data Kit\n", "\n", "*Copyright (c) Meta Platforms, Inc. and affiliates.\n", "This software may be used and distributed according to the terms of the Llama Community License Agreement.*" ] }, { "cell_type": "markdown", "id": "65ef02bc", "metadata": {}, "source": [ "\"Open" ] }, { "cell_type": "markdown", "id": "3c0fffb5", "metadata": {}, "source": [ "This notebook will walk you through [distilling](https://www.llama.com/docs/how-to-guides/distillation/) model knowledge from [Llama 4](https://www.llama.com/docs/model-cards-and-prompt-formats/llama4) into a smaller [Llama 3.2](https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_2/) model using synthetic training data from [Synthetic Data Kit](https://github.com/meta-llama/synthetic-data-kit). \n", "\n", "### The goal\n", "The goal of this notebook is to distill knowledge from a more powerful model (Llama 4 Scout) into a smaller, less powerful model (Llama 3.2 3B).\n", "\n", "Smaller models have several advantages when compared with larger models: they're faster to generate text, have lower time to first token, and cost less to host since they need less hardware. However, larger models tend to be generalists – that is, they have the ability to perform a wide variety of tasks well. On specific or specialized tasks, smaller models can be just as good as the generalist, larger models. Distillation allows you to take knowledge present in a larger model and transfer it to a smaller model with a minimal drop in quality for narrow tasks.\n", "\n", "### The data\n", "This notebook uses air traffic control data to demonstrate tuning a model towards a specialized field. During distillation, we will fully generate pairs from scratch, because our generalist teacher model has a strong understanding of ATC phraseology. During evaluation, we will evaluate both synthetic pairs as well as actual ATC data.\n", "\n", "We will use the [ATCO2 corpus](https://github.com/idiap/atco2-corpus/tree/main) of air traffic data, an MIT-licensed dataset that contains audio, transcriptions, and additional contextual and metadata for each interaction. For this exercise we will only use the text transcripts, and will use the small (1h) sample dataset to demonstrate how only a small amount of data is actually necessary for fine-tuning the model.\n", "\n", "### Evaluation\n", "To evaluate our model, we will use standard language evaluation metrics such as [perplexity](https://en.wikipedia.org/wiki/Perplexity) and accuracy. We will also use [BLEU](https://en.wikipedia.org/wiki/BLEU) (bilingual evaluation understudy) to measure similarity without requiring that the model matches exactly every word. While originally designed for machine translation, BLEU compares n-gram similarity, meaning that minor word order differences are not penalized." 
] }, { "cell_type": "markdown", "id": "1fa99e42-5556-4b46-ab10-a4bc80c9f578", "metadata": {}, "source": [ "## Prerequisites\n", "#### Hardware Requirements:\n", "\n", "- NVIDIA GPU with at least 80GB VRAM (H100, A100, or similar)\n", " - 8x GPU to run Llama 4 Scout and create the dataset\n", " - 1x GPU to distill and fine-tune the model\n", "- 200GB+ disk space\n", "- 64GB+ system RAM\n", "\n", "#### Software Requirements:\n", "\n", "- CUDA 12.x\n", "- HuggingFace account and token\n", "- Fast internet connection for downloading models\n" ] }, { "cell_type": "markdown", "id": "f6c1aa12-d54f-4b3f-936f-57580b9cf9e2", "metadata": {}, "source": [ "## Preparing your environment" ] }, { "cell_type": "code", "execution_count": null, "id": "a525b411-35a9-4cd3-8e89-355a1e85014e", "metadata": {}, "outputs": [], "source": [ "# Install dependencies\n", "# Some Ubuntu setups may require you to uninstall blinker if it's managed\n", "# by the system package manager. If you see an error about blinker, try\n", "# uninstalling it with `apt remove python3-blinker`.\n", "!apt remove -y python3-blinker\n", "!pip install unsloth_zoo unsloth==2025.8.9 transformers==4.55.4 nltk synthetic-data-kit -q --upgrade" ] }, { "cell_type": "markdown", "id": "1d82a68b-495d-4e56-a854-e42a6e16727d", "metadata": {}, "source": [ "## Generate the synthetic dataset\n", "We will use the synthetic data kit to produce synthetic data to distill our model.\n", "\n", "First, set up the VLLM server. You will need to run this in a separate terminal window\n", "since Jupyter doesn't support long running tasks/servers. Make sure to install vLLM with\n", "`pip install vllm`\n", "\n", "```shell\n", "HF_HOME=/workspace/huggingface_cache \\\n", "HF_TOKEN=$HF_TOKEN \\\n", "vllm serve meta-llama/Llama-4-Scout-17B-16E-Instruct \\\n", " --port 8000 \\\n", " --max-model-len 8192 \\\n", " --gpu-memory-utilization 0.95 \\\n", " --tensor-parallel-size 8\n", "```\n", "\n", "Then check that the server is working properly." 
] }, { "cell_type": "code", "execution_count": 4, "id": "fbde635c-2f15-4efc-90a1-1efbbb6261a1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading config from: /usr/local/lib/python3.10/dist-packages/synthetic_data_kit/config.yaml\n", "Config has LLM provider set to: api-endpoint\n", "Loading config from: /usr/local/lib/python3.10/dist-packages/synthetic_data_kit/config.yaml\n", "Config has LLM provider set to: api-endpoint\n", "Loading config from: config.yaml\n", "Config has LLM provider set to: vllm\n", "\u001b[1;34mEnvironment variable check:\u001b[0m\n", "API_ENDPOINT_KEY: Not found\n", "get_llm_provider returning: vllm\n", "\u001b[?25l\u001b[32m vLLM server is running at \u001b[0m\u001b[4;94mhttp://localhost:8000/v1\u001b[0m\n", "\u001b[2KAvailable models: \u001b[1m{\u001b[0m\u001b[32m'object'\u001b[0m: \u001b[32m'list'\u001b[0m, \u001b[32m'data'\u001b[0m: \u001b[1m[\u001b[0m\u001b[1m{\u001b[0m\u001b[32m'id'\u001b[0m: \n", "\u001b[32m'meta-llama/Llama-4-Scout-17B-16E-Instruct'\u001b[0m, \u001b[32m'object'\u001b[0m: \u001b[32m'model'\u001b[0m, \u001b[32m'created'\u001b[0m: \n", "\u001b[1;36m1752251909\u001b[0m, \u001b[32m'owned_by'\u001b[0m: \u001b[32m'vllm'\u001b[0m, \u001b[32m'root'\u001b[0m: \n", "\u001b[32m'meta-llama/Llama-4-Scout-17B-16E-Instruct'\u001b[0m, \u001b[32m'parent'\u001b[0m: \u001b[3;35mNone\u001b[0m, \u001b[32m'max_model_len'\u001b[0m: \n", "\u001b[1;36m8192\u001b[0m, \u001b[32m'permission'\u001b[0m: \u001b[1m[\u001b[0m\u001b[1m{\u001b[0m\u001b[32m'id'\u001b[0m: \u001b[32m'modelperm-3c8eafb867bb4df4b4d65b45a899ae7a'\u001b[0m, \n", "\u001b[32m'object'\u001b[0m: \u001b[32m'model_permission'\u001b[0m, \u001b[32m'created'\u001b[0m: \u001b[1;36m1752251909\u001b[0m, \u001b[32m'allow_create_engine'\u001b[0m: \n", "\u001b[3;91mFalse\u001b[0m, \u001b[32m'allow_sampling'\u001b[0m: \u001b[3;92mTrue\u001b[0m, \u001b[32m'allow_logprobs'\u001b[0m: \u001b[3;92mTrue\u001b[0m, \u001b[32m'allow_search_indices'\u001b[0m: \n", "\u001b[3;91mFalse\u001b[0m, \u001b[32m'allow_view'\u001b[0m: \u001b[3;92mTrue\u001b[0m, \u001b[32m'allow_fine_tuning'\u001b[0m: \u001b[3;91mFalse\u001b[0m, \u001b[32m'organization'\u001b[0m: \u001b[32m'*'\u001b[0m, \n", "\u001b[32m'group'\u001b[0m: \u001b[3;35mNone\u001b[0m, \u001b[32m'is_blocking'\u001b[0m: \u001b[3;91mFalse\u001b[0m\u001b[1m}\u001b[0m\u001b[1m]\u001b[0m\u001b[1m}\u001b[0m\u001b[1m]\u001b[0m\u001b[1m}\u001b[0m\n", "\u001b[2K\u001b[32m⠋\u001b[0m Checking vLLM server at http://localhost:8000/v1...\n", "\u001b[1A\u001b[2K" ] } ], "source": [ "# Test that the server is working\n", "!synthetic-data-kit -c config.yaml system-check" ] }, { "cell_type": "markdown", "id": "f31d66cc-aa8a-4a08-9422-0425e739fed5", "metadata": {}, "source": [ "If the model is working correctly you should see `VLLM server is running`.\n", "\n", "Next, we will set up our configuration file for generating the data. We will use the QA task for our task, giving an example set of data and then asking the model to create call/response pairs similar to the examples. This is slightly different than an actual QA dataset but demonstrates different tasks can fit into the general framework that synthetic data kit provides." 
] }, { "cell_type": "code", "execution_count": 7, "id": "ba722599-4b0b-4dd9-b43b-fcc16699b0d5", "metadata": {}, "outputs": [], "source": [ "%%bash\n", "\n", "cat > config.yaml << 'EOF'\n", "# generation: Content generation parameters\n", "generation:\n", " temperature: 0.6\n", " top_p: 0.95\n", " chunk_size: 4000\n", " overlap: 200\n", " max_tokens: 4096\n", " num_pairs: 25\n", " batch_size: 2\n", "\n", "llm:\n", " # Provider selection: \"vllm\" or \"api-endpoint\"\n", " provider: \"vllm\"\n", "\n", "# vllm: Configure VLLM server settings\n", "vllm:\n", " api_base: \"http://localhost:8000/v1\"\n", " port: 8000\n", " model: \"meta-llama/Llama-4-Scout-17B-16E-Instruct\"\n", " max_retries: 3\n", " retry_delay: 1.0\n", "\n", "# format: Export format parameters\n", "format:\n", " default: \"jsonl\"\n", " include_metadata: true\n", " pretty_json: true\n", "\n", "# prompts: LLM prompts for different tasks, we have\n", "# to include all of them but we modify the QA generation\n", "prompts:\n", " qa_generation: |\n", " Create {num_pairs} pairs of simulated ATC call/response transcripts.\n", " \n", " Rules:\n", " 1. Use full words instead of numbers, i.e. seven thirty two not 732\n", " 2. Include all phases of flight, first contact/handover, and ground/tower/TRACON\n", " 3. Return JSON format only\n", "\n", " Here are some examples:\n", "\n", " {text}\n", " \n", " summary: |\n", " Summarize this document in 3-5 sentences, focusing on the main topic and key concepts.\n", "\n", " qa_rating: |\n", " You are a helpful JSON processor that rates question-answer pairs.\n", " \n", " Your task is to rate each pair on a scale from 1-10 and return valid JSON with added ratings.\n", " \n", " ONLY return a valid JSON array with the original pairs plus ratings. Do not include any explanations or text outside the JSON.\n", " \n", " Here are the pairs to rate:\n", " \n", " {pairs}\n", "EOF" ] }, { "cell_type": "markdown", "id": "ef65c213-3c65-45eb-ac31-118c9ae8e0b5", "metadata": {}, "source": [ "We also create a dataset of examples to guide the model to producing better synthetic data. We provide 20 examples to produce 500+ training examples from synthetic data kit." 
] }, { "cell_type": "code", "execution_count": 8, "id": "b4aa06f2-9081-4694-aaa6-0a3096fcf124", "metadata": {}, "outputs": [], "source": [ "%%bash\n", "\n", "cat > examples.txt << 'EOF'\n", "JetBlue Eight Three Two, cleared to Boston via LENDO Seven, maintain five thousand, one two four point eight five, squawk four two one five\n", "Cleared to Boston via LENDO Seven, maintain five thousand, one two four point eight five, squawk four two one five, JetBlue Eight Three Two\n", "\n", "Cessna Seven Four Romeo Tango, taxi to Runway Two Four via Alpha, hold short of Runway Two Four\n", "Taxi Runway Two Four via Alpha, hold short Two Four, Seven Four Romeo Tango\n", "\n", "Southwest Two Twenty-Nine, Runway One Six Right, cleared for take-off, wind one niner zero at six\n", "Cleared for take-off One Six Right, Southwest Two Twenty-Nine\n", "\n", "Delta Four Zero Six, contact Departure one two six point niner five\n", "One two six point niner five, Delta Four Zero Six\n", "\n", "FedEx Four Eight Four Heavy, climb and maintain flight level three five zero\n", "Climb and maintain flight level three five zero, FedEx Four Eight Four Heavy\n", "\n", "American One Eight, turn right heading zero niner zero, descend and maintain three thousand, expect ILS Runway Two Seven Left\n", "Right heading zero niner zero, descend three thousand, expect ILS Two Seven Left, American One Eight\n", "\n", "American One Eight, cleared to land Runway Two Seven Left, wind two five zero at one four\n", "Cleared to land Two Seven Left, American One Eight\n", "\n", "American One Eight, cross Runway Two Seven Right at Kilo, then taxi to Gate Alpha Four\n", "Cross Two Seven Right at Kilo, to Alpha Four, American One Eight\n", "\n", "Emirates One Seven Four Heavy, cleared Dubai via the LONAM Two Foxtrot departure, initial climb five thousand feet, QNH one zero zero six, squawk five three five one\n", "Cleared Dubai via LONAM Two Foxtrot, climb five thousand feet, QNH one zero zero six, squawk five three five one, Emirates One Seven Four Heavy\n", "\n", "Qatar Four One Six, push back and start approved, facing south\n", "Push back and start approved, facing south, Qatar Four One Six\n", "\n", "Ryanair Eight Four, taxi to holding point Runway Two Four via Bravo and Delta, hold short\n", "Holding short Two Four via Bravo and Delta, Ryanair Eight Four\n", "\n", "KLM Six Zero Three, line up and wait Runway Two Seven\n", "Line up and wait Two Seven, KLM Six Zero Three\n", "\n", "British Airways Two Seven, cleared to enter oceanic airspace via Track Alpha, flight level three five zero, Mach decimal eight two\n", "Cleared Track Alpha, flight level three five zero, Mach decimal eight two, British Airways Two Seven\n", "\n", "Air France Four Six, climb flight level three eight zero\n", "Climb flight level three eight zero, Air France Four Six\n", "\n", "Singapore Three One, descend to altitude six thousand feet, QNH one zero zero nine, cleared ILS approach Runway Zero Four Right via AKOMA One\n", "Descend six thousand feet, QNH one zero zero nine, cleared ILS Zero Four Right via AKOMA One, Singapore Three One\n", "\n", "Singapore Three One, vacate left via Alpha Seven, contact Ground one two one decimal seven five\n", "Vacate left Alpha Seven, Ground one two one decimal seven five, Singapore Three One\n", "\n", "Speedbird Four Niner, cleared to enter controlled airspace, proceed direct MALBY, climb altitude four thousand feet, QNH one zero one five\n", "Direct MALBY, climb four thousand feet, QNH one zero one five, Speedbird Four Niner\n", 
"\n", "Lufthansa Three Two, descend and maintain two thousand five hundred, cleared visual approach Runway One Six Left, QNH one zero one eight\n", "Descend two thousand five hundred, cleared visual One Six Left, QNH one zero one eight, Lufthansa Three Two\n", "\n", "Emirates One Seven Four Heavy, taxi stand Alpha Seven via Mike and Echo, contact Apron on one two two decimal four\n", "Taxi to stand Alpha Seven via Mike and Echo, one two two decimal four, Emirates One Seven Four Heavy\n", "\n", "Air Canada Eight Eight, Runway Two Four, cleared to land, wind two six zero degrees at eight knots\n", "Cleared to land Runway Two Four, Air Canada Eight Eight\n", "\n", "EOF" ] }, { "cell_type": "markdown", "id": "c6030c2c-1ded-46d4-b76c-d6d5972b51a3", "metadata": {}, "source": [ "We create our synthetic dataset using synthetic-data-kit, running the command in batches in order to create enough examples. This is because weaker models have issues generating large numbers of examples." ] }, { "cell_type": "code", "execution_count": null, "id": "f942fc5a-1f13-4c46-a6cc-f094c558de12", "metadata": {}, "outputs": [], "source": [ "%%bash\n", "\n", "NUM_BATCHES=10\n", "\n", "# Generate synthetic data using `create`\n", "for i in $(seq 1 $NUM_BATCHES); do\n", " synthetic-data-kit -c config.yaml create -n 50 examples.txt -o data/train/$i\n", "done\n", "\n", "# Convert generated data to JSONL format using `save-as`\n", "for i in $(seq 1 $NUM_BATCHES); do\n", " synthetic-data-kit save-as data/train/$i/examples_qa_pairs.json -f jsonl -o data/train/$i/output.jsonl\n", "done\n", "\n", "# Concatenate all output files into one with `cat`\n", "cat $(for i in $(seq 1 $NUM_BATCHES); do echo -n \"data/train/$i/outpxut.jsonl \"; done) > data/train.jsonl\n", "\n", "# Eval doesn't need multiple runs\n", "synthetic-data-kit -c config.yaml create -n 50 examples.txt -o data/eval\n", "synthetic-data-kit save-as data/eval/examples_qa_pairs.json -f jsonl -o data/eval/output.jsonl" ] }, { "cell_type": "code", "execution_count": 3, "id": "7d24c81d-9629-4863-bd72-41381978774d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "500\n", "50\n" ] } ], "source": [ "!cat data/train.jsonl | wc -l\n", "!cat data/eval/output.jsonl | wc -l" ] }, { "cell_type": "markdown", "id": "da9671c7-da3e-48ad-98d2-3ad1689b1288", "metadata": {}, "source": [ "## Preparing the eval dataset\n", "Our human curated eval dataset contains text annotations in the form of XML files. We want to just produce transcripts of the conversation, and do not need to include any other metadata or audio." 
] }, { "cell_type": "code", "execution_count": null, "id": "7a4e1dd2-32dc-4e0b-bddd-b6b6eba4cbb5", "metadata": {}, "outputs": [], "source": [ "# Download the dataset\n", "!mkdir Datasets && cd Datasets && wget https://www.replaywell.com/atco2/download/ATCO2-ASRdataset-v1_beta.tgz && tar xf ATCO2-ASRdataset-v1_beta.tgz >/dev/null 2>&1" ] }, { "cell_type": "code", "execution_count": 5, "id": "303d6b4e-44c1-4154-828e-6e50fa613d1d", "metadata": {}, "outputs": [], "source": [ "import xml.etree.ElementTree as ET\n", "import os\n", "import glob\n", "import re\n", "\n", "def parse_xml_files(directory_path: str):\n", " \"\"\"\n", " Parse all XML files in the specified directory and extract text entries.\n", " \n", " Args:\n", " directory_path: Path to the directory containing XML files\n", " \n", " Returns:\n", " A nested list where each item represents an XML file,\n", " containing a list of text entries from that file\n", " \"\"\"\n", " xml_files = glob.glob(os.path.join(directory_path, \"*.xml\"))\n", " results = []\n", " \n", " for xml_file in xml_files:\n", " try:\n", " tree = ET.parse(xml_file)\n", " root = tree.getroot()\n", " \n", " file_texts = []\n", " \n", " for segment in root.findall('segment'):\n", " text_element = segment.find('text')\n", " if text_element is not None and text_element.text:\n", " # Remove any part of speech details or metadata included in square brackets\n", " raw_text = text_element.text\n", " cleaned_text = re.sub(r\"\\[.*?\\]\", \"\", raw_text)\n", " # Fix some weirdness with non breaking spaces\n", " cleaned_text = cleaned_text.replace('\\xa0', '').replace('\\n', '')\n", " file_texts.append(cleaned_text.strip())\n", " \n", " if file_texts and len(file_texts) >= 2:\n", " results.append(file_texts)\n", " \n", " except ET.ParseError as e:\n", " print(f\"Error parsing {xml_file}: {e}\")\n", " except Exception as e:\n", " print(f\"Error processing {xml_file}: {e}\")\n", " \n", " return results" ] }, { "cell_type": "code", "execution_count": 6, "id": "f57c09c2-70e7-414b-a9ce-b6fc6419553d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Parsed 244\n" ] } ], "source": [ "parsed = parse_xml_files(\"Datasets/ATCO2-ASRdataset-v1_beta/DATA\")\n", "print(f\"Parsed {len(parsed)}\")" ] }, { "cell_type": "code", "execution_count": 7, "id": "192dc7c6-7e19-4958-8673-4999a3a02282", "metadata": {}, "outputs": [], "source": [ "# Llama 3 prompt template\n", "def format_llama(instruction: str, first_message: str, reply: str):\n", " instruction = f\"\"\"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n", "{instruction}\n", "<|eot_id|><|start_header_id|>user<|end_header_id|>\n", "{first_message}\n", "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n", "{reply}\"\"\"\n", " return instruction.format(first_message, reply)\n", "\n", "# Format for our saved json format\n", "def format_json(first_message: str, reply: str):\n", " return {\n", " \"instruction\": \"You are a helpful controller who responds to air traffic control messages.\",\n", " \"input\": first_message,\n", " \"output\": reply,\n", " }\n", "\n", "# Converts the saved json format to llama format for ingestion\n", "def json_to_llama(examples):\n", " instructions = examples[\"instruction\"]\n", " inputs = examples[\"input\"]\n", " outputs = examples[\"output\"]\n", " texts = []\n", " for instruction, input, output in zip(instructions, inputs, outputs):\n", " text = format_llama(instruction, input, output) + tokenizer.eos_token\n", " texts.append(text)\n", " return { 
\"text\" : texts, }" ] }, { "cell_type": "code", "execution_count": 8, "id": "45e61b34-82d6-434c-a3ca-c2a6e9ed6603", "metadata": {}, "outputs": [], "source": [ "import json\n", "\n", "# Grab 100 of the examples for evaluation\n", "messages_eval = []\n", "for message in parsed[0:100]:\n", " messages_eval.append(format_json(message[0], message[1]))\n", "\n", "# Save the dataset in our custom json format\n", "os.makedirs(\"Datasets\", exist_ok=True)\n", "with open(\"Datasets/dataset_eval.json\", 'w') as f:\n", " json.dump(messages_eval, f)" ] }, { "cell_type": "code", "execution_count": 9, "id": "64f08b0c-ff3a-44f4-982c-45528b71365b", "metadata": {}, "outputs": [], "source": [ "from datasets import Dataset\n", "\n", "def json_dataset(path: str):\n", " \"\"\"Create a dataset from a JSON file, used for the ATC dataset.\"\"\"\n", " with open(path, 'r') as f:\n", " data = json.load(f)\n", "\n", " return Dataset.from_list(data)\n", " \n", "def jsonl_dataset(path: str):\n", " \"\"\"Create a dataset from a JSONL file, used for synthetic data.\"\"\"\n", " lines = []\n", " with open(path, 'r') as f:\n", " for line in f:\n", " data = json.loads(line)\n", " lines.append(format_json(data[\"atc\"], data[\"response\"]))\n", "\n", " return Dataset.from_list(lines)" ] }, { "cell_type": "markdown", "id": "69fc4dfb-6f2e-4f71-bdb6-1bfa0193af99", "metadata": {}, "source": [ "## Evaluating the baseline model\n", "To evaluate the baseline results of the model we will use the HuggingFace transformers package and Unsloth for inference. We use two metrics here, **perplexity** and **BLEU**. Perplexity captures the \"surprise\" of the model, and applies on a per-token basis. BLEU is typically used for machine translation, but here is capturing if the response gets the gist of the correct answer, accounting for differences in word order." 
] }, { "cell_type": "code", "execution_count": 10, "id": "314eca4b-9d67-4364-9f68-da9831cce117", "metadata": {}, "outputs": [], "source": [ "# This is where Model weights will be downloaded/used from\n", "cache_dir = \"Models\"" ] }, { "cell_type": "code", "execution_count": 11, "id": "87f62594-1ed3-4d15-9c95-3b52e25b5d03", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.\n", "🦥 Unsloth Zoo will now patch everything to make training faster!\n", "INFO 07-11 18:16:50 [__init__.py:244] Automatically detected platform cuda.\n" ] } ], "source": [ "from unsloth import FastLanguageModel" ] }, { "cell_type": "code", "execution_count": 12, "id": "83c2fea5-2576-49d6-8c6c-75353ecd68ec", "metadata": {}, "outputs": [], "source": [ "import torch\n", "import torch.nn.functional as F\n", "from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction\n", "\n", "def compute_bleu(reference: str, candidate: str) -> float:\n", " \"\"\"\n", " Compute BLEU score between reference and candidate strings.\n", "\n", " Args:\n", " reference: Ground-truth text.\n", " candidate: Generated text to evaluate.\n", "\n", " Returns:\n", " bleu_score: BLEU score (0 to 1).\n", " \"\"\"\n", " reference_tokens = reference.strip().split()\n", " candidate_tokens = candidate.strip().split()\n", "\n", " smoothie = SmoothingFunction().method4\n", " bleu_score = sentence_bleu(\n", " [reference_tokens],\n", " candidate_tokens,\n", " smoothing_function=smoothie\n", " )\n", " return bleu_score\n", "\n", "def compute_loss(model, tokenizer, prompt: str, target: str) -> float:\n", " \"\"\"\n", " Compute loss for a target response given a prompt.\n", "\n", " Args:\n", " model: Pretrained language model.\n", " tokenizer: Tokenizer for the model.\n", " prompt: Input text prompt.\n", " target: Ground-truth text continuation.\n", "\n", " Returns:\n", " loss: Computed loss value.\n", " \"\"\"\n", " # Tokenize separately to keep the prompt boundary\n", " prompt_ids = tokenizer(prompt, return_tensors=\"pt\").input_ids.to(model.device)\n", " target_ids = tokenizer(target, return_tensors=\"pt\").input_ids.to(model.device)\n", "\n", " # Create the combined input\n", " input_ids = torch.cat((prompt_ids, target_ids), dim=1)\n", "\n", " # Labels are the complete prompt and target response\n", " labels = input_ids.clone()\n", "\n", " # Set the tokens up to the end of the prompt to -100 to prevent loss computation there\n", " # This is because we don't care how the model predicts the prompt, just how well it\n", " # completes the text from the end of the prompt onwards\n", " prompt_len = prompt_ids.shape[1]\n", " labels[:, :prompt_len] = -100\n", "\n", " # Use the model to compute the loss\n", " with torch.no_grad():\n", " outputs = model(input_ids=input_ids, labels=labels)\n", " loss = outputs.loss\n", "\n", " # Perplexity is the exponentiated negative log-likelihood\n", " return loss.item()" ] }, { "cell_type": "code", "execution_count": 13, "id": "02570852-cf81-400a-874c-7a39be88313a", "metadata": {}, "outputs": [], "source": [ "from trl import SFTTrainer\n", "from transformers import TrainingArguments\n", "import torch\n", "\n", "def generate(model, tokenizer, text: str, max_new_tokens: int = 100) -> str:\n", " \"\"\"\n", " Generate text from model given an input prompt.\n", " \n", " Args:\n", " model: Pretrained language model.\n", " tokenizer: Corresponding tokenizer.\n", " text: Prompt text.\n", " max_new_tokens: Number of tokens to 
generate.\n", " \n", " Returns:\n", " str: Generated output text.\n", " \"\"\"\n", " inputs = tokenizer(text, return_tensors=\"pt\").to(model.device)\n", " input_ids = inputs[\"input_ids\"]\n", " \n", " outputs = model.generate(\n", " **inputs,\n", " max_new_tokens=max_new_tokens,\n", " do_sample=True, # sampling must be enabled for temperature to take effect\n", " temperature=0.7,\n", " use_cache=True\n", " )\n", " \n", " # Decode only the newly generated tokens (the part after the prompt)\n", " return tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True)" ] }, { "cell_type": "code", "execution_count": null, "id": "9581816b-11cb-4990-a34b-31793ed31ca5", "metadata": {}, "outputs": [], "source": [ "from tqdm.notebook import tqdm\n", "import numpy as np\n", "\n", "def evaluate(model, tokenizer, debug=False):\n", " \"\"\"\n", " This function loads the eval dataset and then loops over it to compute the\n", " metrics. Enable `debug` to show the text generated and the ground truth.\n", " \"\"\"\n", " # Load the dataset\n", " dataset = json_dataset(\"Datasets/dataset_eval.json\")\n", " \n", " # Compute Perplexity and BLEU scores\n", " losses, bleus = [], []\n", " \n", " for convo in tqdm(dataset, desc=\"Evaluating\"):\n", " prompt = format_llama(convo[\"instruction\"], convo[\"input\"], \"\")\n", " output = generate(model, tokenizer, prompt)\n", " ground_truth = convo[\"output\"]\n", "\n", " if debug:\n", " print(\"Input:\\n\", prompt)\n", " print(\"Output\\n\", output)\n", " print(\"GT\\n\", ground_truth)\n", " \n", " # Loss of the ground-truth continuation given the prompt\n", " loss = compute_loss(model, tokenizer, prompt, ground_truth)\n", " # BLEU takes the reference first, then the candidate\n", " bleu = compute_bleu(ground_truth, output)\n", " \n", " losses.append(loss)\n", " bleus.append(bleu)\n", " \n", " # Report metrics\n", " mean_loss = np.mean(losses)\n", " mean_bleu = np.mean(bleus)\n", " mean_ppl = np.exp(mean_loss)\n", " \n", " print(f\"\\n=== Evaluation Results ===\")\n", " print(f\"Average Perplexity: {mean_ppl:.2f}\")\n", " print(f\"Average BLEU Score: {mean_bleu:.2f}\")\n", "\n", " return mean_ppl, mean_bleu" ] }, { "cell_type": "code", "execution_count": 15, "id": "d37c6d11-a55e-4a88-b890-b0fc178ed69c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "==((====))== Unsloth 2025.7.3: Fast Llama patching. Transformers: 4.53.2. vLLM: 0.9.2.\n", " \\\\ /| NVIDIA H100 80GB HBM3. Num GPUs = 1. Max memory: 79.209 GB. Platform: Linux.\n", "O^O/ \\_/ \\ Torch: 2.7.0+cu126. CUDA: 9.0. CUDA Toolkit: 12.6. Triton: 3.3.0\n", "\\ / Bfloat16 = TRUE. FA [Xformers = 0.0.30. 
FA2 = False]\n", " \"-____-\" Free license: http://github.com/unslothai/unsloth\n", "Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!\n" ] }, { "data": { "text/html": [ "<p>[250/250 00:50, Epoch 3/4]</p>\n", "<table>\n", "<tr><th>Step</th><th>Training Loss</th></tr>\n", "<tr><td>1</td><td>4.762000</td></tr>\n", "<tr><td>25</td><td>3.748000</td></tr>\n", "<tr><td>50</td><td>3.111700</td></tr>\n", "<tr><td>75</td><td>2.464300</td></tr>\n", "<tr><td>100</td><td>1.943800</td></tr>\n", "<tr><td>125</td><td>1.564300</td></tr>\n", "<tr><td>150</td><td>1.341300</td></tr>\n", "<tr><td>175</td><td>1.256200</td></tr>\n", "<tr><td>200</td><td>1.153400</td></tr>\n", "<tr><td>225</td><td>1.185300</td></tr>\n", "<tr><td>250</td><td>1.107700</td></tr>\n", "</table>\n", "<p>(loss logged every step; intermediate rows elided)</p>
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Unsloth: Will smartly offload gradients to save VRAM!\n", "✅ Training complete! Model saved to Results\n" ] } ], "source": [ "print(\"🚀 Starting fine-tuning process...\")\n", "cache_dir = \"Models/\"\n", "\n", "# Load base model\n", "tuned_model, tuned_tokenizer = FastLanguageModel.from_pretrained(\n", " model_name=\"unsloth/Llama-3.2-3B-Instruct\",\n", " max_seq_length=2048,\n", " cache_dir=cache_dir,\n", ")\n", "\n", "# Format the dataset\n", "dataset = jsonl_dataset(\"data/train.jsonl\")\n", "dataset = dataset.map(json_to_llama, batched=True)\n", "\n", "# Add LoRA adapters for efficient fine-tuning\n", "tuned_model = FastLanguageModel.get_peft_model(\n", " tuned_model,\n", " r=16,\n", " target_modules=[\"q_proj\", \"k_proj\", \"v_proj\", \"o_proj\"],\n", " lora_alpha=16,\n", " lora_dropout=0,\n", " bias=\"none\",\n", " use_gradient_checkpointing=\"unsloth\",\n", ")\n", "\n", "# Set up training\n", "trainer = SFTTrainer(\n", " model=tuned_model,\n", " tokenizer=tuned_tokenizer,\n", " dataset_text_field=\"text\",\n", " train_dataset=dataset,\n", " max_seq_length=2048,\n", " dataset_num_proc=2,\n", " args=TrainingArguments(\n", " per_device_train_batch_size=8,\n", " gradient_accumulation_steps=1,\n", " warmup_steps=5,\n", " max_steps=250,\n", " learning_rate=2e-5,\n", " fp16=not torch.cuda.is_bf16_supported(),\n", " bf16=torch.cuda.is_bf16_supported(),\n", " logging_steps=1,\n", " optim=\"adamw_8bit\",\n", " weight_decay=0.01,\n", " lr_scheduler_type=\"linear\",\n", " seed=3407,\n", " output_dir=\"Results\",\n", " ),\n", ")\n", "\n", "print(\"🏋️ Training started...\")\n", "trainer.train()\n", "\n", "# Save the fine-tuned model\n", "tuned_model.save_pretrained(\"Results\")\n", "tuned_tokenizer.save_pretrained(\"Results\")\n", "\n", "print(\"✅ Training complete! Model saved to Results\")\n" ] }, { "cell_type": "markdown", "id": "b958845d-0651-4d32-9806-ef59df20cc00", "metadata": {}, "source": [ "## Evaluating the fine-tuned model\n", "Once we have a fine-tuned model, we can re-run our evaluation with the new model! We'll look at the metrics for both, as well as a \"vibe check\" where we manually inspect a few outputs to confirm the model is working how we expect. During evaluation, both metrics as well as manual spot checking are important -- metrics capture broad patterns and spot checking makes up for deficiencies in metrics." 
] }, { "cell_type": "code", "execution_count": 18, "id": "0bd346a3-5abb-4774-bcb2-70f9a38bc3a6", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "00ae0c4a78894ee6844411bb192a0af0", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Map: 0%| | 0/100 [00:00= max_examples:\n", " break" ] }, { "cell_type": "markdown", "id": "eb189949-fbc2-4c09-8724-4f04fc291d6d", "metadata": {}, "source": [ "## Conclusion\n", "By the end of this guide, you should have:\n", "\n", "* ✅ A running vLLM server with a quantized Llama model\n", "* ✅ Infrastructure to create synthetic examples for training\n", "* ✅ A 200+ example synthetic dataset created using Llama 4 Scout\n", "* ✅ A distilled Llama 3.1 8B model\n", "* ✅ Test results showing improved metrics and qualitative results\n", "\n", "What's next?\n", "\n", "* Use an even more powerful model to generate synthetic examples, for example Llama 4 Maverick\n", "* Develop more comprehensive evaluation strategies, including domain-specific metrics\n", "* Extend the dataset to include more data and thus better transfer knowledge\n", "* Examine your dataset using automated tools to understand what's inside and determine gaps" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.0" } }, "nbformat": 4, "nbformat_minor": 5 }