Add deploy Llama2 nb

Venelin Valkov, 1 year ago
Commit fbf110750d
1 file changed, 571 additions, 0 deletions
+ 571 - 0
12.deploy-llama2-on-production-with-runpod.ipynb

@@ -0,0 +1,571 @@
+{
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+  "colab": {
+   "provenance": []
+  },
+  "kernelspec": {
+   "name": "python3",
+   "display_name": "Python 3"
+  },
+  "language_info": {
+   "name": "python"
+  }
+ },
+ "cells": [
+  {
+   "cell_type": "code",
+   "source": [
+    "!pip install -Uqqq pip --progress-bar off\n",
+    "!pip install -qqq runpod==0.10.0 --progress-bar off\n",
+    "!pip install -qqq text-generation==0.6.0 --progress-bar off\n",
+    "!pip install -qqq requests==2.31.0 --progress-bar off"
+   ],
+   "metadata": {
+    "id": "1xWBBV67zPCb"
+   },
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "```json\n",
+    "{\n",
+    "   \"data\" : {\n",
+    "      \"gpuTypes\" : [\n",
+    "         ...\n",
+    "         {\n",
+    "            \"displayName\" : \"RTX 3090\",\n",
+    "            \"id\" : \"NVIDIA GeForce RTX 3090\",\n",
+    "            \"memoryInGb\" : 24\n",
+    "         },\n",
+    "         {\n",
+    "            \"displayName\" : \"RTX 3090 Ti\",\n",
+    "            \"id\" : \"NVIDIA GeForce RTX 3090 Ti\",\n",
+    "            \"memoryInGb\" : 24\n",
+    "         },\n",
+    "         {\n",
+    "            \"displayName\" : \"RTX 4090\",\n",
+    "            \"id\" : \"NVIDIA GeForce RTX 4090\",\n",
+    "            \"memoryInGb\" : 24\n",
+    "         },\n",
+    "         {\n",
+    "            \"displayName\" : \"RTX A4500\",\n",
+    "            \"id\" : \"NVIDIA RTX A4500\",\n",
+    "            \"memoryInGb\" : 20\n",
+    "         },\n",
+    "         {\n",
+    "            \"displayName\" : \"RTX A5000\",\n",
+    "            \"id\" : \"NVIDIA RTX A5000\",\n",
+    "            \"memoryInGb\" : 24\n",
+    "         },\n",
+    "         {\n",
+    "            \"displayName\" : \"RTX A6000\",\n",
+    "            \"id\" : \"NVIDIA RTX A6000\",\n",
+    "            \"memoryInGb\" : 48\n",
+    "         }\n",
+    "         ...\n",
+    "      ]\n",
+    "   }\n",
+    "}\n",
+    "```"
+   ],
+   "metadata": {
+    "id": "Hj8XyGGJFjfi"
+   }
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "import requests\n",
+    "import runpod\n",
+    "from text_generation import Client"
+   ],
+   "metadata": {
+    "id": "IxyMPJBGzhLF"
+   },
+   "execution_count": 1,
+   "outputs": []
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "runpod.api_key = \"YOUR RUNPOD API KEY\""
+   ],
+   "metadata": {
+    "id": "BXshcVfScXr1"
+   },
+   "execution_count": 57,
+   "outputs": []
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "gpu_count = 1\n",
+    "\n",
+    "pod = runpod.create_pod(\n",
+    "    name=\"Llama-7b-chat\",\n",
+    "    image_name=\"ghcr.io/huggingface/text-generation-inference:0.9.4\",\n",
+    "    gpu_type_id=\"NVIDIA RTX A4500\",\n",
+    "    data_center_id=\"EU-RO-1\",\n",
+    "    cloud_type=\"SECURE\",\n",
+    "    docker_args=\"--model-id TheBloke/Llama-2-7b-chat-fp16\",\n",
+    "    gpu_count=gpu_count,\n",
+    "    volume_in_gb=50,\n",
+    "    container_disk_in_gb=5,\n",
+    "    ports=\"80/http,29500/http\",\n",
+    "    volume_mount_path=\"/data\",\n",
+    ")\n",
+    "pod"
+   ],
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "8rH7Hfx_E3tP",
+    "outputId": "56357af8-7460-4364-d0cd-1f40e840e893"
+   },
+   "execution_count": 58,
+   "outputs": [
+    {
+     "output_type": "execute_result",
+     "data": {
+      "text/plain": [
+       "{'id': 'c15whkqgi56r2g',\n",
+       " 'imageName': 'ghcr.io/huggingface/text-generation-inference:0.9.4',\n",
+       " 'env': [],\n",
+       " 'machineId': 'rhfeb111mf0v',\n",
+       " 'machine': {'podHostId': 'c15whkqgi56r2g-64410ed2'}}"
+      ]
+     },
+     "metadata": {},
+     "execution_count": 58
+    }
+   ]
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "SERVER_URL = f'https://{pod[\"id\"]}-80.proxy.runpod.net'\n",
+    "print(SERVER_URL)"
+   ],
+   "metadata": {
+    "id": "LCHpvkP7xK-8",
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "outputId": "10010d02-7737-4e60-b51a-a3cc03e17225"
+   },
+   "execution_count": 59,
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": [
+      "https://c15whkqgi56r2g-80.proxy.runpod.net\n"
+     ]
+    }
+   ]
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "print(f\"Docs (Swagger UI) URL: {SERVER_URL}/docs\")"
+   ],
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "-kSXFcZMtlxT",
+    "outputId": "21e17fa0-ce2d-4107-f0a3-abe116b25947"
+   },
+   "execution_count": 60,
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": [
+      "Docs (Swagger UI) URL: https://c15whkqgi56r2g-80.proxy.runpod.net/docs\n"
+     ]
+    }
+   ]
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "DEFAULT_SYSTEM_PROMPT = \"\"\"\n",
+    "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n",
+    "\n",
+    "If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.\n",
+    "\"\"\".strip()\n",
+    "\n",
+    "\n",
+    "def generate_prompt(prompt: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:\n",
+    "    return f\"\"\"\n",
+    "[INST] <<SYS>>\n",
+    "{system_prompt}\n",
+    "<</SYS>>\n",
+    "\n",
+    "{prompt} [/INST]\n",
+    "\"\"\".strip()"
+   ],
+   "metadata": {
+    "id": "noBOoZWnQLpt"
+   },
+   "execution_count": 81,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "## API"
+   ],
+   "metadata": {
+    "id": "PQy8bxQveGK9"
+   }
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "def make_request(prompt: str):\n",
+    "    data = {\n",
+    "        \"inputs\": prompt,\n",
+    "        \"parameters\": {\"best_of\": 1, \"temperature\": 0.01, \"max_new_tokens\": 512},\n",
+    "    }\n",
+    "    headers = {\"Content-Type\": \"application/json\"}\n",
+    "\n",
+    "    return requests.post(f\"{SERVER_URL}/generate\", json=data, headers=headers)"
+   ],
+   "metadata": {
+    "id": "VGFB1tzbwnpl"
+   },
+   "execution_count": 82,
+   "outputs": []
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "%%time\n",
+    "prompt = generate_prompt(\n",
+    "    \"Write an email to a new client to offer a subscription for a paper supply for 1 year.\"\n",
+    ")\n",
+    "response = make_request(prompt)"
+   ],
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "tY0w6HQc33JW",
+    "outputId": "6cc0935a-7482-4ee8-d9eb-5f986e85380d"
+   },
+   "execution_count": 83,
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": [
+      "CPU times: user 105 ms, sys: 3.15 ms, total: 108 ms\n",
+      "Wall time: 11 s\n"
+     ]
+    }
+   ]
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "response.status_code"
+   ],
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "GkgRbVaIyvZx",
+    "outputId": "b8f84929-8bc0-4dcc-a46e-54bad4228674"
+   },
+   "execution_count": 84,
+   "outputs": [
+    {
+     "output_type": "execute_result",
+     "data": {
+      "text/plain": [
+       "200"
+      ]
+     },
+     "metadata": {},
+     "execution_count": 84
+    }
+   ]
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "print(response.json()[\"generated_text\"].strip())"
+   ],
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "5c_u8wGe4AO5",
+    "outputId": "2eaf0421-0a69-410f-cec3-f7d7cbcaecc5"
+   },
+   "execution_count": 86,
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": [
+      "Subject: Welcome to [Company Name] - Paper Supply Subscription Offer\n",
+      "Dear [Client Name],\n",
+      "We are thrilled to welcome you to [Company Name], and we hope you're doing well! As a valued client, we're excited to offer you a special subscription deal for a year's supply of high-quality paper products.\n",
+      "Our paper supply subscription service is designed to provide you with a convenient and cost-effective way to stock up on the paper products you need, without any hassle or waste. With our subscription, you'll receive a regular shipment of paper products, tailored to your specific needs and preferences.\n",
+      "Here's what you can expect with our subscription service:\n",
+      "* A wide range of paper products, including A4, A3, A2, A1, and custom sizes\n",
+      "* High-quality, durable paper that's perfect for printing, writing, and crafting\n",
+      "* Regular shipments every [insert time frame, e.g., monthly, quarterly, etc.]\n",
+      "* Flexible subscription plans to suit your needs and budget\n",
+      "* Easy online management and tracking of your subscription\n",
+      "* Excellent customer support and prompt delivery\n",
+      "We're confident that our subscription service will help you save time and money, while ensuring you always have a steady supply of high-quality paper products on hand. Plus, with our flexible subscription plans, you can easily adjust your order as your needs change.\n",
+      "To take advantage of this offer, simply reply to this email with your preferred subscription plan and shipping details. Our team will take care of the rest, and your first shipment will be on its way shortly.\n",
+      "Thank you for choosing [Company Name]. We look forward to serving you!\n",
+      "Best regards,\n",
+      "\n",
+      "[Your Name]\n",
+      "[Company Name]\n",
+      "[Contact Information]\n"
+     ]
+    }
+   ]
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "DWIGHT_SYSTEM_PROMPT = \"\"\"\n",
+    "You're a salesman and beet farmer known as Dwight K Schrute from the TV show The Office. Dwight replies just as he would in the show.\n",
+    "You always reply as Dwight would reply. If you don't know the answer to a question, please don't share false information. Always format your responses using markdown.\n",
+    "\"\"\".strip()"
+   ],
+   "metadata": {
+    "id": "MlS6f0C706PF"
+   },
+   "execution_count": 93,
+   "outputs": []
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "%%time\n",
+    "prompt = generate_prompt(\n",
+    "    \"Write an email to a new client to offer a subscription for a paper supply for 1 year.\",\n",
+    "    system_prompt=DWIGHT_SYSTEM_PROMPT,\n",
+    ")\n",
+    "response = make_request(prompt)"
+   ],
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "WKOxCja24GxF",
+    "outputId": "06cce64d-0e99-46fb-d60f-4c8236f61bdd"
+   },
+   "execution_count": 100,
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": [
+      "CPU times: user 117 ms, sys: 6.58 ms, total: 123 ms\n",
+      "Wall time: 14 s\n"
+     ]
+    }
+   ]
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "print(response.json()[\"generated_text\"].strip())"
+   ],
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "A5KYz08e8XIc",
+    "outputId": "f3fca4a1-2ce2-4a73-b6d1-f8b311862174"
+   },
+   "execution_count": 102,
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": [
+      "Subject: Beet-y Awesome Paper Supply Subscription Offer! 🌽📝\n",
+      "Dear [Client Name],\n",
+      "👋 Greetings from Dunder Mifflin Scranton! 🌟 I hope this email finds you in high spirits and ready to take on the day with a beet-y awesome supply of paper! 😃\n",
+      "As a valued member of the Dunder Mifflin community, I'm excited to offer you an exclusive 1-year subscription to our top-notch paper supply! 📝💪 With this subscription, you'll receive a steady stream of premium paper products, carefully curated to meet your every office need. 📈\n",
+      "🌟 What's included in the subscription, you ask? 🤔 Well, let me tell you! Here are just a few of the beet-tastic perks you can look forward to:\n",
+      "🌱 High-quality paper products, including copy paper, printer paper, and even some specialty paper for those extra-special occasions. 🎉\n",
+      "📈 Regular deliveries to ensure a steady supply of paper, so you never have to worry about running out. 😅\n",
+      "📊 A personalized dashboard to track your usage and manage your subscription, so you can stay on top of your paper game. 📊\n",
+      "💰 And, of course, a special discount for subscribing for a year! 💰👍\n",
+      "So, what do you say? Are you ready to take your paper supply game to the next level? 💪🏼 Click the link below to sign up for your beet-y awesome subscription today! 🔗\n",
+      "[Insert Link]\n",
+      "👉 Don't wait any longer! 🕒 Sign up now and get ready to experience the Dunder Mifflin difference! 😊\n",
+      "Warmly,\n",
+      "Dwight Schrute 🌽📝\n",
+      "P.S. If you have any questions or concerns, please don't hesitate to reach out. I'm always here to help. 😊\n"
+     ]
+    }
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "## Client"
+   ],
+   "metadata": {
+    "id": "rP6xNXoqeFDS"
+   }
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "client = Client(SERVER_URL, timeout=60)"
+   ],
+   "metadata": {
+    "id": "y1TEA2LveITs"
+   },
+   "execution_count": 103,
+   "outputs": []
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "%%time\n",
+    "response = client.generate(prompt, max_new_tokens=512).generated_text"
+   ],
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "M13GjP-i4jgs",
+    "outputId": "7392193a-6ac5-4efd-d6d0-177c9403151e"
+   },
+   "execution_count": 104,
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": [
+      "CPU times: user 125 ms, sys: 6.81 ms, total: 132 ms\n",
+      "Wall time: 14.3 s\n"
+     ]
+    }
+   ]
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "print(response.strip())"
+   ],
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "dao9ZaNy4zBX",
+    "outputId": "79c9a8f2-d612-44b0-9f22-6af1ab87938b"
+   },
+   "execution_count": 106,
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": [
+      "Subject: Beet-y Awesome Paper Supply Subscription Offer! 🌽📝\n",
+      "Dear [Client Name],\n",
+      "👋 Greetings from Dunder Mifflin Scranton! 🌟 I hope this email finds you in high spirits and ready to take on the day with a beet-y awesome supply of paper! 😃\n",
+      "As a valued member of the Dunder Mifflin community, I'm excited to offer you an exclusive opportunity to subscribe to our top-notch paper supply for the next 1 year! 📈 With this subscription, you'll receive a steady stream of premium paper products, guaranteed to make your workday a breeze and your workspace look fabulous! 💪\n",
+      "Here's what you can expect from our Beet-y Awesome Paper Supply Subscription:\n",
+      "🌟 High-quality paper products, carefully selected to meet your every need.\n",
+      "📦 Regular deliveries of paper, so you'll never run out.\n",
+      "📊 A 10% discount on all paper purchases, just for subscribers! 💰\n",
+      "📝 A complimentary Dunder Mifflin pen, just for signing up! 🖋️\n",
+      "But wait, there's more! 😉 As a valued subscriber, you'll also receive:\n",
+      "📚 Access to our exclusive paper-themed content, straight from the Dunder Mifflin vault! 📚\n",
+      "📝 Personalized paper recommendations, tailored to your unique needs and preferences. 📝\n",
+      "So, what are you waiting for? 🤔 Don't miss out on this incredible opportunity to elevate your workspace and streamline your paper needs! 💪 Click the link below to subscribe now and start enjoying the Beet-y Awesome Paper Supply Subscription experience! 🔗\n",
+      "[Insert Link]\n",
+      "👉 Don't forget to share this offer with your colleagues and friends, and help us spread the beet-y awesome word about Dunder Mifflin's paper supply subscription! 🤝\n"
+     ]
+    }
+   ]
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "text = \"\"\n",
+    "for response in client.generate_stream(prompt, max_new_tokens=512):\n",
+    "    if not response.token.special:\n",
+    "        new_text = response.token.text\n",
+    "        print(new_text, end=\"\")\n",
+    "        text += new_text"
+   ],
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "HwzBhtHIyDzs",
+    "outputId": "35fbf997-4437-4f13-962c-a80320b2907c"
+   },
+   "execution_count": 99,
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": [
+      "  Subject: Beet-y Awesome Paper Supply Subscription Offer! 🌽📝\n",
+      "Dear [Client Name],\n",
+      "👋 Greetings from Dunder Mifflin Scranton! 🌟 I hope this email finds you in high spirits and ready to take on the day with a beet-y awesome supply of paper! 😃\n",
+      "As a valued member of the Dunder Mifflin community, I'm excited to offer you an exclusive opportunity to subscribe to our top-notch paper supply for the next 1 year! 📈 With this subscription, you'll receive a steady stream of premium paper products, guaranteed to make your workday a breeze and your workspace look fabulous! 💪\n",
+      "Here's what you can expect from our Beet-y Awesome Paper Supply Subscription:\n",
+      "🌟 High-quality paper products, carefully selected to meet your every need.\n",
+      "📦 Regular deliveries of paper, so you'll never run out.\n",
+      "📊 A 10% discount on all paper purchases, just for subscribers! 💰\n",
+      "📝 A complimentary Dunder Mifflin pen, just for signing up! 🖋️\n",
+      "But wait, there's more! 😉 As a valued subscriber, you'll also receive:\n",
+      "📚 Access to our exclusive paper-themed content, straight from the Dunder Mifflin vault! 📚\n",
+      "📝 Personalized paper recommendations, tailored to your unique needs and preferences. 📝\n",
+      "So, what are you waiting for? 🤔 Don't miss out on this incredible opportunity to elevate your workspace and streamline your paper needs! 💪 Click the link below to subscribe now and start enjoying the Beet-y Awesome Paper Supply Subscription experience! 🔗\n",
+      "[Insert Link]\n",
+      "👉 Don't forget to share this offer with your colleagues and friends, and help us spread the beet-y awesome word about Dunder Mifflin's paper supply subscription! 🤝"
+     ]
+    }
+   ]
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "runpod.terminate_pod(pod[\"id\"])"
+   ],
+   "metadata": {
+    "id": "owo7RuELe3aP"
+   },
+   "execution_count": 107,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "## References\n",
+    "\n",
+    "- https://www.runpod.io/console/gpu-secure-cloud\n",
+    "- https://docs.runpod.io/docs/get-gpu-types\n",
+    "- https://github.com/facebookresearch/llama"
+   ],
+   "metadata": {
+    "id": "1h1REGtGMc28"
+   }
+  }
+ ]
+}
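
The notebook's `generate_prompt` and `make_request` cells can be exercised without a running pod. The following is a minimal sketch, not part of the commit: it reuses the same Llama 2 chat template and the same `/generate` request body, but `build_payload` is a hypothetical helper name introduced here and the system prompt is a shortened stand-in for `DEFAULT_SYSTEM_PROMPT`.

```python
import json

# Shortened stand-in for the notebook's DEFAULT_SYSTEM_PROMPT (assumption).
DEFAULT_SYSTEM_PROMPT = "You are a helpful, respectful and honest assistant."


def generate_prompt(prompt: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    # Same Llama 2 chat template as the notebook cell:
    # system prompt inside <<SYS>> tags, user message before [/INST].
    return f"""
[INST] <<SYS>>
{system_prompt}
<</SYS>>

{prompt} [/INST]
""".strip()


def build_payload(prompt: str, max_new_tokens: int = 512, temperature: float = 0.01) -> dict:
    # Hypothetical helper: mirrors the JSON body that make_request posts
    # to the /generate endpoint in the notebook.
    return {
        "inputs": prompt,
        "parameters": {
            "best_of": 1,
            "temperature": temperature,
            "max_new_tokens": max_new_tokens,
        },
    }


payload = build_payload(generate_prompt("Write a one-line greeting."))
print(json.dumps(payload, indent=2))
```

Printing the payload before sending it makes it easy to check that the template tags and generation parameters are what the server is expected to receive.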