{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "30eb1704-8d76-4bc9-9308-93243aeb69cb",
   "metadata": {},
   "source": [
    "## This demo app shows:\n",
    "* How to use LlamaIndex, an open source library to help you build custom data augmented LLM applications\n",
    "* How to ask Llama 3 questions about recent live data via the Tavily live search API\n",
    "\n",
    "The LangChain package is used to facilitate the call to Llama 3 hosted on OctoAI\n",
    "\n",
    "**Note** We will be using OctoAI to run the examples here. You will need to first sign into [OctoAI](https://octoai.cloud/) with your Github or Google account, then create a free API token [here](https://octo.ai/docs/getting-started/how-to-create-an-octoai-access-token) that you can use for a while (a month or $10 in OctoAI credits, whichever one runs out first).\n",
    "After the free trial ends, you will need to enter billing info to continue to use Llama3 hosted on OctoAI."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "68cf076e",
   "metadata": {},
   "source": [
    "We start by installing the necessary packages:\n",
    "- [langchain](https://python.langchain.com/docs/get_started/introduction) which provides RAG capabilities\n",
    "- [llama-index](https://docs.llamaindex.ai/en/stable/) for data augmentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1d0005d6-e928-4d1a-981b-534a40e19e56",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install llama-index \n",
    "!pip install llama-index-core\n",
    "!pip install llama-index-llms-octoai\n",
    "!pip install llama-index-embeddings-octoai\n",
    "!pip install octoai-sdk\n",
    "!pip install tavily-python\n",
    "!pip install replicate"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "73e8e661",
   "metadata": {},
   "source": [
    "Next we set up the OctoAI token."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d9d76e33",
   "metadata": {},
   "outputs": [],
   "source": [
    "from getpass import getpass\n",
    "import os\n",
    "\n",
    "OCTOAI_API_TOKEN = getpass()\n",
    "os.environ[\"OCTOAI_API_TOKEN\"] = OCTOAI_API_TOKEN"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cb210c7c",
   "metadata": {},
   "source": [
    "We then call the Llama 3 model from OctoAI.\n",
    "\n",
    "We will use the Llama 3 8b instruct model. You can find more on Llama models on the [OctoAI text generation solution page](https://octoai.cloud/text).\n",
    "\n",
    "At the time of writing this notebook the following Llama models are available on OctoAI:\n",
    "* meta-llama-3-8b-instruct\n",
    "* meta-llama-3-70b-instruct\n",
    "* codellama-7b-instruct\n",
    "* codellama-13b-instruct\n",
    "* codellama-34b-instruct\n",
    "* llama-2-13b-chat\n",
    "* llama-2-70b-chat\n",
    "* llamaguard-7b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "21fe3849",
   "metadata": {},
   "outputs": [],
   "source": [
    "# use ServiceContext to configure the LLM used and the custom embeddings\n",
    "from llama_index.core import ServiceContext\n",
    "\n",
    "# VectorStoreIndex is used to index custom data \n",
    "from llama_index.core import VectorStoreIndex\n",
    "\n",
    "from llama_index.core import Settings, VectorStoreIndex\n",
    "from llama_index.embeddings.octoai import OctoAIEmbedding\n",
    "from llama_index.llms.octoai import OctoAI\n",
    "\n",
    "Settings.llm = OctoAI(\n",
    "    model=\"meta-llama-3-8b-instruct\",\n",
    "    token=OCTOAI_API_TOKEN,\n",
    "    temperature=0.0,\n",
    "    max_tokens=128,\n",
    ")\n",
    "\n",
    "Settings.embed_model = OctoAIEmbedding(api_key=OCTOAI_API_TOKEN)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f8ff812b",
   "metadata": {},
   "source": [
    "Next you will use the [Tavily](https://tavily.com/) search engine to augment the Llama 3's responses. To create a free trial Tavily Search API, sign in with your Google or Github account [here](https://app.tavily.com/sign-in)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "75275628-5235-4b55-8033-601c76107528",
   "metadata": {},
   "outputs": [],
   "source": [
    "from tavily import TavilyClient\n",
    "\n",
    "TAVILY_API_KEY = getpass()\n",
    "tavily = TavilyClient(api_key=TAVILY_API_KEY)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "476d72da",
   "metadata": {},
   "source": [
    "Do a live web search on \"Llama 3 fine-tuning\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "effc9656-b18d-4d24-a80b-6066564a838b",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = tavily.search(query=\"Llama 3 fine-tuning\")\n",
    "context = [{\"url\": obj[\"url\"], \"content\": obj[\"content\"]} for obj in response['results']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6b5af98b-c26b-4fd7-8031-31ac4915cdac",
   "metadata": {},
   "outputs": [],
   "source": [
    "context"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0f4ea96b-bb00-4a1f-8bd2-7f15237415f6",
   "metadata": {},
   "source": [
    "Create documents based on the search results, index and save them to a vector store, then create a query engine."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7513ac70-155a-4d56-b326-0e8c2733ab99",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import Document\n",
    "\n",
    "documents = [Document(text=ct['content']) for ct in context]\n",
    "index = VectorStoreIndex.from_documents(documents)\n",
    "\n",
    "query_engine = index.as_query_engine(streaming=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df743c62-165c-4834-b1f1-7d7848a6815e",
   "metadata": {},
   "source": [
    "You are now ready to ask Llama 3 questions about the live data using the query engine."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b2fd905b-575a-45f1-88da-9b093caa232a",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = query_engine.query(\"give me a summary\")\n",
    "response.print_response_stream()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "88c45380-1d00-46d5-80ac-0eff68fd1f8a",
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine.query(\"what's the latest about Llama 3 fine-tuning?\").print_response_stream()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0fe54976-5345-4426-a6f0-dc3bfd45dac3",
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine.query(\"tell me more about Llama 3 fine-tuning\").print_response_stream()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}