@@ -7,12 +7,12 @@
   "source": [
    "## This demo app shows:\n",
    "* How to use LlamaIndex, an open source library to help you build custom data augmented LLM applications\n",
-   "* How to ask Llama questions about recent live data via the You.com live search API and LlamaIndex\n",
+   "* How to ask Llama 3 questions about recent live data via the Tavily live search API\n",
    "\n",
-   "The LangChain package is used to facilitate the call to Llama2 hosted on OctoAI\n",
+   "The LlamaIndex OctoAI integration is used to facilitate the call to Llama 3 hosted on OctoAI\n",
    "\n",
    "**Note** We will be using OctoAI to run the examples here. You will need to first sign into [OctoAI](https://octoai.cloud/) with your Github or Google account, then create a free API token [here](https://octo.ai/docs/getting-started/how-to-create-an-octoai-access-token) that you can use for a while (a month or $10 in OctoAI credits, whichever one runs out first).\n",
-   "After the free trial ends, you will need to enter billing info to continue to use Llama2 hosted on OctoAI."
+   "After the free trial ends, you will need to enter billing info to continue to use Llama 3 hosted on OctoAI."
   ]
  },
  {
@@ -32,23 +32,13 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "!pip install llama-index langchain"
-  ]
- },
- {
-  "cell_type": "code",
-  "execution_count": null,
-  "id": "21fe3849",
-  "metadata": {},
-  "outputs": [],
-  "source": [
-   "# use ServiceContext to configure the LLM used and the custom embeddings\n",
-   "from llama_index import ServiceContext\n",
-   "\n",
-   "# VectorStoreIndex is used to index custom data \n",
-   "from llama_index import VectorStoreIndex\n",
-   "\n",
-   "from langchain.llms.octoai_endpoint import OctoAIEndpoint"
+   "!pip install llama-index\n",
+   "!pip install llama-index-core\n",
+   "!pip install llama-index-llms-octoai\n",
+   "!pip install llama-index-embeddings-octoai\n",
+   "!pip install octoai-sdk\n",
+   "!pip install tavily-python\n",
+   "!pip install replicate"
  ]
 },
 {
@@ -75,227 +65,161 @@
  },
  {
   "cell_type": "markdown",
-  "id": "f8ff812b",
-  "metadata": {},
-  "source": [
-   "In this example we will use the [YOU.com](https://you.com/) search engine to augment the LLM's responses.\n",
-   "To use the You.com Search API, you can email api@you.com to request an API key. "
-  ]
- },
- {
-  "cell_type": "code",
-  "execution_count": null,
-  "id": "75275628-5235-4b55-8033-601c76107528",
-  "metadata": {},
-  "outputs": [],
-  "source": [
-   "YOUCOM_API_KEY = getpass()\n",
-   "os.environ[\"YOUCOM_API_KEY\"] = YOUCOM_API_KEY"
-  ]
- },
- {
-  "cell_type": "markdown",
  "id": "cb210c7c",
  "metadata": {},
  "source": [
-   "We then call the Llama 2 model from OctoAI.\n",
+   "We then call the Llama 3 model from OctoAI.\n",
   "\n",
-   "We will use the Llama 2 13b chat FP16 model. You can find more on Llama 2 models on the [OctoAI text generation solution page](https://octoai.cloud/tools/text).\n",
+   "We will use the Llama 3 8b instruct model. You can find more on Llama models on the [OctoAI text generation solution page](https://octoai.cloud/text).\n",
   "\n",
   "At the time of writing this notebook the following Llama models are available on OctoAI:\n",
-   "* llama-2-13b-chat\n",
-   "* llama-2-70b-chat\n",
+   "* meta-llama-3-8b-instruct\n",
+   "* meta-llama-3-70b-instruct\n",
   "* codellama-7b-instruct\n",
   "* codellama-13b-instruct\n",
   "* codellama-34b-instruct\n",
-   "* codellama-70b-instruct"
+   "* llama-2-13b-chat\n",
+   "* llama-2-70b-chat\n",
+   "* llamaguard-7b"
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
-  "id": "c12fc2cb",
+  "id": "21fe3849",
  "metadata": {},
  "outputs": [],
  "source": [
-   "# set llm to be using Llama2 hosted on OctoAI\n",
-   "llama2_13b = \"llama-2-13b-chat-fp16\"\n",
+   "# use Settings to configure the LLM used and the custom embeddings\n",
+   "from llama_index.core import Settings\n",
   "\n",
-   "llm = OctoAIEndpoint(\n",
-   "    endpoint_url=\"https://text.octoai.run/v1/chat/completions\",\n",
-   "    model_kwargs={\n",
-   "        \"model\": llama2_13b,\n",
-   "        \"messages\": [\n",
-   "            {\n",
-   "                \"role\": \"system\",\n",
-   "                \"content\": \"You are a helpful, respectful and honest assistant.\"\n",
-   "            }\n",
-   "        ],\n",
-   "        \"max_tokens\": 500,\n",
-   "        \"top_p\": 1,\n",
-   "        \"temperature\": 0.01\n",
-   "    },\n",
-   ")"
+   "# VectorStoreIndex is used to index custom data\n",
+   "from llama_index.core import VectorStoreIndex\n",
+   "\n",
+   "from llama_index.embeddings.octoai import OctoAIEmbedding\n",
+   "from llama_index.llms.octoai import OctoAI\n",
+   "\n",
+   "Settings.llm = OctoAI(\n",
+   "    model=\"meta-llama-3-8b-instruct\",\n",
+   "    token=OCTOAI_API_TOKEN,\n",
+   "    temperature=0.0,\n",
+   "    max_tokens=128,\n",
+   ")\n",
+   "\n",
+   "Settings.embed_model = OctoAIEmbedding(api_key=OCTOAI_API_TOKEN)"
  ]
 },
 {
  "cell_type": "markdown",
-  "id": "476d72da",
+  "id": "f8ff812b",
  "metadata": {},
  "source": [
-   "Using our api key we set up earlier, we make a request from YOU.com for live data on a particular topic."
+   "Next you will use the [Tavily](https://tavily.com/) search engine to augment Llama 3's responses. To get a free trial Tavily Search API key, sign in with your Google or Github account [here](https://app.tavily.com/sign-in)."
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
-  "id": "effc9656-b18d-4d24-a80b-6066564a838b",
+  "id": "75275628-5235-4b55-8033-601c76107528",
  "metadata": {},
  "outputs": [],
  "source": [
-   "import requests\n",
+   "from tavily import TavilyClient\n",
   "\n",
-   "query = \"Meta Connect\" # you can try other live data query about sports score, stock market and weather info \n",
-   "headers = {\"X-API-Key\": os.environ[\"YOUCOM_API_KEY\"]}\n",
-   "data = requests.get(\n",
-   "    f\"https://api.ydc-index.io/search?query={query}\",\n",
-   "    headers=headers,\n",
-   ").json()"
+   "TAVILY_API_KEY = getpass()\n",
+   "tavily = TavilyClient(api_key=TAVILY_API_KEY)"
  ]
 },
 {
-  "cell_type": "code",
-  "execution_count": null,
-  "id": "8bed3baf-742e-473c-ada1-4459012a8a2c",
+  "cell_type": "markdown",
+  "id": "476d72da",
  "metadata": {},
-  "outputs": [],
  "source": [
-   "# check the query result in JSON\n",
-   "import json\n",
-   "\n",
-   "print(json.dumps(data, indent=2))"
+   "Do a live web search on \"Llama 3 fine-tuning\"."
  ]
 },
 {
-  "cell_type": "markdown",
-  "id": "b196e697",
+  "cell_type": "code",
+  "execution_count": null,
+  "id": "effc9656-b18d-4d24-a80b-6066564a838b",
  "metadata": {},
+  "outputs": [],
  "source": [
-   "We then use the [`JSONLoader`](https://llamahub.ai/l/file-json) to extract the text from the returned data. The `JSONLoader` gives us the ability to load the data into LamaIndex.\n",
-   "In the next cell we show how to load the JSON result with key info stored as \"snippets\".\n",
-   "\n",
-   "However, you can also add the snippets in the query result to documents like below:\n",
-   "```python \n",
-   "from llama_index import Document\n",
-   "snippets = [snippet for hit in data[\"hits\"] for snippet in hit[\"snippets\"]]\n",
-   "documents = [Document(text=s) for s in snippets]\n",
-   "```\n",
-   "This can be handy if you just need to add a list of text strings to doc"
+   "response = tavily.search(query=\"Llama 3 fine-tuning\")\n",
+   "context = [{\"url\": obj[\"url\"], \"content\": obj[\"content\"]} for obj in response[\"results\"]]"
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
-  "id": "7c40e73f-ca13-4f4a-a753-e613df3d389e",
+  "id": "6b5af98b-c26b-4fd7-8031-31ac4915cdac",
  "metadata": {},
  "outputs": [],
  "source": [
-   "# one way to load the JSON result with key info stored as \"snippets\"\n",
-   "from llama_index import download_loader\n",
-   "\n",
-   "JsonDataReader = download_loader(\"JsonDataReader\")\n",
-   "loader = JsonDataReader()\n",
-   "documents = loader.load_data([hit[\"snippets\"] for hit in data[\"hits\"]])\n"
+   "context"
  ]
 },
 {
  "cell_type": "markdown",
-  "id": "8e5e3b4e",
+  "id": "0f4ea96b-bb00-4a1f-8bd2-7f15237415f6",
  "metadata": {},
  "source": [
-   "With the data set up, we create a vector store for the data and a query engine for it.\n",
-   "\n",
-   "For our embeddings we will use `OctoAIEmbeddings` whose default embedding model is GTE-Large. This model provides a good balance between speed and performance.\n",
-   "\n",
-   "For more info see https://octoai.cloud/tools/text/embeddings?mode=demo&model=thenlper%2Fgte-large. "
+   "Create documents from the search results, index them in a vector store, then create a query engine."
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
-  "id": "a5de3080-2c4b-479c-baba-793b3bee36ed",
+  "id": "7513ac70-155a-4d56-b326-0e8c2733ab99",
  "metadata": {},
  "outputs": [],
  "source": [
-   "# use OctoAI embeddings \n",
-   "from langchain_community.embeddings import OctoAIEmbeddings\n",
-   "from llama_index.embeddings import LangchainEmbedding\n",
-   "\n",
-   "\n",
-   "embeddings = LangchainEmbedding(OctoAIEmbeddings(\n",
-   "    endpoint_url=\"https://text.octoai.run/v1/embeddings\"\n",
-   "))\n",
-   "print(embeddings)\n",
-   "\n",
-   "# create a ServiceContext instance to use Llama2 and custom embeddings\n",
-   "service_context = ServiceContext.from_defaults(llm=llm, chunk_size=800, chunk_overlap=20, embed_model=embeddings)\n",
+   "from llama_index.core import Document\n",
   "\n",
-   "# create vector store index from the documents created above\n",
-   "index = VectorStoreIndex.from_documents(documents, service_context=service_context)\n",
+   "documents = [Document(text=ct[\"content\"]) for ct in context]\n",
+   "index = VectorStoreIndex.from_documents(documents)\n",
   "\n",
-   "# create query engine from the index\n",
-   "query_engine = index.as_query_engine(streaming=False)"
+   "query_engine = index.as_query_engine(streaming=True)"
  ]
 },
 {
  "cell_type": "markdown",
-  "id": "2c4ea012",
+  "id": "df743c62-165c-4834-b1f1-7d7848a6815e",
  "metadata": {},
  "source": [
-   "We are now ready to ask Llama 2 a question about the live data using our query engine."
+   "You are now ready to ask Llama 3 questions about the live data using the query engine."
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
-  "id": "de91a191-d0f2-498e-88dc-b2b43423e0e5",
+  "id": "b2fd905b-575a-45f1-88da-9b093caa232a",
  "metadata": {},
  "outputs": [],
  "source": [
-   "# ask Llama2 a summary question about the search result\n",
   "response = query_engine.query(\"give me a summary\")\n",
-   "print(str(response))"
-  ]
- },
- {
-  "cell_type": "code",
-  "execution_count": null,
-  "id": "72814b20-06aa-4da8-b4dd-f0b0d74a2ea0",
-  "metadata": {},
-  "outputs": [],
-  "source": [
-   "# more questions\n",
-   "print(str(query_engine.query(\"what products were announced\")))"
+   "response.print_response_stream()"
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
-  "id": "a65bc037-a689-476d-b529-0059a27bc949",
+  "id": "88c45380-1d00-46d5-80ac-0eff68fd1f8a",
  "metadata": {},
  "outputs": [],
  "source": [
-   "print(str(query_engine.query(\"tell me more about Meta AI assistant\")))"
+   "query_engine.query(\"what's the latest about Llama 3 fine-tuning?\").print_response_stream()"
  ]
 },
 {
  "cell_type": "code",
  "execution_count": null,
-  "id": "16a56542",
+  "id": "0fe54976-5345-4426-a6f0-dc3bfd45dac3",
  "metadata": {},
  "outputs": [],
  "source": [
-   "print(str(query_engine.query(\"what are Generative AI stickers\")))"
+   "query_engine.query(\"tell me more about Llama 3 fine-tuning\").print_response_stream()"
  ]
 }
],