|
@@ -1,13 +1,21 @@
|
|
|
{
|
|
|
"cells": [
|
|
|
{
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "1f53f753-12c6-4fac-b910-6e96677d8a49",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "<a href=\"https://colab.research.google.com/github/meta-llama/llama-recipes/blob/main/recipes/use_cases/agents/langchain/langgraph-rag-agent-local.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
"cell_type": "code",
|
|
|
"execution_count": null,
|
|
|
- "id": "8520d840-fcf6-4458-b85c-8a2ff80a34eb",
|
|
|
+ "id": "6b9ab14a-fd80-4ca2-afc5-efe1c39532bf",
|
|
|
"metadata": {},
|
|
|
"outputs": [],
|
|
|
"source": [
|
|
|
- "! pip install -U langchain-nomic langchain_community tiktoken langchainhub chromadb langchain langgraph tavily-python gpt4all"
|
|
|
+ "! pip install -U langchain_community tiktoken langchainhub chromadb langchain langgraph tavily-python sentence-transformers"
|
|
|
]
|
|
|
},
|
|
|
{
|
|
@@ -20,15 +28,15 @@
|
|
|
"id": "0216de30-29cf-4464-9cc3-6e9a6d6c3e40",
|
|
|
"metadata": {},
|
|
|
"source": [
|
|
|
- "# Local LangGraph RAG agent with LLaMA3\n",
|
|
|
+ "# Local LangGraph RAG agent with Llama 3\n",
|
|
|
"\n",
|
|
|
- "Previously, we showed how to build simple agents with LangGraph and Llama3.\n",
|
|
|
+ "Previously, we showed how to build simple agents with LangGraph and Llama 3.\n",
|
|
|
"\n",
|
|
|
- "Now, we'll pick a more advanced use-case: advanced RAG, with the requirment that it runs locally (on my laptop!).\n",
|
|
|
+ "Now, we'll pick a more advanced use-case: advanced RAG, with the requirement that it runs locally.\n",
|
|
|
"\n",
|
|
|
"## Ideas\n",
|
|
|
"\n",
|
|
|
- "We'll combine ideas from paper RAG papers into a RAG agent:\n",
|
|
|
+ "We'll combine ideas from three RAG papers into a RAG agent:\n",
|
|
|
"\n",
|
|
|
"- **Routing:** Adaptive RAG ([paper](https://arxiv.org/abs/2403.14403)). Route questions to different retrieval approaches\n",
|
|
|
"- **Fallback:** Corrective RAG ([paper](https://arxiv.org/pdf/2401.15884.pdf)). Fallback to web search if docs are not relevant to query\n",
|
|
@@ -70,7 +78,7 @@
|
|
|
"### Tracing (optional)\n",
|
|
|
"os.environ['LANGCHAIN_TRACING_V2'] = 'true'\n",
|
|
|
"os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'\n",
|
|
|
- "os.environ['LANGCHAIN_API_KEY'] = <your-api-key>\n",
|
|
|
+ "os.environ['LANGCHAIN_API_KEY'] = 'LANGCHAIN_API_KEY'\n",
|
|
|
"```\n",
|
|
|
"\n",
|
|
|
"### Search\n",
|
|
@@ -85,12 +93,18 @@
|
|
|
"metadata": {},
|
|
|
"outputs": [],
|
|
|
"source": [
|
|
|
- "os.environ['TAVILY_API_KEY'] = <your-api-key>"
|
|
|
+ "import os\n",
|
|
|
+ "\n",
|
|
|
+ "os.environ['LANGCHAIN_TRACING_V2'] = 'true'\n",
|
|
|
+ "os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'\n",
|
|
|
+ "os.environ['LANGCHAIN_API_KEY'] = 'LANGCHAIN_API_KEY'\n",
|
|
|
+ "\n",
|
|
|
+ "os.environ['TAVILY_API_KEY'] = 'TAVILY_API_KEY'"
|
|
|
]
|
|
|
},
|
|
|
{
|
|
|
"cell_type": "code",
|
|
|
- "execution_count": 1,
|
|
|
+ "execution_count": null,
|
|
|
"id": "2096d49c-d3dc-4329-ada7-aff56d210198",
|
|
|
"metadata": {},
|
|
|
"outputs": [],
|
|
@@ -102,7 +116,7 @@
|
|
|
},
|
|
|
{
|
|
|
"cell_type": "code",
|
|
|
- "execution_count": 4,
|
|
|
+ "execution_count": null,
|
|
|
"id": "267c63e1-4c2f-439d-8d95-4c6aa01f41cf",
|
|
|
"metadata": {},
|
|
|
"outputs": [],
|
|
@@ -112,7 +126,7 @@
|
|
|
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
|
|
|
"from langchain_community.document_loaders import WebBaseLoader\n",
|
|
|
"from langchain_community.vectorstores import Chroma\n",
|
|
|
- "from langchain_community.embeddings import GPT4AllEmbeddings\n",
|
|
|
+ "from langchain_community.embeddings import HuggingFaceEmbeddings\n",
|
|
|
"\n",
|
|
|
"urls = [\n",
|
|
|
" \"https://lilianweng.github.io/posts/2023-06-23-agent/\",\n",
|
|
@@ -132,25 +146,17 @@
|
|
|
"vectorstore = Chroma.from_documents(\n",
|
|
|
" documents=doc_splits,\n",
|
|
|
" collection_name=\"rag-chroma\",\n",
|
|
|
- " embedding=GPT4AllEmbeddings(),\n",
|
|
|
+ " embedding=HuggingFaceEmbeddings(),\n",
|
|
|
")\n",
|
|
|
"retriever = vectorstore.as_retriever()"
|
|
|
]
|
|
|
},
|
|
|
{
|
|
|
"cell_type": "code",
|
|
|
- "execution_count": 5,
|
|
|
+ "execution_count": null,
|
|
|
"id": "b008df98-8394-49da-8fb8-aefe2c90d03c",
|
|
|
"metadata": {},
|
|
|
- "outputs": [
|
|
|
- {
|
|
|
- "name": "stdout",
|
|
|
- "output_type": "stream",
|
|
|
- "text": [
|
|
|
- "{'score': 'yes'}\n"
|
|
|
- ]
|
|
|
- }
|
|
|
- ],
|
|
|
+ "outputs": [],
|
|
|
"source": [
|
|
|
"### Retrieval Grader \n",
|
|
|
"\n",
|
|
@@ -162,14 +168,18 @@
|
|
|
"llm = ChatOllama(model=local_llm, format=\"json\", temperature=0)\n",
|
|
|
"\n",
|
|
|
"prompt = PromptTemplate(\n",
|
|
|
- " template=\"\"\"<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a grader assessing relevance \n",
|
|
|
+ " template=\"\"\"You are a grader assessing relevance \n",
|
|
|
" of a retrieved document to a user question. If the document contains keywords related to the user question, \n",
|
|
|
- " grade it as relevant. It does not need to be a stringent test. The goal is to filter out erroneous retrievals. \\n\n",
|
|
|
- " Give a binary score 'yes' or 'no' score to indicate whether the document is relevant to the question. \\n\n",
|
|
|
+ " grade it as relevant. It does not need to be a stringent test. The goal is to filter out erroneous retrievals. \n",
|
|
|
+ " \n",
|
|
|
+ " Give a binary score 'yes' or 'no' to indicate whether the document is relevant to the question.\n",
|
|
|
" Provide the binary score as a JSON with a single key 'score' and no premable or explaination.\n",
|
|
|
- " <|eot_id|><|start_header_id|>user<|end_header_id|>\n",
|
|
|
- " Here is the retrieved document: \\n\\n {document} \\n\\n\n",
|
|
|
- " Here is the user question: {question} \\n <|eot_id|><|start_header_id|>assistant<|end_header_id|>\n",
|
|
|
+ " \n",
|
|
|
+ " Here is the retrieved document: \n",
|
|
|
+ " {document}\n",
|
|
|
+ " \n",
|
|
|
+ " Here is the user question: \n",
|
|
|
+ " {question}\n",
|
|
|
" \"\"\",\n",
|
|
|
" input_variables=[\"question\", \"document\"],\n",
|
|
|
")\n",
|
|
@@ -183,18 +193,10 @@
|
|
|
},
|
|
|
{
|
|
|
"cell_type": "code",
|
|
|
- "execution_count": 6,
|
|
|
+ "execution_count": null,
|
|
|
"id": "1d531a81-6d4d-405e-975a-01ef1c9679fa",
|
|
|
"metadata": {},
|
|
|
- "outputs": [
|
|
|
- {
|
|
|
- "name": "stdout",
|
|
|
- "output_type": "stream",
|
|
|
- "text": [
|
|
|
- "The context mentions that the memory component of an LLM-powered autonomous agent system includes a long-term memory module (external database) that records a comprehensive list of agents' experience in natural language, referred to as \"memory stream\". This suggests that the agent has some form of memory or recall mechanism.\n"
|
|
|
- ]
|
|
|
- }
|
|
|
- ],
|
|
|
+ "outputs": [],
|
|
|
"source": [
|
|
|
"### Generate\n",
|
|
|
"\n",
|
|
@@ -204,12 +206,13 @@
|
|
|
"\n",
|
|
|
"# Prompt\n",
|
|
|
"prompt = PromptTemplate(\n",
|
|
|
- " template=\"\"\"<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are an assistant for question-answering tasks. \n",
|
|
|
+ " template=\"\"\"You are an assistant for question-answering tasks. \n",
|
|
|
" Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. \n",
|
|
|
- " Use three sentences maximum and keep the answer concise <|eot_id|><|start_header_id|>user<|end_header_id|>\n",
|
|
|
+ " Use three sentences maximum and keep the answer concise:\n",
|
|
|
" Question: {question} \n",
|
|
|
" Context: {context} \n",
|
|
|
- " Answer: <|eot_id|><|start_header_id|>assistant<|end_header_id|>\"\"\",\n",
|
|
|
+ " Answer: \n",
|
|
|
+ " \"\"\",\n",
|
|
|
" input_variables=[\"question\", \"document\"],\n",
|
|
|
")\n",
|
|
|
"\n",
|
|
@@ -231,21 +234,10 @@
|
|
|
},
|
|
|
{
|
|
|
"cell_type": "code",
|
|
|
- "execution_count": 7,
|
|
|
+ "execution_count": null,
|
|
|
"id": "0261a9a4-de13-4dd8-b082-95305a3e43ca",
|
|
|
"metadata": {},
|
|
|
- "outputs": [
|
|
|
- {
|
|
|
- "data": {
|
|
|
- "text/plain": [
|
|
|
- "{'score': 'yes'}"
|
|
|
- ]
|
|
|
- },
|
|
|
- "execution_count": 7,
|
|
|
- "metadata": {},
|
|
|
- "output_type": "execute_result"
|
|
|
- }
|
|
|
- ],
|
|
|
+ "outputs": [],
|
|
|
"source": [
|
|
|
"### Hallucination Grader \n",
|
|
|
"\n",
|
|
@@ -254,15 +246,17 @@
|
|
|
"\n",
|
|
|
"# Prompt\n",
|
|
|
"prompt = PromptTemplate(\n",
|
|
|
- " template=\"\"\" <|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a grader assessing whether \n",
|
|
|
+ " template=\"\"\"You are a grader assessing whether \n",
|
|
|
" an answer is grounded in / supported by a set of facts. Give a binary score 'yes' or 'no' score to indicate \n",
|
|
|
" whether the answer is grounded in / supported by a set of facts. Provide the binary score as a JSON with a \n",
|
|
|
- " single key 'score' and no preamble or explanation. <|eot_id|><|start_header_id|>user<|end_header_id|>\n",
|
|
|
+ " single key 'score' and no preamble or explanation.\n",
|
|
|
+ " \n",
|
|
|
" Here are the facts:\n",
|
|
|
- " \\n ------- \\n\n",
|
|
|
" {documents} \n",
|
|
|
- " \\n ------- \\n\n",
|
|
|
- " Here is the answer: {generation} <|eot_id|><|start_header_id|>assistant<|end_header_id|>\"\"\",\n",
|
|
|
+ "\n",
|
|
|
+ " Here is the answer: \n",
|
|
|
+ " {generation}\n",
|
|
|
+ " \"\"\",\n",
|
|
|
" input_variables=[\"generation\", \"documents\"],\n",
|
|
|
")\n",
|
|
|
"\n",
|
|
@@ -272,21 +266,10 @@
|
|
|
},
|
|
|
{
|
|
|
"cell_type": "code",
|
|
|
- "execution_count": 8,
|
|
|
+ "execution_count": null,
|
|
|
"id": "df9f6944-4fee-4971-b3a7-2b81b44ed433",
|
|
|
"metadata": {},
|
|
|
- "outputs": [
|
|
|
- {
|
|
|
- "data": {
|
|
|
- "text/plain": [
|
|
|
- "{'score': 'yes'}"
|
|
|
- ]
|
|
|
- },
|
|
|
- "execution_count": 8,
|
|
|
- "metadata": {},
|
|
|
- "output_type": "execute_result"
|
|
|
- }
|
|
|
- ],
|
|
|
+ "outputs": [],
|
|
|
"source": [
|
|
|
"### Answer Grader \n",
|
|
|
"\n",
|
|
@@ -295,14 +278,15 @@
|
|
|
"\n",
|
|
|
"# Prompt\n",
|
|
|
"prompt = PromptTemplate(\n",
|
|
|
- " template=\"\"\"<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a grader assessing whether an \n",
|
|
|
+ " template=\"\"\"You are a grader assessing whether an \n",
|
|
|
" answer is useful to resolve a question. Give a binary score 'yes' or 'no' to indicate whether the answer is \n",
|
|
|
" useful to resolve a question. Provide the binary score as a JSON with a single key 'score' and no preamble or explanation.\n",
|
|
|
- " <|eot_id|><|start_header_id|>user<|end_header_id|> Here is the answer:\n",
|
|
|
- " \\n ------- \\n\n",
|
|
|
+ " \n",
|
|
|
+ " Here is the answer:\n",
|
|
|
" {generation} \n",
|
|
|
- " \\n ------- \\n\n",
|
|
|
- " Here is the question: {question} <|eot_id|><|start_header_id|>assistant<|end_header_id|>\"\"\",\n",
|
|
|
+ "\n",
|
|
|
+ " Here is the question: {question}\n",
|
|
|
+ " \"\"\",\n",
|
|
|
" input_variables=[\"generation\", \"question\"],\n",
|
|
|
")\n",
|
|
|
"\n",
|
|
@@ -312,26 +296,10 @@
|
|
|
},
|
|
|
{
|
|
|
"cell_type": "code",
|
|
|
- "execution_count": 9,
|
|
|
+ "execution_count": null,
|
|
|
"id": "a9c910c1-738c-4bf7-bf9e-801862b227eb",
|
|
|
"metadata": {},
|
|
|
- "outputs": [
|
|
|
- {
|
|
|
- "name": "stderr",
|
|
|
- "output_type": "stream",
|
|
|
- "text": [
|
|
|
- "/Users/rlm/miniforge3/envs/llama-test-env/lib/python3.11/site-packages/langchain_core/_api/deprecation.py:119: LangChainDeprecationWarning: The method `BaseRetriever.get_relevant_documents` was deprecated in langchain-core 0.1.46 and will be removed in 0.3.0. Use invoke instead.\n",
|
|
|
- " warn_deprecated(\n"
|
|
|
- ]
|
|
|
- },
|
|
|
- {
|
|
|
- "name": "stdout",
|
|
|
- "output_type": "stream",
|
|
|
- "text": [
|
|
|
- "{'datasource': 'vectorstore'}\n"
|
|
|
- ]
|
|
|
- }
|
|
|
- ],
|
|
|
+ "outputs": [],
|
|
|
"source": [
|
|
|
"### Router\n",
|
|
|
"\n",
|
|
@@ -343,12 +311,15 @@
|
|
|
"llm = ChatOllama(model=local_llm, format=\"json\", temperature=0)\n",
|
|
|
"\n",
|
|
|
"prompt = PromptTemplate(\n",
|
|
|
- " template=\"\"\"<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are an expert at routing a \n",
|
|
|
+ " template=\"\"\"You are an expert at routing a \n",
|
|
|
" user question to a vectorstore or web search. Use the vectorstore for questions on LLM agents, \n",
|
|
|
" prompt engineering, and adversarial attacks. You do not need to be stringent with the keywords \n",
|
|
|
" in the question related to these topics. Otherwise, use web-search. Give a binary choice 'web_search' \n",
|
|
|
" or 'vectorstore' based on the question. Return the a JSON with a single key 'datasource' and \n",
|
|
|
- " no premable or explaination. Question to route: {question} <|eot_id|><|start_header_id|>assistant<|end_header_id|>\"\"\",\n",
|
|
|
+ " no preamble or explanation. \n",
|
|
|
+ " \n",
|
|
|
+ " Question to route: \n",
|
|
|
+ " {question}\"\"\",\n",
|
|
|
" input_variables=[\"question\"],\n",
|
|
|
")\n",
|
|
|
"\n",
|
|
@@ -361,7 +332,7 @@
|
|
|
},
|
|
|
{
|
|
|
"cell_type": "code",
|
|
|
- "execution_count": 10,
|
|
|
+ "execution_count": null,
|
|
|
"id": "023ff2db-eb4e-4d44-904c-ea061abc16d9",
|
|
|
"metadata": {},
|
|
|
"outputs": [],
|
|
@@ -382,7 +353,7 @@
|
|
|
},
|
|
|
{
|
|
|
"cell_type": "code",
|
|
|
- "execution_count": 11,
|
|
|
+ "execution_count": null,
|
|
|
"id": "07fa3d08-6a86-4705-a28b-e2721070bc5e",
|
|
|
"metadata": {},
|
|
|
"outputs": [],
|
|
@@ -616,7 +587,7 @@
|
|
|
},
|
|
|
{
|
|
|
"cell_type": "code",
|
|
|
- "execution_count": 12,
|
|
|
+ "execution_count": null,
|
|
|
"id": "d9a4b9e4-3ba8-47d6-958c-e5a7112ac6f4",
|
|
|
"metadata": {},
|
|
|
"outputs": [],
|
|
@@ -653,49 +624,10 @@
|
|
|
},
|
|
|
{
|
|
|
"cell_type": "code",
|
|
|
- "execution_count": 13,
|
|
|
+ "execution_count": null,
|
|
|
"id": "13043b0f-17c7-49d3-9ea7-8f2c0f0c8691",
|
|
|
"metadata": {},
|
|
|
- "outputs": [
|
|
|
- {
|
|
|
- "name": "stdout",
|
|
|
- "output_type": "stream",
|
|
|
- "text": [
|
|
|
- "---ROUTE QUESTION---\n",
|
|
|
- "What are the types of agent memory?\n",
|
|
|
- "{'datasource': 'vectorstore'}\n",
|
|
|
- "vectorstore\n",
|
|
|
- "---ROUTE QUESTION TO RAG---\n",
|
|
|
- "---RETRIEVE---\n",
|
|
|
- "'Finished running: retrieve:'\n",
|
|
|
- "---CHECK DOCUMENT RELEVANCE TO QUESTION---\n",
|
|
|
- "---GRADE: DOCUMENT RELEVANT---\n",
|
|
|
- "---GRADE: DOCUMENT RELEVANT---\n",
|
|
|
- "---GRADE: DOCUMENT RELEVANT---\n",
|
|
|
- "---GRADE: DOCUMENT RELEVANT---\n",
|
|
|
- "---ASSESS GRADED DOCUMENTS---\n",
|
|
|
- "---DECISION: GENERATE---\n",
|
|
|
- "'Finished running: grade_documents:'\n",
|
|
|
- "---GENERATE---\n",
|
|
|
- "---CHECK HALLUCINATIONS---\n",
|
|
|
- "---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---\n",
|
|
|
- "---GRADE GENERATION vs QUESTION---\n",
|
|
|
- "---DECISION: GENERATION ADDRESSES QUESTION---\n",
|
|
|
- "'Finished running: generate:'\n",
|
|
|
- "('According to the provided context, there are several types of memory '\n",
|
|
|
- " 'mentioned:\\n'\n",
|
|
|
- " '\\n'\n",
|
|
|
- " '1. Sensory Memory: This is the earliest stage of memory, providing the '\n",
|
|
|
- " 'ability to retain impressions of sensory information (visual, auditory, etc) '\n",
|
|
|
- " 'after the original stimuli have ended.\\n'\n",
|
|
|
- " '2. Maximum Inner Product Search (MIPS): This is a long-term memory module '\n",
|
|
|
- " \"that records a comprehensive list of agents' experience in natural \"\n",
|
|
|
- " 'language.\\n'\n",
|
|
|
- " '\\n'\n",
|
|
|
- " 'These are the types of agent memory mentioned in the context.')\n"
|
|
|
- ]
|
|
|
- }
|
|
|
- ],
|
|
|
+ "outputs": [],
|
|
|
"source": [
|
|
|
"# Compile\n",
|
|
|
"app = workflow.compile()\n",
|
|
@@ -721,32 +653,10 @@
|
|
|
},
|
|
|
{
|
|
|
"cell_type": "code",
|
|
|
- "execution_count": 14,
|
|
|
+ "execution_count": null,
|
|
|
"id": "fbfcec3e-a09a-40b4-9c15-fead97bf4e0a",
|
|
|
"metadata": {},
|
|
|
- "outputs": [
|
|
|
- {
|
|
|
- "name": "stdout",
|
|
|
- "output_type": "stream",
|
|
|
- "text": [
|
|
|
- "---ROUTE QUESTION---\n",
|
|
|
- "Who are the Bears expected to draft first in the NFL draft?\n",
|
|
|
- "{'datasource': 'web_search'}\n",
|
|
|
- "web_search\n",
|
|
|
- "---ROUTE QUESTION TO WEB SEARCH---\n",
|
|
|
- "---WEB SEARCH---\n",
|
|
|
- "'Finished running: websearch:'\n",
|
|
|
- "---GENERATE---\n",
|
|
|
- "---CHECK HALLUCINATIONS---\n",
|
|
|
- "---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---\n",
|
|
|
- "---GRADE GENERATION vs QUESTION---\n",
|
|
|
- "---DECISION: GENERATION ADDRESSES QUESTION---\n",
|
|
|
- "'Finished running: generate:'\n",
|
|
|
- "('The Bears are expected to draft Caleb Williams, a quarterback from USC, as '\n",
|
|
|
- " 'their first pick in the NFL draft.')\n"
|
|
|
- ]
|
|
|
- }
|
|
|
- ],
|
|
|
+ "outputs": [],
|
|
|
"source": [
|
|
|
"# Compile\n",
|
|
|
"app = workflow.compile()\n",
|