@@ -0,0 +1,385 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "30b1235c-2f3e-4628-9c90-30385f741550",
+   "metadata": {},
+   "source": [
+    "## This demo app shows:\n",
+    "* How to use LangChain's YoutubeLoader to retrieve the captions of a YouTube video\n",
+    "* How to ask Llama to summarize the video content (within Llama's input size limit) in a naive way using LangChain's stuff method\n",
+    "* How to work around Llama's max input token limit using LangChain's more sophisticated map_reduce and refine methods - see [here](https://python.langchain.com/docs/use_cases/summarization) for more info"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c866f6be",
+   "metadata": {},
+   "source": [
+    "We start by installing the necessary packages:\n",
+    "- [youtube-transcript-api](https://pypi.org/project/youtube-transcript-api/) API to get the transcript/subtitles of a YouTube video\n",
+    "- [langchain](https://python.langchain.com/docs/get_started/introduction) provides the necessary RAG tools for this demo\n",
+    "- [tiktoken](https://github.com/openai/tiktoken) BytePair Encoding tokenizer\n",
+    "- [pytube](https://pytube.io/en/latest/) utility for downloading YouTube videos\n",
+    "\n",
+    "**Note** This example uses OctoAI to host the Llama model. If you have not set up or used OctoAI before, we suggest you take a look at the [HelloLlamaCloud](HelloLlamaCloud.ipynb) example for information on how to set up OctoAI before continuing with this example.\n",
+    "If you do not want to use OctoAI, you will need to make some changes to this notebook as you go along."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "02482167",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!pip install langchain octoai-sdk youtube-transcript-api tiktoken pytube"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "af3069b1",
+   "metadata": {},
+   "source": [
+    "Let's load the YouTube video transcript using the YoutubeLoader."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3e4b8598",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.document_loaders import YoutubeLoader\n",
+    "\n",
+    "loader = YoutubeLoader.from_youtube_url(\n",
+    "    \"https://www.youtube.com/watch?v=1k37OcjH7BM\", add_video_info=True\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "dca32ebb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# load the YouTube video captions into Documents\n",
+    "docs = loader.load()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "afba128f-b7fd-4b2f-873f-9b5163455d54",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# check the docs length and content\n",
+    "len(docs[0].page_content), docs[0].page_content[:300]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4af7cc16",
+   "metadata": {},
+   "source": [
+    "We are using OctoAI in this example to host our Llama 2 model, so you will need to get an OctoAI token.\n",
+    "\n",
+    "To get the OctoAI token:\n",
+    "\n",
+    "- First sign in to OctoAI with your GitHub account\n",
+    "- Then create a free API token [here](https://octo.ai/docs/getting-started/how-to-create-an-octoai-access-token) that you can use for a while (a month or $10 in OctoAI credits, whichever runs out first)\n",
+    "\n",
+    "**Note** After the free trial ends, you will need to enter billing info to continue to use Llama 2 hosted on OctoAI.\n",
+    "\n",
+    "Alternatively, you can run Llama locally. See:\n",
+    "- [HelloLlamaLocal](HelloLlamaLocal.ipynb) for further information on how to run Llama locally."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ab3ac00e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# enter your OctoAI API token, or you can use a local Llama. See the README for more info\n",
+    "from getpass import getpass\n",
+    "import os\n",
+    "\n",
+    "OCTOAI_API_TOKEN = getpass()\n",
+    "os.environ[\"OCTOAI_API_TOKEN\"] = OCTOAI_API_TOKEN\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6b911efd",
+   "metadata": {},
+   "source": [
+    "Next we call the Llama 2 model from OctoAI. In this example we will use the Llama 2 13B chat FP16 model. You can find more Llama 2 models on the [OctoAI text generation solution page](https://octoai.cloud/tools/text).\n",
+    "\n",
+    "At the time of writing this notebook, the following Llama models are available on OctoAI:\n",
+    "* llama-2-13b-chat-fp16\n",
+    "* llama-2-70b-chat-int4\n",
+    "* llama-2-70b-chat-fp16\n",
+    "* codellama-7b-instruct-fp16\n",
+    "* codellama-13b-instruct-fp16\n",
+    "* codellama-34b-instruct-int4\n",
+    "* codellama-34b-instruct-fp16\n",
+    "* codellama-70b-instruct-fp16\n",
+    "\n",
+    "If you are using a local Llama, just set `llm` accordingly - see the [HelloLlamaLocal notebook](HelloLlamaLocal.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "adf8cf3d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.llms.octoai_endpoint import OctoAIEndpoint\n",
+    "\n",
+    "llama2_13b = \"llama-2-13b-chat-fp16\"\n",
+    "llm = OctoAIEndpoint(\n",
+    "    endpoint_url=\"https://text.octoai.run/v1/chat/completions\",\n",
+    "    model_kwargs={\n",
+    "        \"model\": llama2_13b,\n",
+    "        \"messages\": [\n",
+    "            {\n",
+    "                \"role\": \"system\",\n",
+    "                \"content\": \"You are a helpful, respectful and honest assistant.\"\n",
+    "            }\n",
+    "        ],\n",
+    "        \"max_tokens\": 500,\n",
+    "        \"top_p\": 1,\n",
+    "        \"temperature\": 0.01\n",
+    "    },\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8e3baa56",
+   "metadata": {},
+   "source": [
+    "Once everything is set up, we prompt Llama 2 to summarize the first 4000 characters of the transcript for us."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "51739e11",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.prompts import ChatPromptTemplate\n",
+    "from langchain.chains import LLMChain\n",
+    "prompt = ChatPromptTemplate.from_template(\n",
+    "    \"Give me a summary of the text below: {text}?\"\n",
+    ")\n",
+    "chain = LLMChain(llm=llm, prompt=prompt)\n",
+    "# be careful of the input text length sent to the LLM\n",
+    "text = docs[0].page_content[:4000]\n",
+    "summary = chain.run(text)\n",
+    "# this is the summary of the first 4000 characters of the video content\n",
+    "print(summary)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8b684b29",
+   "metadata": {},
+   "source": [
+    "Next we try to summarize the entire transcript; we should get a `RuntimeError: Your input is too long. Max input length is 4096 tokens, but you supplied 5597 tokens.`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "88a2c17f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# try to get a summary of the whole content\n",
+    "text = docs[0].page_content\n",
+    "summary = chain.run(text)\n",
+    "print(summary)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1ad1881a",
+   "metadata": {},
+   "source": [
+    "\n",
+    "Let's try some workarounds to see if we can summarize the entire transcript without running into the `RuntimeError`.\n",
+    "\n",
+    "We will use LangChain's `load_summarize_chain` and experiment with the `chain_type`.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9bfee2d3-3afe-41d9-8968-6450cc23f493",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.chains.summarize import load_summarize_chain\n",
+    "# see https://python.langchain.com/docs/use_cases/summarization for more info\n",
+    "chain = load_summarize_chain(llm, chain_type=\"stuff\")  # other supported methods are map_reduce and refine\n",
+    "chain.run(docs)\n",
+    "# same RuntimeError: Your input is too long. But stuff works for shorter text with input length <= 4096 tokens"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "682799a8-3846-41b1-a908-02ab5ac3ecee",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "chain = load_summarize_chain(llm, chain_type=\"refine\")\n",
+    "# still get the \"RuntimeError: Your input is too long. Max input length is 4096 tokens\"\n",
+    "chain.run(docs)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "aecf6328",
+   "metadata": {},
+   "source": [
+    "\n",
+    "Since the transcript is bigger than the model can handle, we can split it into chunks instead and use the [`refine`](https://python.langchain.com/docs/modules/chains/document/refine) `chain_type` to iteratively create an answer."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3be1236a-fe6a-4bf6-983f-0e72dde39fee",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
+    "\n",
+    "# we need to split the long input text\n",
+    "text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(\n",
+    "    chunk_size=3000, chunk_overlap=0\n",
+    ")\n",
+    "split_docs = text_splitter.split_documents(docs)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "12ae9e9d-3434-4a84-a298-f2b98de9ff01",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# check the split docs lengths\n",
+    "len(split_docs), len(docs), len(split_docs[0].page_content), len(docs[0].page_content)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "127f17fe-d5b7-43af-bd2f-2b47b076d0b1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# now get the summary of the whole docs - the whole YouTube content\n",
+    "chain = load_summarize_chain(llm, chain_type=\"refine\")\n",
+    "print(str(chain.run(split_docs)))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c3976c92",
+   "metadata": {},
+   "source": [
+    "You can also use the [`map_reduce`](https://python.langchain.com/docs/modules/chains/document/map_reduce) `chain_type` to implement a map-reduce-like architecture while summarizing the documents."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8991df49-8578-46de-8b30-cb2cd11e30f1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# another method is map_reduce\n",
+    "chain = load_summarize_chain(llm, chain_type=\"map_reduce\")\n",
+    "print(str(chain.run(split_docs)))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "77d580de",
+   "metadata": {},
+   "source": [
+    "To investigate further, let's turn on LangChain's debug mode to get an idea of how many calls are made to the model and the details of the inputs and outputs.\n",
+    "We will then run our summary using the `stuff` and `refine` `chain_type`s and take a look at our output."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f2138911-d2b9-41f3-870f-9bc37e2043d9",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# to find out how many calls to Llama have been made and the details of the inputs and outputs of each call, set langchain to debug\n",
+    "import langchain\n",
+    "langchain.debug = True\n",
+    "\n",
+    "# the stuff method will cause the error in the end\n",
+    "chain = load_summarize_chain(llm, chain_type=\"stuff\")\n",
+    "chain.run(split_docs)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "60d1a531-ab48-45cc-a7de-59a14e18240d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# but refine works\n",
+    "chain = load_summarize_chain(llm, chain_type=\"refine\")\n",
+    "chain.run(split_docs)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "61ccd0fb-5cdb-43c4-afaf-05bc9f7cf959",
+   "metadata": {},
+   "source": [
+    "\n",
+    "As you can see, `stuff` fails because it tries to treat all the split documents as one and \"stuffs\" them into a single prompt, which is larger than Llama 2 can handle, while `refine` iteratively runs over the documents, updating its answer as it goes."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}