{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "1d5d6034",
   "metadata": {},
   "source": [
    "## Part 3: Setting up RAG Example and Validating our Retrieval Pipeline\n",
    "\n",
    "We are ready for the finale, but let's recap:\n",
    "\n",
    "- We started with an example dataset of 5000 images\n",
    "- In the first notebook, we cleaned this up for labelling and used `Llama-3.2-11B` model for labelling\n",
    "- In the second notebook, we cleaned up some hallucinations of the model and pre-processed the descriptions that were synthetically generated\n",
    "\n",
    "\n",
    "Step 3 is to setup a RAG pipeline and profit.\n",
    "\n",
    "We will use [lance-db](https://lancedb.com) since it's open source and Llama and open source go well together š¤\n",
    "\n",
    "We also love free stuff and Llama partner [Together](https://www.together.ai) is hosting 11B model for free. For our final demo, we will use their API and validate the same in this example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "ee4b18be-3bf0-4f7c-8ac5-ef68b7566750",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "#!pip install lancedb rerankers together -q"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38fce923",
   "metadata": {},
   "source": [
    "Since the outputs of LLMs are non-deterministic, we will use the uploaded CSVs from this dataset to get the same experience. \n",
    "\n",
    "In other words, the maintainers of llama-recipes don't want more complaints so we will re-use this avoid new Github issues."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "fd7f755b-c9cc-468d-b4f1-8d2f9e7b3d8d",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--2024-10-05 20:26:21--  https://huggingface.co/datasets/Sanyam/MM-Demo/resolve/main/archive.zip?download=true\n",
      "Resolving fwdproxy (fwdproxy)... 2401:db00:2ff:e002:face:b00c:0:1e10\n",
      "Connecting to fwdproxy (fwdproxy)|2401:db00:2ff:e002:face:b00c:0:1e10|:8080... connected.\n",
      "Proxy request sent, awaiting response... 302 Found\n",
      "Location: https://cdn-lfs-us-1.hf.co/repos/cd/e8/cde814535d30074dcdff9e1ffb015dfa1cc1e60be4b2906951b47df88d9571ed/89615a7f69ba75f09c19dc40e4214110134c6f533f950b2894440ee0741d8c53?response-content-disposition=attachment%3B+filename*%3DUTF-8%27%27archive.zip%3B+filename%3D%22archive.zip%22%3B&response-content-type=application%2Fzip&Expires=1728444381&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcyODQ0NDM4MX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zL2NkL2U4L2NkZTgxNDUzNWQzMDA3NGRjZGZmOWUxZmZiMDE1ZGZhMWNjMWU2MGJlNGIyOTA2OTUxYjQ3ZGY4OGQ5NTcxZWQvODk2MTVhN2Y2OWJhNzVmMDljMTlkYzQwZTQyMTQxMTAxMzRjNmY1MzNmOTUwYjI4OTQ0NDBlZTA3NDFkOGM1Mz9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=PRxfZ0hTsxpVy6-zXOiJGgowwXH9REo2-rZik-Fj7slmyF%7E7AbN-4JPcyKPQG3hp8k5ZjXUVjnIzUGsgPxOiUSZy3J54OlqOnYcDIAxeg6srtAqSjmWehwUFfXzWO%7E5w8165mnogdViS48MudFk8y51g0-MZMmjSlc5wRgDxahqQajR-C6Kz9mMeNdk%7EnTSF-vRQancglubpn1-p%7EzBbCuS10AstMozCw3rsOD4LMct2vytbo8lA1kvqBWLstGlq%7EkIxJdU8tk9pF%7Ef8p9lDXeR2QDU2SJ8HjIonRBslydCIr6sMN-8f2xH1y8tSWi59tUMbucBtiw-6BH6vlRzbyw__&Key-Pair-Id=K24J24Z295AEI9 [following]\n",
      "--2024-10-05 20:26:21--  https://cdn-lfs-us-1.hf.co/repos/cd/e8/cde814535d30074dcdff9e1ffb015dfa1cc1e60be4b2906951b47df88d9571ed/89615a7f69ba75f09c19dc40e4214110134c6f533f950b2894440ee0741d8c53?response-content-disposition=attachment%3B+filename*%3DUTF-8%27%27archive.zip%3B+filename%3D%22archive.zip%22%3B&response-content-type=application%2Fzip&Expires=1728444381&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcyODQ0NDM4MX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zL2NkL2U4L2NkZTgxNDUzNWQzMDA3NGRjZGZmOWUxZmZiMDE1ZGZhMWNjMWU2MGJlNGIyOTA2OTUxYjQ3ZGY4OGQ5NTcxZWQvODk2MTVhN2Y2OWJhNzVmMDljMTlkYzQwZTQyMTQxMTAxMzRjNmY1MzNmOTUwYjI4OTQ0NDBlZTA3NDFkOGM1Mz9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=PRxfZ0hTsxpVy6-zXOiJGgowwXH9REo2-rZik-Fj7slmyF%7E7AbN-4JPcyKPQG3hp8k5ZjXUVjnIzUGsgPxOiUSZy3J54OlqOnYcDIAxeg6srtAqSjmWehwUFfXzWO%7E5w8165mnogdViS48MudFk8y51g0-MZMmjSlc5wRgDxahqQajR-C6Kz9mMeNdk%7EnTSF-vRQancglubpn1-p%7EzBbCuS10AstMozCw3rsOD4LMct2vytbo8lA1kvqBWLstGlq%7EkIxJdU8tk9pF%7Ef8p9lDXeR2QDU2SJ8HjIonRBslydCIr6sMN-8f2xH1y8tSWi59tUMbucBtiw-6BH6vlRzbyw__&Key-Pair-Id=K24J24Z295AEI9\n",
      "Connecting to fwdproxy (fwdproxy)|2401:db00:2ff:e002:face:b00c:0:1e10|:8080... connected.\n",
      "Proxy request sent, awaiting response... 200 OK\n",
      "Length: 164078460 (156M) [application/zip]\n",
      "Saving to: āarchive.zipā\n",
      "\n",
      "archive.zip         100%[===================>] 156.48M   232MB/s    in 0.7s    \n",
      "\n",
      "2024-10-05 20:26:22 (232 MB/s) - āarchive.zipā saved [164078460/164078460]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "#!wget https://huggingface.co/datasets/Sanyam/MM-Demo/resolve/main/archive.zip?download=true -O archive.zip"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "3db34179-91c4-4377-b186-978bbe9d4fee",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Archive:  archive.zip\n",
      "replace __MACOSX/._archive? [y]es, [n]o, [A]ll, [N]one, [r]ename: ^C\n"
     ]
    }
   ],
   "source": [
    "#!unzip archive.zip"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "add7aeb6",
   "metadata": {},
   "source": [
    "### Loading the Dataset and Creating Embeddings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "a13b002d-26bc-4fb0-bd08-193163f2a3d7",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import os\n",
    "from together import Together\n",
    "\n",
    "os.environ[\"TOGETHER_API_KEY\"] = \"\"\n",
    "client = Together(api_key=os.environ.get('TOGETHER_API_KEY'))\n",
    "\n",
    "df = pd.read_csv(\"./final_balanced_sample_dataset.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "e2078d99-959e-4bb6-a702-ff3e31761db2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "
\n",
       "\n",
       "
\n",
       "  \n",
       "    \n",
       "      | \n",
       " | Filename\n",
       " | Title\n",
       " | Size\n",
       " | Gender\n",
       " | Description\n",
       " | Category\n",
       " | Type\n",
       " | 
\n",
       "  \n",
       "  \n",
       "    \n",
       "      | 0\n",
       " | d7ed1d64-2c65-427f-9ae4-eb4aaa3e2389.jpg\n",
       " | Stylish and Trendy Tank Top with Celestial Design\n",
       " | M\n",
       " | F\n",
       " | This white tank top is a stylish and trendy pi...\n",
       " | Tops\n",
       " | Casual\n",
       " | 
\n",
       "    \n",
       "      | 1\n",
       " | 5c1b7a77-1fa3-4af8-9722-cd38e45d89da.jpg\n",
       " | Classic White Sweatshirt\n",
       " | M\n",
       " | F\n",
       " | This classic white sweatshirt is a timeless pi...\n",
       " | Tops\n",
       " | Casual\n",
       " | 
\n",
       "    \n",
       "      | 2\n",
       " | b2e084c7-e3a0-4182-8671-b908544a7cf2.jpg\n",
       " | Grey T-shirt\n",
       " | M\n",
       " | Unisex\n",
       " | This is a short-sleeved, crew neck t-shirt tha...\n",
       " | T-Shirt\n",
       " | Casual\n",
       " | 
\n",
       "    \n",
       "      | 3\n",
       " | 87846aa9-86cc-404a-af2c-7e8fe941081d.jpg\n",
       " | Long-Sleeved V-Neck Shirt\n",
       " | L\n",
       " | U\n",
       " | A long-sleeved, V-neck shirt with a solid purp...\n",
       " | Tops\n",
       " | Casual\n",
       " | 
\n",
       "    \n",
       "      | 4\n",
       " | 04fa06fb-d71a-4293-9804-fe799375a682.jpg\n",
       " | Silver Metallic Buckle Sandals\n",
       " | L\n",
       " | F\n",
       " | These silver metallic buckle sandals feature a...\n",
       " | Shoes\n",
       " | Casual\n",
       " | 
\n",
       "  \n",
       "
\n",
       "
\n",
       "\n",
       "
\n",
       "  \n",
       "    \n",
       "      | \n",
       " | Filename\n",
       " | Title\n",
       " | Size\n",
       " | Gender\n",
       " | Description\n",
       " | Category\n",
       " | Type\n",
       " | vector\n",
       " | 
\n",
       "  \n",
       "  \n",
       "    \n",
       "      | 0\n",
       " | d7ed1d64-2c65-427f-9ae4-eb4aaa3e2389.jpg\n",
       " | Stylish and Trendy Tank Top with Celestial Design\n",
       " | M\n",
       " | F\n",
       " | This white tank top is a stylish and trendy pi...\n",
       " | Tops\n",
       " | Casual\n",
       " | [-0.06355423, 0.02385288, 0.03382446, -0.00212...\n",
       " | 
\n",
       "    \n",
       "      | 1\n",
       " | 5c1b7a77-1fa3-4af8-9722-cd38e45d89da.jpg\n",
       " | Classic White Sweatshirt\n",
       " | M\n",
       " | F\n",
       " | This classic white sweatshirt is a timeless pi...\n",
       " | Tops\n",
       " | Casual\n",
       " | [-0.011691423, 0.049270794, 0.030319242, -0.02...\n",
       " | 
\n",
       "    \n",
       "      | 2\n",
       " | b2e084c7-e3a0-4182-8671-b908544a7cf2.jpg\n",
       " | Grey T-shirt\n",
       " | M\n",
       " | Unisex\n",
       " | This is a short-sleeved, crew neck t-shirt tha...\n",
       " | T-Shirt\n",
       " | Casual\n",
       " | [-0.010150914, 0.09637642, -0.0012558334, 0.04...\n",
       " | 
\n",
       "    \n",
       "      | 3\n",
       " | 87846aa9-86cc-404a-af2c-7e8fe941081d.jpg\n",
       " | Long-Sleeved V-Neck Shirt\n",
       " | L\n",
       " | U\n",
       " | A long-sleeved, V-neck shirt with a solid purp...\n",
       " | Tops\n",
       " | Casual\n",
       " | [-0.058710814, 0.053951632, -0.047531255, 0.03...\n",
       " | 
\n",
       "    \n",
       "      | 4\n",
       " | 04fa06fb-d71a-4293-9804-fe799375a682.jpg\n",
       " | Silver Metallic Buckle Sandals\n",
       " | L\n",
       " | F\n",
       " | These silver metallic buckle sandals feature a...\n",
       " | Shoes\n",
       " | Casual\n",
       " | [0.0123484805, 0.02398385, 0.059779372, -0.006...\n",
       " | 
\n",
       "    \n",
       "      | 5\n",
       " | 8f576f1a-839d-4fb2-a224-a4700b2d05da.jpg\n",
       " | Orange Long Sleeve T-Shirt\n",
       " | S\n",
       " | U\n",
       " | This long sleeve t-shirt is made of a lightwei...\n",
       " | Tops\n",
       " | Casual\n",
       " | [-0.016778396, 0.06664105, -0.010597771, 0.052...\n",
       " | 
\n",
       "    \n",
       "      | 6\n",
       " | e976a8f6-6731-485f-8a9a-2872a5208818.jpg\n",
       " | Green T-Shirt\n",
       " | L\n",
       " | M\n",
       " | The green t-shirt is a relaxed fit with a V-ne...\n",
       " | Tops\n",
       " | Casual\n",
       " | [0.015946185, 0.07543863, 0.032447945, 0.00442...\n",
       " | 
\n",
       "    \n",
       "      | 7\n",
       " | bbf0d9c7-663d-46d1-a9f8-66e8e5678541.jpg\n",
       " | White Quarter Zip Pullover Top\n",
       " | L\n",
       " | U\n",
       " | This pullover top has a quarter zip closure at...\n",
       " | Tops\n",
       " | Casual\n",
       " | [-0.05592024, 0.03401515, -0.0057456787, 0.030...\n",
       " | 
\n",
       "    \n",
       "      | 8\n",
       " | e25a7faa-7a49-4e72-a7ef-e74427f77784.jpg\n",
       " | Aviator Sunglasses T-Shirt\n",
       " | M\n",
       " | F\n",
       " | This sleeveless red T-shirt features a large b...\n",
       " | Tops\n",
       " | Casual\n",
       " | [-0.040058654, 0.090077035, -0.020889325, 0.03...\n",
       " | 
\n",
       "    \n",
       "      | 9\n",
       " | d995ac1f-fbd0-482c-a308-dafb6a93cfd0.jpg\n",
       " | Beige Top\n",
       " | M\n",
       " | F\n",
       " | This beige top has a boat neck, a pocket in fr...\n",
       " | Tops\n",
       " | Casual\n",
       " | [-0.020475557, 0.032990508, 0.0145089375, -0.0...\n",
       " | 
\n",
       "  \n",
       "
\n",
       "
\n",
       "\n",
       "
\n",
       "  \n",
       "    \n",
       "      | \n",
       " | Filename\n",
       " | Title\n",
       " | Size\n",
       " | Gender\n",
       " | Description\n",
       " | Category\n",
       " | Type\n",
       " | vector\n",
       " | _distance\n",
       " | 
\n",
       "  \n",
       "  \n",
       "    \n",
       "      | 0\n",
       " | 7a6e7ddb-ac03-4d78-9d71-75320e5913da.jpg\n",
       " | Light-Washed Jeans\n",
       " | S\n",
       " | F\n",
       " | These light-washed jeans are a pair of stylish...\n",
       " | Jeans\n",
       " | Casual\n",
       " | [-0.063473865, 0.060125574, 0.02189098, -0.002...\n",
       " | 0.407864\n",
       " | 
\n",
       "    \n",
       "      | 1\n",
       " | 7a6e7ddb-ac03-4d78-9d71-75320e5913da.jpg\n",
       " | Light-Washed Jeans\n",
       " | S\n",
       " | F\n",
       " | These light-washed jeans are a pair of stylish...\n",
       " | Jeans\n",
       " | Casual\n",
       " | [-0.063473865, 0.060125574, 0.02189098, -0.002...\n",
       " | 0.407864\n",
       " | 
\n",
       "    \n",
       "      | 2\n",
       " | 8db4d594-4688-4447-a5b9-cd5c68b1870e.jpg\n",
       " | Black Pant\n",
       " | S\n",
       " | U\n",
       " | This pair of black pants appears to be made of...\n",
       " | Pants\n",
       " | Casual\n",
       " | [-0.014388849, 0.082536675, 0.011225891, -0.00...\n",
       " | 0.412849\n",
       " | 
\n",
       "  \n",
       "
\n",
       "
\n",
       "\n",
       "
\n",
       "  \n",
       "    \n",
       "      | \n",
       " | filename\n",
       " | Description\n",
       " | vector\n",
       " | Title\n",
       " | Size\n",
       " | Category\n",
       " | Type\n",
       " | Gender\n",
       " | _score\n",
       " | 
\n",
       "  \n",
       "  \n",
       "    \n",
       "      | 0\n",
       " | 0e20abb0-d56a-4f83-b254-9db70fb92794.jpg\n",
       " | This shirt is a long-sleeved, black and grey s...\n",
       " | [-0.045025624, 0.041662034, -0.0006505165, 0.0...\n",
       " | Black and Grey Striped Shirt with Pink Accents\n",
       " | S\n",
       " | Tops\n",
       " | Casual\n",
       " | F\n",
       " | 52.089787\n",
       " | 
\n",
       "    \n",
       "      | 1\n",
       " | 10e51a41-3729-472e-b288-f3b8f895577a.jpg\n",
       " | This plaid shirt features a classic plaid patt...\n",
       " | [-0.044541422, 0.05006024, -0.04257077, 0.0401...\n",
       " | Plaid shirt with embroidered hem\n",
       " | L\n",
       " | Tops\n",
       " | Casual\n",
       " | F\n",
       " | 35.616638\n",
       " | 
\n",
       "    \n",
       "      | 2\n",
       " | bd289469-607c-45df-bad3-ed1057e277bf.jpg\n",
       " | This is a red and black plaid skirt with a cla...\n",
       " | [-0.021971317, 0.12156065, -0.002515854, 0.034...\n",
       " | Red and Black Plaid Skirt\n",
       " | S\n",
       " | Skirts\n",
       " | Casual\n",
       " | F\n",
       " | 34.096905\n",
       " | 
\n",
       "    \n",
       "      | 3\n",
       " | 64c5d20e-7cec-4965-8c50-315389f44b22.jpg\n",
       " | These light washed blue jeans are distressed w...\n",
       " | [-0.07545003, 0.05375045, 0.00353413, -0.01551...\n",
       " | Light Washed Blue Ripped Jeans\n",
       " | L\n",
       " | Jeans\n",
       " | Casual\n",
       " | M\n",
       " | 29.116760\n",
       " | 
\n",
       "    \n",
       "      | 4\n",
       " | b0c03127-9dfb-4573-8934-1958396937bf.jpg\n",
       " | This red flannel plaid shirt has a classic and...\n",
       " | [-0.03013145, 0.060508166, -0.03642542, 0.0301...\n",
       " | Red Flannel Plaid Shirt\n",
       " | S\n",
       " | Shirts\n",
       " | Casual\n",
       " | M\n",
       " | 28.444168\n",
       " | 
\n",
       "  \n",
       "
\n",
       "
\n",
       "\n",
       "
\n",
       "  \n",
       "    \n",
       "      | \n",
       " | filename\n",
       " | Description\n",
       " | vector\n",
       " | Title\n",
       " | Size\n",
       " | Category\n",
       " | Type\n",
       " | Gender\n",
       " | _relevance_score\n",
       " | 
\n",
       "  \n",
       "  \n",
       "    \n",
       "      | 0\n",
       " | 0e20abb0-d56a-4f83-b254-9db70fb92794.jpg\n",
       " | This shirt is a long-sleeved, black and grey s...\n",
       " | [-0.045025624, 0.041662034, -0.0006505165, 0.0...\n",
       " | Black and Grey Striped Shirt with Pink Accents\n",
       " | S\n",
       " | Tops\n",
       " | Casual\n",
       " | F\n",
       " | 0.744141\n",
       " | 
\n",
       "    \n",
       "      | 1\n",
       " | 10e51a41-3729-472e-b288-f3b8f895577a.jpg\n",
       " | This plaid shirt features a classic plaid patt...\n",
       " | [-0.044541422, 0.05006024, -0.04257077, 0.0401...\n",
       " | Plaid shirt with embroidered hem\n",
       " | L\n",
       " | Tops\n",
       " | Casual\n",
       " | F\n",
       " | 0.735352\n",
       " | 
\n",
       "    \n",
       "      | 2\n",
       " | 300a4125-acec-46aa-b0b9-68b4b2142c70.jpg\n",
       " | These plaid shorts are a stylish and comfortab...\n",
       " | [-0.002765015, 0.030705247, -0.010424565, -0.0...\n",
       " | Plaid Shorts\n",
       " | S\n",
       " | Pants\n",
       " | Casual\n",
       " | M\n",
       " | 0.728516\n",
       " | 
\n",
       "    \n",
       "      | 3\n",
       " | bd289469-607c-45df-bad3-ed1057e277bf.jpg\n",
       " | This is a red and black plaid skirt with a cla...\n",
       " | [-0.021971317, 0.12156065, -0.002515854, 0.034...\n",
       " | Red and Black Plaid Skirt\n",
       " | S\n",
       " | Skirts\n",
       " | Casual\n",
       " | F\n",
       " | 0.699219\n",
       " | 
\n",
       "    \n",
       "      | 4\n",
       " | b0c03127-9dfb-4573-8934-1958396937bf.jpg\n",
       " | This red flannel plaid shirt has a classic and...\n",
       " | [-0.03013145, 0.060508166, -0.03642542, 0.0301...\n",
       " | Red Flannel Plaid Shirt\n",
       " | S\n",
       " | Shirts\n",
       " | Casual\n",
       " | M\n",
       " | 0.679199\n",
       " | 
\n",
       "  \n",
       "
\n",
       "