|
@@ -66,7 +66,13 @@
|
|
|
"id": "01fbc052-b633-4d7c-a6b8-e8b70c484697",
|
|
|
"metadata": {},
|
|
|
"source": [
|
|
|
- "#### All the imports"
|
|
|
+ "#### All the imports\n",
|
|
|
+ "\n",
|
|
|
+ "We import all the libraries here. \n",
|
|
|
+ "\n",
|
|
|
+ "- PIL: For handling images to be passed to our Llama model\n",
|
|
|
+ "- Huggingface Tranformers: For running the model\n",
|
|
|
+ "- Concurrent Library: Because 405B suggested its useful for speedups and we want to look smart when doing OS stuff :) "
|
|
|
]
|
|
|
},
|
|
|
{
|
|
@@ -99,7 +105,11 @@
|
|
|
"id": "544c6687-e174-4490-b221-4b3fbed080b3",
|
|
|
"metadata": {},
|
|
|
"source": [
|
|
|
- "#### Clean Corrupt Images"
|
|
|
+ "#### Clean Corrupt Images\n",
|
|
|
+ "\n",
|
|
|
+ "Cleaning corruption is a task for AGI but we can handle the corrupt images in our dataset for now with some concurrency for fast checking. \n",
|
|
|
+ "\n",
|
|
|
+ "This takes a few moments so it might be a good idea to take a small break and socialise for a good change. "
|
|
|
]
|
|
|
},
|
|
|
{
|
|
@@ -180,6 +190,14 @@
|
|
|
]
|
|
|
},
|
|
|
{
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "d339c0d1",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "Let's load in the Meta-Data of the images and remove the rows with the corrupt images"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
"cell_type": "code",
|
|
|
"execution_count": 7,
|
|
|
"id": "05c65335-ad2f-4735-a25b-d75adb195113",
|
|
@@ -295,6 +313,14 @@
|
|
|
]
|
|
|
},
|
|
|
{
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "cc899cf1",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "We can now \"clean\" up the dataframe by subtracting the corrupt images."
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
"cell_type": "code",
|
|
|
"execution_count": 9,
|
|
|
"id": "1f1e37bb-b625-44ac-b1bb-c2361b5edbf9",
|
|
@@ -340,7 +366,11 @@
|
|
|
"jp-MarkdownHeadingCollapsed": true
|
|
|
},
|
|
|
"source": [
|
|
|
- "## EDA"
|
|
|
+ "## EDA\n",
|
|
|
+ "\n",
|
|
|
+ "Now that we got rid of corruption we can proceed to building a great society with checking our dataset :) \n",
|
|
|
+ "\n",
|
|
|
+ "Let's start by double-checking any empty values"
|
|
|
]
|
|
|
},
|
|
|
{
|
|
@@ -499,6 +529,16 @@
|
|
|
]
|
|
|
},
|
|
|
{
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "c65411e6",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "#### Understanding the Label Distribution \n",
|
|
|
+ "\n",
|
|
|
+ "The existing dataset comes with multi-labels, let's take a look at all categories:"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
"cell_type": "code",
|
|
|
"execution_count": 15,
|
|
|
"id": "fea1f2d8-48c4-4b0e-9790-3427c2517e4e",
|
|
@@ -570,6 +610,14 @@
|
|
|
]
|
|
|
},
|
|
|
{
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "1cc50c67",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "If we had more ~~prompts~~ time, this would be a fancier plot but for now let's take a look at the distribution skew to understand what's in our dataset:"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
"cell_type": "code",
|
|
|
"execution_count": 17,
|
|
|
"id": "14a86ee1-d419-495b-86b0-7ef193e81b4a",
|
|
@@ -598,6 +646,17 @@
|
|
|
]
|
|
|
},
|
|
|
{
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "a0861297",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "Let's start with some more cleanup:\n",
|
|
|
+ "\n",
|
|
|
+ "- Remove kids clothing since that is a smaller subset\n",
|
|
|
+ "- Let's use our lack of understanding of fashion to reduce categories and also make our lives with pre-processing easier"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
"cell_type": "code",
|
|
|
"execution_count": 18,
|
|
|
"id": "48a00d85-011d-4632-af7d-d34c8dee6a2c",
|
|
@@ -752,6 +811,14 @@
|
|
|
]
|
|
|
},
|
|
|
{
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "c2793936",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "For once, lack of fashion knowledge is useful-we can reduce our work by creating less categories. Nicely organised just like an coder's wardrobe"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
"cell_type": "code",
|
|
|
"execution_count": 20,
|
|
|
"id": "99115476-9862-4b92-83f4-dd0145e1ee86",
|
|
@@ -825,6 +892,14 @@
|
|
|
]
|
|
|
},
|
|
|
{
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "a3e0061a",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "This is the part that makes Thanos happy, we will balance our universe of clothes by randomly sampling."
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
"cell_type": "code",
|
|
|
"execution_count": 22,
|
|
|
"id": "43b65158-1865-4535-bba0-610b32811c82",
|
|
@@ -934,7 +1009,15 @@
|
|
|
"id": "5798ee82-e237-4dd4-8a07-7777694a8981",
|
|
|
"metadata": {},
|
|
|
"source": [
|
|
|
- "## Synthetic Labelling using Llama 3.2"
|
|
|
+ "## Synthetic Labelling using Llama 3.2\n",
|
|
|
+ "\n",
|
|
|
+ "All the effort so far was to prepare our dataset for labelling. \n",
|
|
|
+ "\n",
|
|
|
+ "At this stage, we are ready to start labelling the images using Llama-3.2 models. We will use 11B here for testing. \n",
|
|
|
+ "\n",
|
|
|
+ "For our rich readers, we suggest testing 90B as an assignment. Although you will find that 11B is a great candidate for this model. \n",
|
|
|
+ "\n",
|
|
|
+ "Read more about the model capabilites [here](https://www.llama.com/docs/how-to-guides/vision-capabilities/)"
|
|
|
]
|
|
|
},
|
|
|
{
|
|
@@ -981,6 +1064,14 @@
|
|
|
]
|
|
|
},
|
|
|
{
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "2d97ec1b",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "Feel free to randomly grab any example from the `ls` command above. This shirt is colorful enough for us to use-so we will go with the current example"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
"cell_type": "code",
|
|
|
"execution_count": 27,
|
|
|
"id": "8112f7bb-377c-4556-90a6-3e576321c152",
|
|
@@ -1028,6 +1119,23 @@
|
|
|
]
|
|
|
},
|
|
|
{
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "f9d3d44f",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "#### Labelling Prompt\n",
|
|
|
+ "\n",
|
|
|
+ "For anyone who feels strongly about Prompt Engineering-this section is for you. The drama in the first prompt stems from constant errors encountered when running the model. \n",
|
|
|
+ "\n",
|
|
|
+ "Suggested approach:\n",
|
|
|
+ "\n",
|
|
|
+ "- Run a simple prompt on an image\n",
|
|
|
+ "- See output and iterate\n",
|
|
|
+ "\n",
|
|
|
+ "After painfully trying this a few times, we learn that for some reason the model doesn't follow JSON formatting unless it's strongly urged. So we fix this with the dramatic prompt:"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
"cell_type": "code",
|
|
|
"execution_count": 30,
|
|
|
"id": "1de59227-6042-441b-a1f8-b19ce83f7c45",
|