Llama-3.2-11B model:

Credit and thanks to the models and libraries used in the showcase:
First, thanks to the dataset's author for providing the dataset on which we base this exercise.
Here's the detailed outline:
The dataset consists of 5000 images with some classification.
The first half is preparing the dataset for labeling:
The second half consists of labeling the dataset. We are bound by an interesting constraint here, because the 11B model can only caption one image at a time:
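Because each model call handles a single image, the labeling step is essentially a loop with one call per image. Below is a minimal sketch of such a loop, not the repository's actual script: `caption_dataset`, the `caption_fn` callback (which would wrap the real Llama-3.2-11B call), and the CSV column names are all assumptions made for illustration. It writes each result to disk immediately and skips images that were already captioned, so an interrupted run over 5000 images can be resumed.

```python
import csv
from pathlib import Path
from typing import Callable, Iterable


def caption_dataset(
    image_paths: Iterable[str],
    caption_fn: Callable[[str], str],  # hypothetical wrapper around the model call
    out_csv: str,
) -> int:
    """Caption images one at a time, appending each result to out_csv.

    Images already listed in out_csv are skipped, so a restarted run does
    not repeat work. Returns the number of newly written rows.
    """
    out_path = Path(out_csv)
    done = set()
    if out_path.exists():
        with out_path.open(newline="", encoding="utf-8") as f:
            done = {row["image"] for row in csv.DictReader(f)}

    # Write the header only when starting a fresh (missing or empty) file.
    write_header = not out_path.exists() or out_path.stat().st_size == 0
    written = 0
    with out_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["image", "description"])
        if write_header:
            writer.writeheader()
        for path in image_paths:
            if path in done:
                continue
            # One model call per image: the model sees a single image each time.
            writer.writerow({"image": path, "description": caption_fn(path)})
            f.flush()  # persist progress after every image
            written += 1
    return written
```

In practice `caption_fn` would load the image, build the prompt, and return the model's text; here it is left abstract so the loop can be tried with any stand-in function.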
After running the script on the entire dataset, we have more data cleaning to perform:
Even with our lengthy prompt, the model still hallucinates categories and labels, so we need to address this.
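One way to handle hallucinated categories is to map every model output onto a fixed allow-list and flag anything that does not fit. The sketch below illustrates that idea under stated assumptions: the category names in `ALLOWED_CATEGORIES` are hypothetical placeholders (not the dataset's real labels), and fuzzy matching via the standard library's `difflib` is one possible choice rather than the method used in the original notebooks.

```python
import difflib
from typing import Optional, Sequence

# Hypothetical allow-list; replace with the categories your prompt permits.
ALLOWED_CATEGORIES = ["Tops", "Bottoms", "Dresses", "Outerwear", "Shoes"]


def normalize_category(
    raw: str,
    allowed: Sequence[str] = ALLOWED_CATEGORIES,
    cutoff: float = 0.8,
) -> Optional[str]:
    """Map a model-produced category onto the allow-list.

    Returns the canonical category, or None when nothing is close enough,
    so that hallucinated values can be reviewed or dropped.
    """
    # Remove surrounding whitespace, quotes, and trailing punctuation.
    cleaned = raw.strip().strip(".:\"'").lower()
    lookup = {c.lower(): c for c in allowed}
    if cleaned in lookup:
        return lookup[cleaned]
    # Tolerate small spelling deviations, but reject unrelated strings.
    close = difflib.get_close_matches(cleaned, list(lookup), n=1, cutoff=cutoff)
    return lookup[close[0]] if close else None
```

Rows for which the function returns `None` can be collected and inspected by hand, which keeps the cleaning step auditable.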
Now, we are ready to try our vector db pipeline:
With the cleaned descriptions and dataset, we can now store these in a vector DB.
You will note that we are not using the categorization from our model. This is by design, to show how RAG can simplify a lot of things.
We try the approach with different retrieval methods.
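The store-then-retrieve idea can be sketched without committing to any particular vector database. The example below is a toy, in-memory stand-in: `TinyVectorStore` is a hypothetical class, and `embed` uses simple word counts in place of a real embedding model. It only shows the shape of the pipeline, where descriptions (without the model's categories) are stored as vectors and queried by cosine similarity.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store: keeps (id, description, vector) rows."""

    def __init__(self):
        self.rows = []

    def add(self, item_id: str, description: str) -> None:
        self.rows.append((item_id, description, embed(description)))

    def search(self, query: str, k: int = 3):
        """Return the k stored items most similar to the query."""
        q = embed(query)
        scored = [(cosine(q, vec), item_id, desc) for item_id, desc, vec in self.rows]
        scored.sort(key=lambda row: row[0], reverse=True)
        return [(item_id, desc) for _, item_id, desc in scored[:k]]


store = TinyVectorStore()
store.add("img_1", "blue denim jacket with silver buttons")
store.add("img_2", "red floral summer dress")
store.add("img_3", "black leather ankle boots")
print(store.search("denim jacket", k=1))
```

Swapping `embed` for a real embedding model and `TinyVectorStore` for an actual vector database changes the components, but the add and search steps keep the same shape, which also makes it easy to compare retrieval methods side by side.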
Finally, we can bring this all together in a Gradio App.
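As a rough illustration of the final step, the sketch below wires a retrieval function into a minimal Gradio text interface. It is an assumption-laden placeholder, not the repository's app: `CATALOG` holds made-up descriptions, `recommend` ranks by plain word overlap instead of querying the vector DB, and `build_app` is a hypothetical helper. Gradio is imported inside `build_app` so the retrieval logic can be exercised without it installed.

```python
# Hypothetical cleaned descriptions; the real pipeline would read these
# from the labeled dataset or query the vector DB instead.
CATALOG = {
    "img_1.jpg": "blue denim jacket with silver buttons",
    "img_2.jpg": "red floral summer dress",
    "img_3.jpg": "black leather ankle boots",
}


def recommend(query: str) -> str:
    """Return catalog entries ranked by word overlap with the query.

    Placeholder retrieval: swap the scoring for a vector search in practice.
    """
    words = set(query.lower().split())
    scored = [
        (len(words & set(desc.split())), name, desc)
        for name, desc in CATALOG.items()
    ]
    hits = [
        f"{name}: {desc}"
        for score, name, desc in sorted(scored, reverse=True)
        if score > 0
    ]
    return "\n".join(hits) or "No matching items found."


def build_app():
    """Build a simple text-in, text-out Gradio interface around recommend()."""
    import gradio as gr  # lazy import: only needed when the UI is built

    return gr.Interface(
        fn=recommend,
        inputs="text",
        outputs="text",
        title="Clothing retrieval demo",
    )
```

Calling `build_app().launch()` would start a local web server with a single text box; the full app would presumably also accept an image and show retrieved images rather than text.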
Task: We can further improve the description prompt. You will notice that the description sometimes starts with the title of the clothing item, which causes retrieval of "similar" clothes instead of "complementary" items.
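Improving the prompt is the intended fix. As a complementary stopgap, the leading title can also be stripped from descriptions after generation, before they are embedded. The helper below is a hypothetical post-processing sketch of that alternative; the name `strip_leading_title` and the assumption that each item's title is available alongside its description are not taken from the original code.

```python
import re


def strip_leading_title(description: str, title: str) -> str:
    """Remove the item title (and trailing punctuation) from the start of a description.

    Descriptions that do not begin with the title are returned unchanged.
    """
    pattern = r"^\s*" + re.escape(title.strip()) + r"\s*[:.\-,]*\s*"
    stripped = re.sub(pattern, "", description, count=1, flags=re.IGNORECASE)
    # Never return an empty string: fall back to the original text.
    return stripped if stripped else description


print(strip_leading_title(
    "Blue Denim Jacket: a casual jacket that pairs well with chinos.",
    "Blue Denim Jacket",
))
```

This only treats the symptom, so it is best used to check whether removing the title actually shifts retrieval toward complementary items before investing in prompt changes.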