Amir Youssefi 2bd662c67d adding vllm eval files and updating requirements.txt		11 months ago
..
data	6815255595 folder struc refactoring	11 months ago
eval	2bd662c67d adding vllm eval files and updating requirements.txt	11 months ago
fine-tuning	2cdfbf0593 READMEs update	11 months ago
quickstart	3c23112ed2 fixing github web rendering of the notebook	11 months ago
README.md	4bb7faa35c READMEs update	11 months ago

Text2SQL: Evaluating and Fine-tuning Llama Models with CoT

This folder contains scripts to:

Evaluate Llama (original and fine-tuned) models on the Text2SQL task using the popular BIRD dataset.
Generate two supervised fine-tuning (SFT) datasets (with and without CoT) and fine-tuning Llama 3.1 8B with the datasets, using different SFT options: with or without CoT, using quantization or not, full fine-tuning (FFT) or parameter-efficient fine-tuning (PEFT). The non-quantized PEFT CoT SFT has the most performance gains: from 39.47% of the original Llama 3.1 8B model to 43.35%. (Note: the results are based on 3 epochs of SFT.)

Our end goal is to maximize the accuracy of Llama models on the Text2SQL task. To do so we need to first evaluate the current state of the art Llama models on the task, then apply fine-tuning, agent and other approaches to evaluate and improve Llama's performance.

Structure:

data: contains scripts to download the BIRD TRAIN and DEV datasets;
eval: contains scripts to evaluate Llama models (original and fine-tuned) on the BIRD dataset;
fine-tune: contains scripts to generate non-CoT and CoT datasets based on the BIRD TRAIN set and to supervised fine-tune Llama models using the datasets, with different SFT options (quantization or not, full fine-tuning or parameter-efficient fine-tuning);
quickstart: contains a notebook to ask Llama 3.3 to convert natural language queries into SQL queries.

Next Steps

Hyper-parameter tuning of the current SFT scripts.
Try GRPO reinforcement learning to further improve the accuracy.
Fine-tune Llama 3.3 70B and Llama 4 models.
Try agentic workflow.
Expand the eval to support other enterprise databases.

README.md

Text2SQL: Evaluating and Fine-tuning Llama Models with CoT

Structure:

Next Steps