Jeff Tang caf98ec76d unified script supporting 6 FT configs - quantized or not, peft/fft, cot or not (except q fft)		9 months ago

Text2SQL: Evaluating and Fine-tuning Llama Models

This folder contains scripts to:

Evaluate Llama (original and fine-tuned) models on the Text2SQL task using the popular BIRD dataset in 3 simple steps;
Generate fine-tuning datasets (both with and without CoT reasoning) and fine-tune Llama 3.1 8B on them, gaining a 165% (without reasoning) and a 209% (with reasoning) accuracy improvement over the original model.
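
To make the two bullets above more concrete, here is a minimal sketch of how a Text2SQL prediction can be scored by execution and how a relative accuracy improvement is computed. It is not the code in the eval folder: the helper names (`run_query`, `execution_accuracy`, `relative_improvement`), the toy `singer` table, and the example accuracies are all assumptions made for illustration, and comparing result sets is just one common scoring choice.

```python
import sqlite3


def run_query(conn: sqlite3.Connection, sql: str):
    """Return the query's rows as a set, or None if the SQL fails to run."""
    try:
        return set(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn: sqlite3.Connection, pairs) -> float:
    """Fraction of (predicted_sql, gold_sql) pairs whose result sets match."""
    correct = 0
    for predicted_sql, gold_sql in pairs:
        gold_rows = run_query(conn, gold_sql)
        predicted_rows = run_query(conn, predicted_sql)
        if predicted_rows is not None and predicted_rows == gold_rows:
            correct += 1
    return correct / len(pairs)


def relative_improvement(baseline: float, tuned: float) -> float:
    """Percentage gain of `tuned` over `baseline` (both are accuracies)."""
    return (tuned - baseline) / baseline * 100.0


# Toy in-memory database; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bob', 45), (3, 'Cy', 52);
    """
)

pairs = [
    # Different SQL text, same result set: counted as correct.
    ("SELECT name FROM singer WHERE age >= 45",
     "SELECT name FROM singer WHERE age > 40"),
    # Wrong filter: counted as incorrect.
    ("SELECT name FROM singer WHERE age < 40",
     "SELECT name FROM singer WHERE age > 40"),
    # Invalid SQL: counted as incorrect.
    ("SELEC name FROM singer",
     "SELECT name FROM singer"),
]
accuracy = execution_accuracy(conn, pairs)

# Hypothetical accuracies, chosen only to show the arithmetic.
gain = relative_improvement(baseline=0.10, tuned=0.265)
print(f"execution accuracy: {accuracy:.2f}, relative gain: {gain:.0f}%")
```

Read this way, a 165% improvement means the fine-tuned accuracy is 2.65 times the baseline accuracy, and a 209% improvement means it is 3.09 times the baseline.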

Our end goal is to maximize the accuracy of Llama models on the Text2SQL task. To do so, we first need to evaluate the current state-of-the-art Llama models on the task, then apply fine-tuning, agentic workflows, and other approaches to evaluate and improve Llama's performance.

Structure:

data: contains the scripts to download the BIRD TRAIN and DEV datasets;
eval: contains the scripts to evaluate Llama models (original and fine-tuned) on the BIRD dataset;
fine-tuning: contains the scripts to generate non-CoT and CoT datasets based on the BIRD TRAIN set and to fine-tune Llama models using the datasets;
quickstart: contains a notebook to ask Llama 3.3 to convert natural language queries into SQL queries.

Next Steps

Try GRPO RFT to further improve the accuracy.
Fine-tune Llama 3.3 70B and Llama 4 models.
Use torchtune.
Try an agentic workflow.
Expand the eval to support other enterprise databases.
