@@ -1,5 +1,6 @@
# Local Inference
+## Multimodal Inference
For multimodal inference we have added [multi_modal_infer.py](multi_modal_infer.py), which uses the transformers library.
The way to run this is:
@@ -7,7 +8,10 @@ The way to run this would be
```
python multi_modal_infer.py --image_path "./resources/image.jpg" --prompt_text "Describe this image" --temperature 0.5 --top_p 0.8 --model_name "meta-llama/Llama-3.2-11B-Vision-Instruct"
```
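
The script builds on the transformers multimodal APIs. The snippet below is a minimal sketch of that flow, under the assumption that the standard `MllamaForConditionalGeneration`/`AutoProcessor` path is used; it is an illustration, not the exact contents of `multi_modal_infer.py`, and reuses the image, prompt, and sampling settings from the command above.

```python
# Minimal sketch of multimodal generation with the transformers library.
# Assumption: this mirrors the general flow, not multi_modal_infer.py itself.
# Requires a transformers release that ships MllamaForConditionalGeneration
# (the model class used for Llama 3.2 Vision models).
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

# Same image and prompt as the command above.
image = Image.open("./resources/image.jpg")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image"},
        ],
    }
]

# Build the chat prompt and pack image + text into model inputs.
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, add_special_tokens=False, return_tensors="pt").to(model.device)

# Sample with the same temperature/top_p as the command above.
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.5, top_p=0.8)
print(processor.decode(output[0], skip_special_tokens=True))
```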
+## Text-only Inference
For local inference we have provided an [inference script](inference.py). Depending on the type of finetuning performed during training, the script takes different arguments.
+
+
If all model parameters were finetuned during training, the output dir of the training has to be given as the --model_name argument.
In the case of a parameter-efficient method like LoRA, the base model has to be given as the --model_name argument and the output dir of the training as the --peft_model argument.
Additionally, a prompt for the model has to be provided as a text file. The prompt file can either be piped through standard input or given with the --prompt_file parameter.
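
For example, using the flags described above (everything in angle brackets is an illustrative placeholder, not a value from this repo):

```
# After full-parameter finetuning: the training output dir is passed as --model_name
cat <prompt_file> | python inference.py --model_name <training_output_dir>

# After PEFT (e.g. LoRA) finetuning: base model as --model_name, training output dir as --peft_model
python inference.py --model_name <base_model_name> --peft_model <peft_output_dir> --prompt_file <prompt_file>
```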