Meta Llama Guard demo

Meta Llama Guard is a language model that provides input and output guardrails for LLM inference. For more details and model cards, please visit the PurpleLlama repository.

This folder contains an example file to run inference with a locally hosted model, either using the Hugging Face Hub or a local path.

Requirements

  1. Access to the Llama Guard model weights on Hugging Face. To get access, follow the steps described here.
  2. The Llama recipes package and its dependencies installed.

Llama Guard inference script

For testing, you can add User or User/Agent interactions to the prompts list and then run the script to verify the results. When the conversation contains one or more Agent responses, it is considered of type agent, as illustrated in the sketch after the snippet below.

    prompts: List[Tuple[List[str], AgentType]] = [
        (["<Sample user prompt>"], AgentType.USER),

        (["<Sample user prompt>",
        "<Sample agent response>"], AgentType.AGENT),

        (["<Sample user prompt>",
        "<Sample agent response>",
        "<Sample user reply>",
        "<Sample agent response>",], AgentType.AGENT),

    ]
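
Following the rule described above, the AgentType value encodes whether the conversation contains agent responses. The sketch below is illustrative only: AgentType mirrors the enum used by the inference script, and infer_agent_type is a hypothetical helper, not part of the recipe:

    from enum import Enum
    from typing import List

    # Illustrative only: mirrors the AgentType enum used by the inference
    # script; the real definition lives in the llama-recipes source.
    class AgentType(Enum):
        USER = "User"
        AGENT = "Agent"

    def infer_agent_type(conversation: List[str]) -> AgentType:
        # A single user turn is a USER prompt; anything longer contains at
        # least one agent response and is treated as an AGENT conversation.
        return AgentType.USER if len(conversation) == 1 else AgentType.AGENT

    assert infer_agent_type(["<Sample user prompt>"]) is AgentType.USER
    assert infer_agent_type(["<Sample user prompt>", "<Sample agent response>"]) is AgentType.AGENT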

The complete prompt is built with the build_custom_prompt function, defined in prompt_format.py. The file contains the default Meta Llama Guard categories. These categories can be adjusted and new ones can be added, as described in the research paper, in section 4.5, Studying the adaptability of the model.
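
As a rough idea of what the assembled prompt contains before it is sent to the model, the sketch below reproduces the general Llama Guard layout (task instruction, category list, conversation, assessment request). It is illustrative only; the exact template string and the build_custom_prompt signature are the ones in prompt_format.py:

    # Illustrative sketch of what the assembled prompt looks like. The wording
    # below only approximates the Llama Guard layout and omits the surrounding
    # Llama 3 chat special tokens added by the real template.
    categories = [
        ("S1", "Violent Crimes."),
        ("S2", "Non-Violent Crimes."),
        # ... remaining default categories ...
    ]
    conversation = ["<Sample user prompt>", "<Sample agent response>"]

    category_block = "\n".join(f"{code}: {name}" for code, name in categories)
    conversation_block = "\n\n".join(
        f"{'User' if i % 2 == 0 else 'Agent'}: {turn}"
        for i, turn in enumerate(conversation)
    )

    prompt = (
        "Task: Check if there is unsafe content in 'Agent' messages in "
        "conversations according to our safety policy with the below categories.\n\n"
        "<BEGIN UNSAFE CONTENT CATEGORIES>\n"
        f"{category_block}\n"
        "<END UNSAFE CONTENT CATEGORIES>\n\n"
        "<BEGIN CONVERSATION>\n\n"
        f"{conversation_block}\n\n"
        "<END CONVERSATION>\n\n"
        "Provide your safety assessment for ONLY THE LAST Agent message in the "
        "above conversation."
    )
    print(prompt)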

With all the dependencies installed, run the samples with this command:

    python recipes/responsible_ai/llama_guard/inference.py

This is the output:

    ['<Sample user prompt>']
    > safe

    ==================================

    ['<Sample user prompt>', '<Sample agent response>']
    > safe

    ==================================

    ['<Sample user prompt>', '<Sample agent response>', '<Sample user reply>', '<Sample agent response>']
    > safe

    ==================================

To run it with a local model, you can use the model_id param in the inference script:

    python recipes/responsible_ai/llama_guard/inference.py --model_id=/home/ubuntu/models/llama3/Llama-Guard-3-8B/ --llama_guard_version=LLAMA_GUARD_3

Note: Make sure to also set llama_guard_version if needed; by default it uses LLAMA_GUARD_3.

Inference Safety Checker

When running the regular inference script with prompts, Meta Llama Guard is used as a safety checker on both the user prompt and the model output. If both are safe, the result is shown; otherwise, an error message is shown containing the word unsafe and a comma-separated list of the violated categories. Meta Llama Guard is always loaded quantized with bitsandbytes via the Hugging Face Transformers library.
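
For reference, the quantized load looks roughly like the sketch below (a minimal sketch assuming the gated meta-llama/Llama-Guard-3-8B checkpoint on Hugging Face, 8-bit quantization, and an available GPU; the recipe scripts handle this internally):

    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # Minimal sketch of a quantized load; assumes access to the gated
    # meta-llama/Llama-Guard-3-8B checkpoint and a CUDA-capable GPU.
    model_id = "meta-llama/Llama-Guard-3-8B"
    quant_config = BitsAndBytesConfig(load_in_8bit=True)

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",
    )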

In this case, the default categories are applied by the tokenizer, using the apply_chat_template method.
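
Continuing the sketch above, the default-category prompt can be produced and scored with apply_chat_template directly (the sample conversation and the max_new_tokens value are illustrative):

    # Sketch: build the Llama Guard prompt with the tokenizer's built-in chat
    # template (default categories) and classify a user/agent exchange.
    # Reuses `tokenizer` and `model` from the previous snippet.
    chat = [
        {"role": "user", "content": "<Sample user prompt>"},
        {"role": "assistant", "content": "<Sample agent response>"},
    ]
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
    output = model.generate(input_ids=input_ids, max_new_tokens=32, pad_token_id=0)
    # Decode only the newly generated tokens: "safe", or "unsafe" plus the
    # violated category codes.
    print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))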

Use this command for testing with a quantized Llama model, modifying the values accordingly:

    python examples/inference.py --model_name <path_to_regular_llama_model> --prompt_file <path_to_prompt_file> --quantization 8bit --enable_llamaguard_content_safety

Llama Guard 3 Finetuning & Customization

The safety categories in Llama Guard 3 can be tuned for specific application needs. Existing categories can be removed and new categories can be added to the taxonomy. The Llama Guard Customization notebook walks through the process.
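
As a taste of what the notebook covers, categories can be represented as simple name/description pairs that are removed from or added to the taxonomy before the prompt is built. The sketch below is illustrative, not the notebook's exact code; SafetyCategory here is a stand-in for the class defined in prompt_format.py, and the category texts are made up for the example:

    from dataclasses import dataclass
    from typing import List

    # Illustrative stand-in for the category structure used in prompt_format.py.
    @dataclass
    class SafetyCategory:
        name: str
        description: str

    # Start from (a subset of) the default taxonomy...
    categories: List[SafetyCategory] = [
        SafetyCategory("Violent Crimes.", "Content that enables or encourages violent crimes."),
        SafetyCategory("Non-Violent Crimes.", "Content that enables or encourages non-violent crimes."),
    ]

    # ...remove a category that does not apply to the application...
    categories = [c for c in categories if c.name != "Non-Violent Crimes."]

    # ...and add an application-specific one. The customized list is then passed
    # to the prompt builder instead of the defaults.
    categories.append(
        SafetyCategory(
            "Internal Data Leakage.",
            "Content that reveals confidential internal business data.",
        )
    )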