|
@@ -0,0 +1,243 @@
|
|
|
+from together import Together
|
|
|
+from openai import OpenAI
|
|
|
+import os
|
|
|
+import base64
|
|
|
+import asyncio
|
|
|
+import requests
|
|
|
+import httpx
|
|
|
+from PIL import Image
|
|
|
+from dotenv import load_dotenv
|
|
|
+from io import BytesIO
|
|
|
+from pathlib import Path
|
|
|
+from groq import Groq
|
|
|
# Load environment variables from a local .env file into os.environ.
load_dotenv()

# Credentials and endpoints, all sourced from the environment.
TOGETHER_API_KEY = os.getenv("TOGETHER_API_KEY")    # Together AI key (legacy/fallback provider)
LLAMA_API_KEY = os.getenv("LLAMA_API_KEY")          # Meta Llama API key
#LLAMA_API_URL = os.getenv("API_URL")
GROQ_API_KEY = os.getenv("GROQ_API_KEY")            # Groq key (used for TTS and transcription)
META_ACCESS_TOKEN = os.getenv("META_ACCESS_TOKEN")  # Meta Graph API bearer token
PHONE_NUMBER_ID = os.getenv("PHONE_NUMBER_ID")      # WhatsApp business phone-number ID
WHATSAPP_API_URL = os.getenv("WHATSAPP_API_URL")    # WhatsApp messages endpoint URL
|
|
|
+
|
|
|
def text_to_speech(text: str, output_path: str = "reply.mp3") -> str:
    """
    Synthesize text into an MP3 audio file using Groq's TTS service.

    Args:
        text (str): The text to be synthesized.
        output_path (str): Destination path for the audio file. Defaults to "reply.mp3".

    Returns:
        str: The path to the written audio file, or None if synthesis failed.
    """
    try:
        tts_client = Groq(api_key=GROQ_API_KEY)
        speech = tts_client.audio.speech.create(
            model="playai-tts",
            voice="Aaliyah-PlayAI",
            response_format="mp3",
            input=text,
        )
        # Stream the audio response straight to disk via a Path object.
        destination = Path(output_path)
        speech.write_to_file(destination)
        return str(destination)
    except Exception as e:
        print(f"TTS failed: {e}")
        return None
|
|
|
+
|
|
|
+
|
|
|
def speech_to_text(input_path: str) -> str:
    """
    Transcribe an audio file using Groq's Whisper endpoint.

    Args:
        input_path (str): Path to the audio file to be transcribed.

    Returns:
        str: The transcribed text.

    Raises:
        Propagates any Groq client / file errors to the caller.
    """
    client = Groq(api_key=GROQ_API_KEY)
    with open(input_path, "rb") as file:
        transcription = client.audio.transcriptions.create(
            model="distil-whisper-large-v3-en",
            response_format="verbose_json",
            # The API expects a (filename, bytes) tuple for the upload.
            file=(input_path, file.read()),
        )
    return transcription.text
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
def get_llm_response(text_input: str, image_input: str = None) -> str:
    """
    Get a response from the Llama LLM given a text input and an optional image.

    Args:
        text_input (str): The text prompt to send to the LLM.
        image_input (str, optional): Base64-encoded JPEG to include. Defaults to None.

    Returns:
        str: The model's reply, or None on an error or empty response.
    """
    # Build the multimodal message content: image part (if any) first, then text.
    content = []
    if image_input:
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{image_input}"},
        })
    content.append({
        "type": "text",
        "text": text_input,
    })
    try:
        #client = Together(api_key=TOGETHER_API_KEY)
        # Llama's OpenAI-compatible endpoint. Pass the Llama key explicitly;
        # without it the OpenAI client falls back to the OPENAI_API_KEY env var.
        client = OpenAI(base_url="https://api.llama.com/compat/v1/", api_key=LLAMA_API_KEY)
        completion = client.chat.completions.create(
            model="Llama-4-Maverick-17B-128E-Instruct-FP8",
            messages=[
                {
                    "role": "user",
                    "content": content,
                }
            ],
        )

        if completion.choices and len(completion.choices) > 0:
            return completion.choices[0].message.content
        else:
            # Message corrected: the request goes to the Llama API, not Together.
            print("Empty response from Llama API")
            return None
    except Exception as e:
        print(f"LLM error: {e}")
        return None
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
async def fetch_media(media_id: str) -> str:
    """
    Resolve a WhatsApp media ID to its download URL via the Graph API.

    Args:
        media_id (str): The ID of the media to fetch.

    Returns:
        str: The URL of the media, or None on failure.
    """
    endpoint = "https://graph.facebook.com/v22.0/{media_id}".format(media_id=media_id)
    auth_header = {"Authorization": f"Bearer {META_ACCESS_TOKEN}"}
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(endpoint, headers=auth_header)
            if response.status_code == 200:
                return response.json().get("url")
            print(f"Failed to fetch media: {response.text}")
        except Exception as e:
            print(f"Exception during media fetch: {e}")
    return None
|
|
|
+
|
|
|
async def handle_image_message(media_id: str) -> str:
    """
    Handle an image message: download the media and return it base64-encoded.

    Args:
        media_id (str): The ID of the image media to fetch.

    Returns:
        str: Base64 string of the JPEG image, or None if the media URL
            lookup failed.
    """
    media_url = await fetch_media(media_id)
    if media_url is None:
        # fetch_media already logged the failure; avoid a GET on a None URL.
        return None
    async with httpx.AsyncClient() as client:
        headers = {"Authorization": f"Bearer {META_ACCESS_TOKEN}"}
        response = await client.get(media_url, headers=headers)
        response.raise_for_status()

        # Re-encode via Pillow so the payload sent downstream is valid JPEG.
        image = Image.open(BytesIO(response.content))
        buffered = BytesIO()
        image.save(buffered, format="JPEG")  # Save as JPEG
        base64_image = base64.b64encode(buffered.getvalue()).decode("utf-8")

        return base64_image
|
|
|
+
|
|
|
async def handle_audio_message(media_id: str):
    """
    Handle an audio message: download the media, buffer it to a temporary
    file, and transcribe it with Groq.

    Args:
        media_id (str): The ID of the audio media to fetch.

    Returns:
        str: The transcribed text, or None if the media URL lookup failed.
    """
    media_url = await fetch_media(media_id)
    if media_url is None:
        # fetch_media already logged the failure; avoid a GET on a None URL.
        return None
    async with httpx.AsyncClient() as client:
        headers = {"Authorization": f"Bearer {META_ACCESS_TOKEN}"}
        response = await client.get(media_url, headers=headers)
        response.raise_for_status()

        audio_bytes = response.content
        # Groq's transcription call reads from a file path, so spill to disk.
        temp_audio_path = "temp_audio.m4a"
        with open(temp_audio_path, "wb") as f:
            f.write(audio_bytes)
        return speech_to_text(temp_audio_path)
|
|
|
+
|
|
|
async def send_audio_message(to: str, file_path: str):
    """
    Send an audio message to a WhatsApp user.

    Uploads the file to the Graph API media endpoint, then sends a message
    referencing the returned media ID.

    Args:
        to (str): The phone number of the recipient.
        file_path (str): The path to the audio file to be sent.

    Returns:
        None

    Raises:
        None
    """
    url = f"https://graph.facebook.com/v20.0/{PHONE_NUMBER_ID}/media"
    # Step 1: upload the media file to obtain a media ID.
    with open(file_path, "rb") as f:
        # Reuse the already-open handle; previously the file was opened a
        # second time inline and never closed (file-handle leak).
        files = {"file": ("reply.mp3", f, "audio/mpeg")}
        params = {
            "messaging_product": "whatsapp",
            "type": "audio",
            "access_token": META_ACCESS_TOKEN,
        }
        response = requests.post(url, params=params, files=files)

    if response.status_code == 200:
        media_id = response.json().get("id")
        # Step 2: send a message that references the uploaded media ID.
        payload = {
            "messaging_product": "whatsapp",
            "to": to,
            "type": "audio",
            "audio": {"id": media_id},
        }
        headers = {
            "Authorization": f"Bearer {META_ACCESS_TOKEN}",
            "Content-Type": "application/json",
        }
        send_response = requests.post(WHATSAPP_API_URL, headers=headers, json=payload)
        # Surface send failures instead of silently dropping them.
        if send_response.status_code != 200:
            print("Audio send failed:", send_response.text)
    else:
        print("Audio upload failed:", response.text)
|