
Merge PR #34: Rename WhatsApp_Llama4_bot folder and update documentation

- Renamed WhatsApp_Llama4_bot → whatsapp_llama_4_bot for consistency.  
- Updated README.md in end-to-end-use-cases folder.  
- Preserved file history during rename.
Nilesh 1 month ago
parent
commit
323783f944

+ 49 - 0
end-to-end-use-cases/README.md

@@ -15,6 +15,50 @@
 	<a href="https://github.com/meta-llama/llama-prompt-ops"><img alt="Llama Tools llama-prompt-ops" src="https://img.shields.io/badge/Llama_Tools-llama--prompt--ops-orange?logo=meta" /></a>
 </p>
 
 
+
+
+
+## [Building an Intelligent WhatsApp Bot with Llama 4 APIs](./whatsapp_llama_4_bot/README.md)
+### A Step-by-Step Guide
+
+Create a WhatsApp bot that leverages the power of Llama 4 APIs to provide intelligent and interactive responses. This guide will walk you through the process of building a bot that supports text, image, and audio interactions, making it versatile for various use cases.
+
+- **Text Interaction**: Respond to text messages with accurate and contextually relevant answers.
+- **Image Reasoning**: Analyze images to provide insights, descriptions, or answers related to the content.
+- **Audio-to-Audio Interaction**: Transcribe audio messages to text, process them, and convert back to audio for seamless voice-based interaction.
+
+Get started with building your own WhatsApp bot using Llama 4 APIs today!
+
+
+
+
+## [Research Paper Analyzer with Llama 4 Maverick](./research_paper_analyzer/README.md)
+### Analyze Research Papers with Ease
+
+Leverage Llama 4 Maverick to retrieve references from an arXiv paper and ingest all their content for question answering.
+
+- **Long Context Length**: Process entire papers at once.
+- **Comprehensive Analysis**: Get insights, descriptions, or answers related to the content.
+
+
+Get started with analyzing research papers using Llama 4 Maverick today!
+
+
+
+
+## [Book Character Mind Map with Llama 4 Maverick](./book_character_mindmap/README.md)
+### Explore Book Characters and Storylines
+
+Use Llama 4 Maverick to process entire books at once and visualize character relationships and storylines.
+
+- **Interactive Mind Maps**: Visualize relationships between characters and plot elements.
+- **Book Summaries**: Get concise overviews of plots and themes.
+
+Discover new insights into your favorite books!
+
+
+
+
 ## [Agentic Tutorial](./agents/):
 
 ### 101 and 201 tutorials on performing Tool Calling and building an Agentic Workflow using Llama Models
@@ -50,10 +94,15 @@ Workflow showcasing how to use multiple Llama models to go from any PDF to a Pod
 ### Building a Llama 3 Enabled WhatsApp Chatbot
 This step-by-step tutorial shows how to use the [WhatsApp Business API](https://developers.facebook.com/docs/whatsapp/cloud-api/overview) to build a Llama 3 enabled WhatsApp chatbot.
 
 
+
 ## [Messenger Chatbot](./customerservice_chatbots/messenger_chatbot/messenger_llama3.md):
 
 ### Building a Llama 3 Enabled Messenger Chatbot
 This step-by-step tutorial shows how to use the [Messenger Platform](https://developers.facebook.com/docs/messenger-platform/overview) to build a Llama 3 enabled Messenger chatbot.
 
 
+
 ### RAG Chatbot Example (running [locally](./customerservice_chatbots/RAG_chatbot/RAG_Chatbot_Example.ipynb))
 A complete example of how to build a Llama 3 chatbot hosted on your browser that can answer questions based on your own data using retrieval augmented generation (RAG).
+
+
+

+ 18 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/.env

@@ -0,0 +1,18 @@
+# WhatsApp Business Phone Number ID (NOT the phone number itself)
+PHONE_NUMBER_ID="place your whatsapp phone number id"
+
+# Full URL to send WhatsApp messages (use the correct version and phone number ID)
+WHATSAPP_API_URL="place the Graph API messages URL, i.e. https://graph.facebook.com/v{version}/{phone_number_id}/messages"
+
+# Your custom backend/agent endpoint (e.g., for LLM-based processing)
+AGENT_URL=https://your-agent-url.com/api
+
+LLAMA_API_KEY="place your Llama API key"
+
+TOGETHER_API_KEY="place your Together API key, in case you want to use Together instead of the Llama API"
+
+GROQ_API_KEY="place your Groq API key - this is for STT and TTS"
+
+OPENAI_API_KEY="place your OpenAI API key to run the client"
+
+META_ACCESS_TOKEN="place your WhatsApp access token generated from the app"

+ 117 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/README.md

@@ -0,0 +1,117 @@
+# WhatsApp and Llama 4 APIs: Build your own multi-modal chatbot
+
+Welcome to the WhatsApp Llama 4 Bot! This bot leverages the power of the Llama 4 APIs to provide intelligent and interactive responses to users via WhatsApp. It supports text, image, and audio interactions, making it a versatile tool for various use cases.
+
+
+## Key Features
+- **Text Interaction**: Users can send text messages to the bot, which are processed using the Llama 4 API to generate accurate and contextually relevant responses.
+- **Image Reasoning**: The bot can analyze images sent by users, providing insights, descriptions, or answers related to the image content.
+- **Audio-to-Audio Interaction**: Users can send audio messages, which are transcribed to text, processed by Llama 4, and converted back to audio for a seamless voice-based interaction.
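The audio-to-audio feature chains three stages. As a rough sketch of the flow (the stub stages below are placeholders standing in for the real Groq STT/TTS and Llama 4 calls, not this repo's actual functions):

```python
def transcribe(audio_bytes: bytes) -> str:
    # Real bot: Groq speech-to-text on the downloaded voice note
    return "what can llama 4 do"

def ask_llm(prompt: str) -> str:
    # Real bot: a Llama 4 chat completion
    return f"Answer to: {prompt}"

def synthesize(text: str) -> bytes:
    # Real bot: Groq text-to-speech, written out as reply.mp3
    return text.encode("utf-8")

def audio_to_audio(audio_in: bytes) -> bytes:
    """Voice round trip: transcribe -> reason -> speak."""
    return synthesize(ask_llm(transcribe(audio_in)))
```

Each stage is independent, so swapping the STT/TTS or LLM provider only touches one function.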
+
+
+
+## Technical Overview
+
+### Architecture
+
+- **FastAPI**: The bot is built using FastAPI, a modern web framework for building APIs with Python.
+- **Asynchronous Processing**: Utilizes `httpx` for making asynchronous HTTP requests to external APIs, ensuring efficient handling of media files.
+- **Environment Configuration**: Uses `dotenv` to manage environment variables, keeping sensitive information like API keys secure.
+
+The high-level architecture diagram below shows how these pieces integrate:
+![WhatsApp Llama4 Integration Diagram](../../src/docs/img/WhatApp_Llama4_integration.jpeg)
+
+
+
+
+
+### Important Integrations
+
+- **WhatsApp API**: Facilitates sending and receiving messages, images, and audio files. 
+- **Llama4 Model**: Provides advanced natural language processing capabilities for generating responses.
+- **Groq API**: Handles speech-to-text (STT) and text-to-speech (TTS) conversions, enabling the audio-to-audio feature.
+
+
+
+
+
+## Setting up the WhatsApp Business Cloud API
+
+
+First, open the [WhatsApp Business Platform Cloud API Get Started Guide](https://developers.facebook.com/docs/whatsapp/cloud-api/get-started#set-up-developer-assets) and follow the first four steps to:
+
+1. Add the WhatsApp product to your business app;
+2. Add a recipient number;
+3. Send a test message;
+4. Configure a webhook to receive real-time HTTP notifications.
+
+For the last step, you need to further follow the [Sample Callback URL for Webhooks Testing Guide](https://developers.facebook.com/docs/whatsapp/sample-app-endpoints) to create a free account on glitch.com to get your webhook's callback URL.
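The verification handshake in step 4 can be sketched as a small pure function (a hypothetical helper for illustration, not part of this repo): Meta sends `hub.mode`, `hub.verify_token`, and `hub.challenge` as query parameters, and expects the challenge echoed back when the token matches.

```python
def verify_webhook(params: dict, expected_token: str):
    """Return the challenge to echo back if the request is valid, else None."""
    if (params.get("hub.mode") == "subscribe"
            and params.get("hub.verify_token") == expected_token):
        return params.get("hub.challenge")
    return None
```

`expected_token` is whatever verify token you typed into the webhook configuration form; any other token should be rejected.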
+
+Now open the [Meta for Developers Apps](https://developers.facebook.com/apps/) page, select your WhatsApp business app, and copy the curl command (shown in App Dashboard - WhatsApp - API Setup - Step 2 below); run it in a terminal to send a test message to your WhatsApp.
+
+![](../../../src/docs/img/whatsapp_dashboard.jpg)
+
+Note down the "Temporary access token", "Phone number ID", and the recipient phone number shown on the API Setup page; they will be used later.
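The dashboard's test curl posts a JSON payload to the Cloud API messages endpoint. A sketch of that request's shape in Python (the IDs below are illustrative placeholders; substitute the values from your API Setup page, and note your dashboard may show a different Graph API version):

```python
import json

PHONE_NUMBER_ID = "123456789"   # placeholder: your "Phone number ID"
RECIPIENT = "15551234567"       # placeholder: the recipient number you added

# Endpoint the dashboard curl targets
url = f"https://graph.facebook.com/v20.0/{PHONE_NUMBER_ID}/messages"

# Minimal text-message payload in the shape the Cloud API expects
payload = {
    "messaging_product": "whatsapp",
    "to": RECIPIENT,
    "type": "text",
    "text": {"body": "hello from the bot"},
}
print(url)
print(json.dumps(payload))
```

This is the same payload shape the bot's `send_message` helper builds later in this repo.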
+
+
+
+
+
+## Setup and Installation
+
+
+
+### Step 1: Clone the Repository
+
+```bash
+git clone https://github.com/meta-llama/internal-llama-cookbook.git
+cd internal-llama-cookbook/end-to-end-use-cases/whatsapp_llama_4_bot
+```
+
+### Step 2: Install Dependencies
+
+Ensure you have Python installed, then run the following command to install the required packages:
+
+```bash
+pip install -r requirements.txt
+```
+
+
+
+### Step 3: Configure Environment Variables
+
+Create a `.env` file in the project directory and add your API keys and other configuration details as follows:
+
+```plaintext
+META_ACCESS_TOKEN=your_whatsapp_access_token
+WHATSAPP_API_URL=your_whatsapp_api_url
+LLAMA_API_KEY=your_llama_api_key
+TOGETHER_API_KEY=your_together_api_key
+GROQ_API_KEY=your_groq_api_key
+PHONE_NUMBER_ID=your_phone_number_id
+```
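A missing key usually surfaces later as a confusing API error, so it can help to fail fast at startup. A minimal sketch (a hypothetical `check_env` helper, not part of this repo; the variable names match the `.env` file in this folder):

```python
import os

REQUIRED_VARS = ["META_ACCESS_TOKEN", "WHATSAPP_API_URL", "PHONE_NUMBER_ID", "GROQ_API_KEY"]

def check_env(required=REQUIRED_VARS):
    """Raise with a clear message if any required variable is unset or empty."""
    missing = [name for name in required if not os.getenv(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Call `check_env()` once after `load_dotenv()` so configuration mistakes are reported before the first webhook arrives.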
+
+
+
+### Step 4: Run the Application
+
+On your EC2 instance, run the following command in a terminal to start the FastAPI server:
+
+```bash
+uvicorn ec2_endpoints:app --host 0.0.0.0 --port 5000
+```
+
+Note: If you use Amazon EC2 as your web server, make sure you have port 5000 added to your EC2 instance's security group's inbound rules.
+
+
+
+
+## License
+
+This project is licensed under the MIT License.
+
+
+## Contributing
+
+We welcome contributions to enhance the capabilities of this bot. Please feel free to submit issues or pull requests.
+
+

+ 49 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/ec2_endpoints.py

@@ -0,0 +1,49 @@
+from fastapi import FastAPI
+from fastapi.responses import FileResponse
+from pydantic import BaseModel
+from typing import Optional
+from enum import Enum
+
+from ec2_services import text_to_speech, get_llm_response, handle_image_message, handle_audio_message
+
+app = FastAPI()
+
+class TextToSpeechRequest(BaseModel):
+    text: str
+    output_path: Optional[str] = "reply.mp3"
+
+class TextToSpeechResponse(BaseModel):
+    file_path: Optional[str]
+    error: Optional[str] = None
+
+class KindEnum(str, Enum):
+    audio = "audio"
+    image = "image"
+
+class LLMRequest(BaseModel):
+    user_input: str
+    media_id: Optional[str] = None
+    kind: Optional[KindEnum] = None
+
+
+class LLMResponse(BaseModel):
+    response: Optional[str]
+    error: Optional[str] = None
+
+@app.post("/llm-response", response_model=LLMResponse)
+async def api_llm_response(req: LLMRequest):
+    text_message = req.user_input
+    image_base64 = None
+    if req.kind == KindEnum.image:
+        image_base64 = await handle_image_message(req.media_id)
+        result = get_llm_response(text_message, image_input=image_base64)
+        # print(result)
+    elif req.kind == KindEnum.audio:
+        text_message = await handle_audio_message(req.media_id)
+        result = get_llm_response(text_message)
+        audio_path = text_to_speech(text=result, output_path="reply.mp3")
+        return FileResponse(audio_path, media_type="audio/mpeg", filename="reply.mp3")
+    else:
+        result = get_llm_response(text_message)
+    
+    if result is None:
+        return LLMResponse(response=None, error="LLM response generation failed.")
+    return LLMResponse(response=result)

+ 243 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/ec2_services.py

@@ -0,0 +1,243 @@
+from together import Together
+from openai import OpenAI 
+import os
+import base64
+import asyncio
+import requests
+import httpx
+from PIL import Image
+from dotenv import load_dotenv
+from io import BytesIO
+from pathlib import Path
+from groq import Groq
+load_dotenv()
+
+TOGETHER_API_KEY = os.getenv("TOGETHER_API_KEY")
+LLAMA_API_KEY = os.getenv("LLAMA_API_KEY")
+#LLAMA_API_URL = os.getenv("API_URL")
+GROQ_API_KEY = os.getenv("GROQ_API_KEY")
+META_ACCESS_TOKEN = os.getenv("META_ACCESS_TOKEN")
+PHONE_NUMBER_ID = os.getenv("PHONE_NUMBER_ID")
+WHATSAPP_API_URL = os.getenv("WHATSAPP_API_URL")
+
+def text_to_speech(text: str, output_path: str = "reply.mp3") -> str:
+    """
+    Synthesizes a given text into an audio file using Groq's TTS service.
+
+    Args:
+        text (str): The text to be synthesized.
+        output_path (str): The path where the output audio file will be saved. Defaults to "reply.mp3".
+
+    Returns:
+        str: The path to the output audio file, or None if the synthesis failed.
+    """
+    try:
+        client = Groq(api_key=GROQ_API_KEY)
+        response = client.audio.speech.create(
+            model="playai-tts",
+            voice="Aaliyah-PlayAI",
+            response_format="mp3",
+            input=text
+        )
+        
+        # Convert string path to Path object and stream the response to a file
+        path_obj = Path(output_path)
+        response.write_to_file(path_obj)
+        return str(path_obj)
+    except Exception as e:
+        print(f"TTS failed: {e}")
+        return None
+
+
+def speech_to_text(input_path: str) -> str:
+    """
+    Transcribe an audio file using Groq.
+
+    Args:
+        input_path (str): Path to the audio file to be transcribed.
+
+    Returns:
+        str: The transcribed text.
+    """
+    client = Groq(api_key=GROQ_API_KEY)
+    with open(input_path, "rb") as file:
+        transcription = client.audio.transcriptions.create(
+            model="distil-whisper-large-v3-en",
+            response_format="verbose_json",
+            file=(input_path, file.read())
+        )
+    return transcription.text
+
+
+
+
+def get_llm_response(text_input: str, image_input : str = None) -> str:
+    """
+    Get the response from the LLM given a text input and an optional image input.
+
+    Args:
+        text_input (str): The text to be sent to the LLM.
+        image_input (str, optional): The base64 encoded image to be sent to the LLM. Defaults to None.
+
+    Returns:
+        str: The response from the LLM.
+    """
+    messages = []
+    # print(bool(image_input))
+    if image_input:
+        messages.append({
+            "type": "image_url",
+            "image_url": {"url": f"data:image/jpeg;base64,{image_input}"}
+        })
+    messages.append({
+        "type": "text",
+        "text": text_input
+    })
+    try:
+        # client = Together(api_key=TOGETHER_API_KEY)  # alternative: Together API
+        client = OpenAI(base_url="https://api.llama.com/compat/v1/", api_key=LLAMA_API_KEY)
+        completion = client.chat.completions.create(
+            model="Llama-4-Maverick-17B-128E-Instruct-FP8",
+            messages=[
+                {
+                    "role": "user",
+                    "content": messages
+                }
+            ]
+        )
+        
+        if completion.choices and len(completion.choices) > 0:
+            return completion.choices[0].message.content
+        else:
+            print("Empty response from LLM API")
+            return None
+    except Exception as e:
+        print(f"LLM error: {e}")
+        return None
+
+
+
+
+
+
+
+async def fetch_media(media_id: str) -> str:
+    """
+    Fetches the URL of a media given its ID.
+
+    Args:
+        media_id (str): The ID of the media to fetch.
+
+    Returns:
+        str: The URL of the media.
+    """
+    url = "https://graph.facebook.com/v22.0/{media_id}"
+    async with httpx.AsyncClient() as client:
+        try:
+            response = await client.get(
+                url.format(media_id=media_id),
+                headers={"Authorization": f"Bearer {META_ACCESS_TOKEN}"}
+            )
+            if response.status_code == 200:
+                return response.json().get("url")
+            else:
+                print(f"Failed to fetch media: {response.text}")
+        except Exception as e:
+            print(f"Exception during media fetch: {e}")
+    return None
+
+async def handle_image_message(media_id: str) -> str:
+    """
+    Handle an image message by fetching the image media, converting it to base64,
+    and returning the base64 string.
+
+    Args:
+        media_id (str): The ID of the image media to fetch.
+
+    Returns:
+        str: The base64 string of the image.
+    """
+    media_url = await fetch_media(media_id)
+    # print(media_url)
+    async with httpx.AsyncClient() as client:
+        headers = {"Authorization": f"Bearer {META_ACCESS_TOKEN}"}
+        response = await client.get(media_url, headers=headers)
+        response.raise_for_status()
+
+        # Convert image to base64
+        image = Image.open(BytesIO(response.content))
+        buffered = BytesIO()
+        image.save(buffered, format="JPEG")  # Save as JPEG
+        # image.save("./test.jpeg", format="JPEG")  # Optional save
+        base64_image = base64.b64encode(buffered.getvalue()).decode("utf-8")
+        
+        return base64_image
+
+async def handle_audio_message(media_id: str):
+    """
+    Handle an audio message by fetching the audio media, writing it to a temporary file,
+    and then using Groq to transcribe the audio to text.
+
+    Args:
+        media_id (str): The ID of the audio media to fetch.
+
+    Returns:
+        str: The transcribed text.
+    """
+    media_url = await fetch_media(media_id)
+    # print(media_url)
+    async with httpx.AsyncClient() as client:
+        headers = {"Authorization": f"Bearer {META_ACCESS_TOKEN}"}
+        response = await client.get(media_url, headers=headers)
+
+        response.raise_for_status()
+        audio_bytes = response.content
+        temp_audio_path = "temp_audio.m4a"
+        with open(temp_audio_path, "wb") as f:
+            f.write(audio_bytes)
+        return speech_to_text(temp_audio_path)
+
+async def send_audio_message(to: str, file_path: str):
+    """
+    Send an audio message to a WhatsApp user.
+
+    Args:
+        to (str): The phone number of the recipient.
+        file_path (str): The path to the audio file to be sent.
+
+    Returns:
+        None
+
+    Raises:
+        None
+    """
+    url = f"https://graph.facebook.com/v20.0/{PHONE_NUMBER_ID}/media"
+    with open(file_path, "rb") as f:
+        files = {"file": ("reply.mp3", f, "audio/mpeg")}
+        params = {
+            "messaging_product": "whatsapp",
+            "type": "audio",
+            "access_token": META_ACCESS_TOKEN
+        }
+        response = requests.post(url, params=params, files=files)
+
+    if response.status_code == 200:
+        media_id = response.json().get("id")
+        payload = {
+            "messaging_product": "whatsapp",
+            "to": to,
+            "type": "audio",
+            "audio": {"id": media_id}
+        }
+        headers = {
+            "Authorization": f"Bearer {META_ACCESS_TOKEN}",
+            "Content-Type": "application/json"
+        }
+        requests.post(WHATSAPP_API_URL, headers=headers, json=payload)
+    else:
+        print("Audio upload failed:", response.text)

+ 48 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/requirements.txt

@@ -0,0 +1,48 @@
+aiohappyeyeballs==2.6.1
+aiohttp==3.11.16
+aiosignal==1.3.2
+annotated-types==0.7.0
+anyio==4.9.0
+async-timeout==5.0.1
+attrs==25.3.0
+certifi==2025.1.31
+charset-normalizer==3.4.1
+click==8.1.8
+colorama==0.4.6
+distro==1.9.0
+eval_type_backport==0.2.2
+exceptiongroup==1.2.2
+fastapi==0.115.12
+filelock==3.18.0
+frozenlist==1.5.0
+groq==0.22.0
+h11==0.14.0
+httpcore==1.0.8
+httpx==0.28.1
+idna==3.10
+markdown-it-py==3.0.0
+mdurl==0.1.2
+multidict==6.4.3
+numpy==2.2.4
+openai
+pillow==11.2.1
+propcache==0.3.1
+pyarrow==19.0.1
+pydantic==2.11.3
+pydantic_core==2.33.1
+Pygments==2.19.1
+python-dotenv==1.1.0
+requests==2.32.3
+rich==13.9.4
+shellingham==1.5.4
+sniffio==1.3.1
+starlette==0.46.2
+tabulate==0.9.0
+together==1.5.5
+tqdm==4.67.1
+typer==0.15.2
+typing-inspection==0.4.0
+typing_extensions==4.13.2
+urllib3==2.4.0
+uvicorn==0.34.1
+yarl==1.19.0

+ 70 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/webhook_main.py

@@ -0,0 +1,70 @@
+from fastapi import FastAPI, Request, BackgroundTasks
+from fastapi.responses import JSONResponse
+from pydantic import BaseModel
+from webhook_utils import send_audio_message, llm_reply_to_text_v2, audio_conversion
+import os
+from dotenv import load_dotenv
+
+load_dotenv()
+app = FastAPI()
+
+ACCESS_TOKEN = os.getenv("ACCESS_TOKEN")
+AGENT_URL = os.getenv("AGENT_URL")
+GROQ_API_KEY = os.getenv("GROQ_API_KEY")
+class WhatsAppMessage(BaseModel):
+    object: str
+    entry: list
+
+
+@app.get("/webhook")
+async def verify_webhook(request: Request):
+    """Echo hub.challenge back so Meta can verify the webhook subscription."""
+    mode = request.query_params.get("hub.mode")
+    token = request.query_params.get("hub.verify_token")
+    challenge = request.query_params.get("hub.challenge")
+    # VERIFY_TOKEN must match the verify token you entered in the app dashboard
+    if mode == "subscribe" and token == os.getenv("VERIFY_TOKEN"):
+        return int(challenge)
+    return JSONResponse(content={"error": "Invalid verification token"}, status_code=403)
+
+
+
+
+
+@app.post("/webhook")
+async def webhook_handler(request: Request, background_tasks: BackgroundTasks):
+    data = await request.json()
+    message_data = WhatsAppMessage(**data)
+    
+    change = message_data.entry[0]["changes"][0]["value"]
+    print(change)
+    if 'messages' in change:
+        message = change["messages"][-1]
+        user_phone = message["from"]
+        print(message)
+        if "text" in message:
+            user_message = message["text"]["body"].lower()
+            print(user_message)
+            background_tasks.add_task(llm_reply_to_text_v2, user_message, user_phone,None,None)
+        elif "image" in message:
+            media_id = message["image"]["id"]
+            print(media_id)
+            caption = message["image"].get("caption", "")
+            # background_tasks.add_task(handle_image_message, media_id, user_phone, caption)
+            background_tasks.add_task(llm_reply_to_text_v2,caption,user_phone,media_id,'image')
+        elif message.get("audio"):
+            media_id = message["audio"]["id"]
+            print(media_id)
+            path = await audio_conversion("",media_id,'audio')
+            # Send final audio reply
+            print(user_phone)
+            await send_audio_message(user_phone, path)
+        return JSONResponse(content={"status": "ok"})
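The payload traversal the handler above performs can be isolated into a small pure function, which is easier to unit-test than the route itself (a hypothetical `extract_message` helper, not part of this repo): a webhook delivery nests the user message under `entry[0].changes[0].value.messages`.

```python
def extract_message(payload: dict):
    """Return (sender, kind, content) for a delivered message, else None."""
    value = payload["entry"][0]["changes"][0]["value"]
    if "messages" not in value:
        return None  # e.g. a status update, not a user message
    message = value["messages"][-1]
    sender = message["from"]
    for kind in ("text", "image", "audio"):
        if kind in message:
            # text carries a body; image/audio carry a media id
            content = message[kind].get("body") or message[kind].get("id")
            return sender, kind, content
    return None

sample = {
    "object": "whatsapp_business_account",
    "entry": [{"changes": [{"value": {"messages": [
        {"from": "15551234567", "text": {"body": "Hi"}}
    ]}}]}],
}
print(extract_message(sample))
```

Factoring the traversal out this way lets the route body stay a thin dispatcher over `(sender, kind, content)`.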

+ 116 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/webhook_utils.py

@@ -0,0 +1,116 @@
+import os
+import base64
+import asyncio
+import requests
+import httpx
+from PIL import Image
+from dotenv import load_dotenv
+from io import BytesIO
+
+load_dotenv()
+
+META_ACCESS_TOKEN = os.getenv("META_ACCESS_TOKEN")
+WHATSAPP_API_URL = os.getenv("WHATSAPP_API_URL")
+TOGETHER_API_KEY = os.getenv("TOGETHER_API_KEY")
+MEDIA_URL = "https://graph.facebook.com/v20.0/{media_id}"
+BASE_URL = os.getenv("BASE_URL")
+PHONE_NUMBER_ID = os.getenv("PHONE_NUMBER_ID")
+GROQ_API_KEY = os.getenv("GROQ_API_KEY")
+
+def send_message(to: str, text: str):
+    if not text:
+        print("Error: Message text is empty.")
+        return
+
+    payload = {
+        "messaging_product": "whatsapp",
+        "to": to,
+        "type": "text",
+        "text": {"body": text}
+    }
+
+    headers = {
+        "Authorization": f"Bearer {META_ACCESS_TOKEN}",
+        "Content-Type": "application/json"
+    }
+
+    response = requests.post(WHATSAPP_API_URL, headers=headers, json=payload)
+    if response.status_code == 200:
+        print("Message sent")
+    else:
+        print(f"Send failed: {response.text}")
+
+
+
+async def send_message_async(user_phone: str, message: str):
+    loop = asyncio.get_running_loop()
+    await loop.run_in_executor(None, send_message, user_phone, message)
+
+
+
+        
+async def send_audio_message(to: str, file_path: str):
+    url = f"https://graph.facebook.com/v20.0/{PHONE_NUMBER_ID}/media"
+    with open(file_path, "rb") as f:
+        files = {"file": ("reply.mp3", f, "audio/mpeg")}
+        params = {
+            "messaging_product": "whatsapp",
+            "type": "audio",
+            "access_token": META_ACCESS_TOKEN
+        }
+        response = requests.post(url, params=params, files=files)
+
+    if response.status_code == 200:
+        media_id = response.json().get("id")
+        payload = {
+            "messaging_product": "whatsapp",
+            "to": to,
+            "type": "audio",
+            "audio": {"id": media_id}
+        }
+        headers = {
+            "Authorization": f"Bearer {META_ACCESS_TOKEN}",
+            "Content-Type": "application/json"
+        }
+        requests.post(WHATSAPP_API_URL, headers=headers, json=payload)
+    else:
+        print("Audio upload failed:", response.text)
+
+
+
+
+
+
+AGENT_URL = os.getenv("AGENT_URL")  # backend endpoint serving /llm-response (see .env)
+
+async def llm_reply_to_text_v2(user_input: str, user_phone: str, media_id: str = None, kind: str = None):
+    try:
+        headers = {
+            "accept": "application/json",
+            "Content-Type": "application/json",
+        }
+        json_data = {
+            "user_input": user_input,
+            "media_id": media_id,
+            "kind": kind,
+        }
+        async with httpx.AsyncClient() as client:
+            response = await client.post(AGENT_URL, json=json_data, headers=headers, timeout=60)
+            response_data = response.json()
+            if response.status_code == 200 and response_data["error"] is None:
+                message_content = response_data["response"]
+                if message_content:
+                    await send_message_async(user_phone, message_content)
+                else:
+                    print("Error: Empty message content from LLM API")
+                    await send_message_async(user_phone, "Received empty response from LLM API.")
+            else:
+                print("Error: Invalid LLM API response", response_data)
+                await send_message_async(user_phone, "Failed to process the message due to an internal server error.")
+
+    except Exception as e:
+        print("LLM error:", e)
+        await send_message_async(user_phone, "Sorry, something went wrong while generating a response.")

BIN
src/docs/img/WhatApp_Llama4_integration.jpeg