
Merge PR #34: Rename WhatsApp_Llama4_bot folder and update documentation

- Renamed WhatsApp_Llama4_bot → whatsapp_llama_4_bot for consistency.  
- Updated README.md in end-to-end-use-cases folder.  
- Preserved file history during rename.
Nilesh 1 month ago
parent
commit
323783f944

+ 49 - 0
end-to-end-use-cases/README.md

@@ -15,6 +15,50 @@
 	<a href="https://github.com/meta-llama/llama-prompt-ops"><img alt="Llama Tools llama-prompt-ops" src="https://img.shields.io/badge/Llama_Tools-llama--prompt--ops-orange?logo=meta" /></a>
 </p>
 
 
+
+
+
+## [Building an Intelligent WhatsApp Bot with Llama 4 APIs](./whatsapp_llama_4_bot/README.md)
+### A Step-by-Step Guide
+
+Create a WhatsApp bot that leverages the power of Llama 4 APIs to provide intelligent and interactive responses. This guide will walk you through the process of building a bot that supports text, image, and audio interactions, making it versatile for various use cases.
+
+- **Text Interaction**: Respond to text messages with accurate and contextually relevant answers.
+- **Image Reasoning**: Analyze images to provide insights, descriptions, or answers related to the content.
+- **Audio-to-Audio Interaction**: Transcribe audio messages to text, process them, and convert back to audio for seamless voice-based interaction.
+
+Get started with building your own WhatsApp bot using Llama 4 APIs today!
+
+
+
+
+## [Research Paper Analyzer with Llama 4 Maverick](./research_paper_analyzer/README.md)
+### Analyze Research Papers with Ease
+
+Leverage Llama 4 Maverick to retrieve references from an arXiv paper and ingest all their content for question answering.
+
+- **Long Context Length**: Process entire papers at once.
+- **Comprehensive Analysis**: Get insights, descriptions, or answers related to the content.
+
+
+Get started with analyzing research papers using Llama 4 Maverick today!
+
+
+
+
+## [Book Character Mind Map with Llama 4 Maverick](./book_character_mindmap/README.md)
+### Explore Book Characters and Storylines
+
+Use Llama 4 Maverick to process entire books at once and visualize character relationships and storylines.
+
+- **Interactive Mind Maps**: Visualize relationships between characters and plot elements.
+- **Book Summaries**: Get concise overviews of plots and themes.
+
+Discover new insights into your favorite books!
+
+
+
+
 ## [Agentic Tutorial](./agents/):
 
 ### 101 and 201 tutorials on performing Tool Calling and building an Agentic Workflow using Llama Models
@@ -50,10 +94,15 @@ Workflow showcasing how to use multiple Llama models to go from any PDF to a Pod
 ### Building a Llama 3 Enabled WhatsApp Chatbot
 This step-by-step tutorial shows how to use the [WhatsApp Business API](https://developers.facebook.com/docs/whatsapp/cloud-api/overview) to build a Llama 3 enabled WhatsApp chatbot.
 
 
+
 ## [Messenger Chatbot](./customerservice_chatbots/messenger_chatbot/messenger_llama3.md):
 
 ### Building a Llama 3 Enabled Messenger Chatbot
 This step-by-step tutorial shows how to use the [Messenger Platform](https://developers.facebook.com/docs/messenger-platform/overview) to build a Llama 3 enabled Messenger chatbot.
 
 
+
 ### RAG Chatbot Example (running [locally](./customerservice_chatbots/RAG_chatbot/RAG_Chatbot_Example.ipynb))
 A complete example of how to build a Llama 3 chatbot hosted on your browser that can answer questions based on your own data using retrieval augmented generation (RAG).
+
+
+

+ 18 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/.env

@@ -0,0 +1,18 @@
+# WhatsApp Business Phone Number ID (NOT the phone number itself)
+PHONE_NUMBER_ID="place your whatsapp phone number id"
+
+# Full URL to send WhatsApp messages (use the correct version and phone number ID)
+WHATSAPP_API_URL="place the Graph API messages URL, i.e. https://graph.facebook.com/v{version}/{phone_number_id}/messages"
+
+# Your custom backend/agent endpoint (e.g., for LLM-based processing)
+AGENT_URL=https://your-agent-url.com/api
+
+LLAMA_API_KEY="place your Llama API key"
+
+TOGETHER_API_KEY="place your Together API key, in case you want to use Together instead of the Llama API"
+
+GROQ_API_KEY="place your Groq API key - this is for STT and TTS"
+
+OPENAI_API_KEY="place your OpenAI API key to run the client"
+
+META_ACCESS_TOKEN="place your WhatsApp access token generated from the app"

+ 117 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/README.md

@@ -0,0 +1,117 @@
+# WhatsApp and Llama 4 APIs: Build your own multi-modal chatbot
+
+Welcome to the WhatsApp Llama 4 Bot! This bot leverages the power of the Llama 4 APIs to provide intelligent and interactive responses to users via WhatsApp. It supports text, image, and audio interactions, making it a versatile tool for various use cases.
+
+
+## Key Features
+- **Text Interaction**: Users can send text messages to the bot, which are processed using the Llama 4 API to generate accurate and contextually relevant responses.
+- **Image Reasoning**: The bot can analyze images sent by users, providing insights, descriptions, or answers related to the image content.
+- **Audio-to-Audio Interaction**: Users can send audio messages, which are transcribed to text, processed by Llama 4, and converted back to audio for a seamless voice-based interaction.
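The audio-to-audio feature chains three stages. As a rough sketch of the flow (the stub stages below are placeholders standing in for the real Groq STT/TTS and Llama 4 calls, not this repo's actual functions):

```python
def transcribe(audio_bytes: bytes) -> str:
    # Real bot: Groq speech-to-text on the downloaded voice note
    return "what can llama 4 do"

def ask_llm(prompt: str) -> str:
    # Real bot: a Llama 4 chat completion
    return f"Answer to: {prompt}"

def synthesize(text: str) -> bytes:
    # Real bot: Groq text-to-speech, written out as reply.mp3
    return text.encode("utf-8")

def audio_to_audio(audio_in: bytes) -> bytes:
    """Voice round trip: transcribe -> reason -> speak."""
    return synthesize(ask_llm(transcribe(audio_in)))
```

Each stage is independent, so swapping the STT/TTS or LLM provider only touches one function.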
+
+
+
+## Technical Overview
+
+### Architecture
+
+- **FastAPI**: The bot is built using FastAPI, a modern web framework for building APIs with Python.
+- **Asynchronous Processing**: Utilizes `httpx` for making asynchronous HTTP requests to external APIs, ensuring efficient handling of media files.
+- **Environment Configuration**: Uses `dotenv` to manage environment variables, keeping sensitive information like API keys secure.
+
+The high-level architecture diagram below shows how these pieces integrate:
+![WhatsApp Llama4 Integration Diagram](../../src/docs/img/WhatApp_Llama4_integration.jpeg)
+
+
+
+
+
+### Important Integrations
+
+- **WhatsApp API**: Facilitates sending and receiving messages, images, and audio files. 
+- **Llama4 Model**: Provides advanced natural language processing capabilities for generating responses.
+- **Groq API**: Handles speech-to-text (STT) and text-to-speech (TTS) conversions, enabling the audio-to-audio feature.
+
+
+
+
+
+## Setting up the WhatsApp Business Cloud API
+
+
+First, open the [WhatsApp Business Platform Cloud API Get Started Guide](https://developers.facebook.com/docs/whatsapp/cloud-api/get-started#set-up-developer-assets) and follow the first four steps to:
+
+1. Add the WhatsApp product to your business app;
+2. Add a recipient number;
+3. Send a test message;
+4. Configure a webhook to receive real-time HTTP notifications.
+
+For the last step, you need to further follow the [Sample Callback URL for Webhooks Testing Guide](https://developers.facebook.com/docs/whatsapp/sample-app-endpoints) to create a free account on glitch.com to get your webhook's callback URL.
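The verification handshake in step 4 can be sketched as a small pure function (a hypothetical helper for illustration, not part of this repo): Meta sends `hub.mode`, `hub.verify_token`, and `hub.challenge` as query parameters, and expects the challenge echoed back when the token matches.

```python
def verify_webhook(params: dict, expected_token: str):
    """Return the challenge to echo back if the request is valid, else None."""
    if (params.get("hub.mode") == "subscribe"
            and params.get("hub.verify_token") == expected_token):
        return params.get("hub.challenge")
    return None
```

`expected_token` is whatever verify token you typed into the webhook configuration form; any other token should be rejected.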
+
+Now open the [Meta for Developers Apps](https://developers.facebook.com/apps/) page, select your WhatsApp business app, and copy the curl command (shown in App Dashboard - WhatsApp - API Setup - Step 2 below); run it in a terminal to send a test message to your WhatsApp.
+
+![](../../../src/docs/img/whatsapp_dashboard.jpg)
+
+Note down the "Temporary access token", "Phone number ID", and the recipient phone number shown on the API Setup page; they will be used later.
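The dashboard's test curl posts a JSON payload to the Cloud API messages endpoint. A sketch of that request's shape in Python (the IDs below are illustrative placeholders; substitute the values from your API Setup page, and note your dashboard may show a different Graph API version):

```python
import json

PHONE_NUMBER_ID = "123456789"   # placeholder: your "Phone number ID"
RECIPIENT = "15551234567"       # placeholder: the recipient number you added

# Endpoint the dashboard curl targets
url = f"https://graph.facebook.com/v20.0/{PHONE_NUMBER_ID}/messages"

# Minimal text-message payload in the shape the Cloud API expects
payload = {
    "messaging_product": "whatsapp",
    "to": RECIPIENT,
    "type": "text",
    "text": {"body": "hello from the bot"},
}
print(url)
print(json.dumps(payload))
```

This is the same payload shape the bot's `send_message` helper builds later in this repo.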
+
+
+
+
+
+## Setup and Installation
+
+
+
+### Step 1: Clone the Repository
+
+```bash
+git clone https://github.com/meta-llama/internal-llama-cookbook.git
+cd internal-llama-cookbook/end-to-end-use-cases/whatsapp_llama_4_bot
+```
+
+### Step 2: Install Dependencies
+
+Ensure you have Python installed, then run the following command to install the required packages:
+
+```bash
+pip install -r requirements.txt
+```
+
+
+
+### Step 3: Configure Environment Variables
+
+Create a `.env` file in the project directory and add your API keys and other configuration details as follows:
+
+```plaintext
+META_ACCESS_TOKEN=your_whatsapp_access_token
+WHATSAPP_API_URL=your_whatsapp_api_url
+LLAMA_API_KEY=your_llama_api_key
+TOGETHER_API_KEY=your_together_api_key
+GROQ_API_KEY=your_groq_api_key
+PHONE_NUMBER_ID=your_phone_number_id
+```
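A missing key usually surfaces later as a confusing API error, so it can help to fail fast at startup. A minimal sketch (a hypothetical `check_env` helper, not part of this repo; the variable names match the `.env` file in this folder):

```python
import os

REQUIRED_VARS = ["META_ACCESS_TOKEN", "WHATSAPP_API_URL", "PHONE_NUMBER_ID", "GROQ_API_KEY"]

def check_env(required=REQUIRED_VARS):
    """Raise with a clear message if any required variable is unset or empty."""
    missing = [name for name in required if not os.getenv(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Call `check_env()` once after `load_dotenv()` so configuration mistakes are reported before the first webhook arrives.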
+
+
+
+### Step 4: Run the Application
+
+On your EC2 instance, run the following command in a terminal to start the FastAPI server:
+
+```bash
+uvicorn ec2_endpoints:app --host 0.0.0.0 --port 5000
+```
+
+Note: If you use Amazon EC2 as your web server, make sure you have port 5000 added to your EC2 instance's security group's inbound rules.
+
+
+
+
+## License
+
+This project is licensed under the MIT License.
+
+
+## Contributing
+
+We welcome contributions to enhance the capabilities of this bot. Please feel free to submit issues or pull requests.
+
+

+ 49 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/ec2_endpoints.py

@@ -0,0 +1,49 @@
+from fastapi import FastAPI
+from fastapi.responses import FileResponse
+from pydantic import BaseModel
+from typing import Optional
+from enum import Enum
+
+from ec2_services import text_to_speech, get_llm_response, handle_image_message, handle_audio_message
+
+app = FastAPI()
+
+class TextToSpeechRequest(BaseModel):
+    text: str
+    output_path: Optional[str] = "reply.mp3"
+
+class TextToSpeechResponse(BaseModel):
+    file_path: Optional[str]
+    error: Optional[str] = None
+
+class KindEnum(str, Enum):
+    audio = "audio"
+    image = "image"
+
+class LLMRequest(BaseModel):
+    user_input: str
+    media_id: Optional[str] = None
+    kind: Optional[KindEnum] = None
+
+
+class LLMResponse(BaseModel):
+    response: Optional[str]
+    error: Optional[str] = None
+
+@app.post("/llm-response", response_model=LLMResponse)
+async def api_llm_response(req: LLMRequest):
+    text_message = req.user_input
+    image_base64 = None
+    if req.kind == KindEnum.image:
+        image_base64 = await handle_image_message(req.media_id)
+        result = get_llm_response(text_message, image_input=image_base64)
+        # print(result)
+    elif req.kind == KindEnum.audio:
+        text_message = await handle_audio_message(req.media_id)
+        result = get_llm_response(text_message)
+        audio_path = text_to_speech(text=result, output_path="reply.mp3")
+        return FileResponse(audio_path, media_type="audio/mpeg", filename="reply.mp3")
+    else:
+        result = get_llm_response(text_message)
+    
+    if result is None:
+        return LLMResponse(response=None, error="LLM response generation failed.")
+    return LLMResponse(response=result)

+ 243 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/ec2_services.py

@@ -0,0 +1,243 @@
+from together import Together
+from openai import OpenAI 
+import os
+import base64
+import asyncio
+import requests
+import httpx
+from PIL import Image
+from dotenv import load_dotenv
+from io import BytesIO
+from pathlib import Path
+from groq import Groq
+load_dotenv()
+
+TOGETHER_API_KEY = os.getenv("TOGETHER_API_KEY")
+LLAMA_API_KEY = os.getenv("LLAMA_API_KEY")
+#LLAMA_API_URL = os.getenv("API_URL")
+GROQ_API_KEY = os.getenv("GROQ_API_KEY")
+META_ACCESS_TOKEN = os.getenv("META_ACCESS_TOKEN")
+PHONE_NUMBER_ID = os.getenv("PHONE_NUMBER_ID")
+WHATSAPP_API_URL = os.getenv("WHATSAPP_API_URL")
+
+def text_to_speech(text: str, output_path: str = "reply.mp3") -> str:
+    """
+    Synthesizes a given text into an audio file using Groq's TTS service.
+
+    Args:
+        text (str): The text to be synthesized.
+        output_path (str): The path where the output audio file will be saved. Defaults to "reply.mp3".
+
+    Returns:
+        str: The path to the output audio file, or None if the synthesis failed.
+    """
+    try:
+        client = Groq(api_key=GROQ_API_KEY)
+        response = client.audio.speech.create(
+            model="playai-tts",
+            voice="Aaliyah-PlayAI",
+            response_format="mp3",
+            input=text
+        )
+        
+        # Convert string path to Path object and stream the response to a file
+        path_obj = Path(output_path)
+        response.write_to_file(path_obj)
+        return str(path_obj)
+    except Exception as e:
+        print(f"TTS failed: {e}")
+        return None
+
+
+def speech_to_text(input_path: str) -> str:
+    """
+    Transcribe an audio file using Groq.
+
+    Args:
+        input_path (str): Path to the audio file to be transcribed.
+
+    Returns:
+        str: The transcribed text.
+    """
+    client = Groq(api_key=GROQ_API_KEY)
+    with open(input_path, "rb") as file:
+        transcription = client.audio.transcriptions.create(
+            model="distil-whisper-large-v3-en",
+            response_format="verbose_json",
+            file=(input_path, file.read())
+        )
+    return transcription.text
+
+
+
+
+def get_llm_response(text_input: str, image_input : str = None) -> str:
+    """
+    Get the response from the LLM given a text input and an optional image input.
+
+    Args:
+        text_input (str): The text to be sent to the LLM.
+        image_input (str, optional): The base64 encoded image to be sent to the LLM. Defaults to None.
+
+    Returns:
+        str: The response from the LLM.
+    """
+    messages = []
+    # print(bool(image_input))
+    if image_input:
+        messages.append({
+            "type": "image_url",
+            "image_url": {"url": f"data:image/jpeg;base64,{image_input}"}
+        })
+    messages.append({
+        "type": "text",
+        "text": text_input
+    })
+    try:
+        # client = Together(api_key=TOGETHER_API_KEY)  # alternative: Together API
+        client = OpenAI(base_url="https://api.llama.com/compat/v1/", api_key=LLAMA_API_KEY)
+        completion = client.chat.completions.create(
+            model="Llama-4-Maverick-17B-128E-Instruct-FP8",
+            messages=[
+                {
+                    "role": "user",
+                    "content": messages
+                }
+            ]
+        )
+        
+        if completion.choices and len(completion.choices) > 0:
+            return completion.choices[0].message.content
+        else:
+            print("Empty response from LLM API")
+            return None
+    except Exception as e:
+        print(f"LLM error: {e}")
+        return None
+
+
+
+
+
+
+
+async def fetch_media(media_id: str) -> str:
+    """
+    Fetches the URL of a media given its ID.
+
+    Args:
+        media_id (str): The ID of the media to fetch.
+
+    Returns:
+        str: The URL of the media.
+    """
+    url = "https://graph.facebook.com/v22.0/{media_id}"
+    async with httpx.AsyncClient() as client:
+        try:
+            response = await client.get(
+                url.format(media_id=media_id),
+                headers={"Authorization": f"Bearer {META_ACCESS_TOKEN}"}
+            )
+            if response.status_code == 200:
+                return response.json().get("url")
+            else:
+                print(f"Failed to fetch media: {response.text}")
+        except Exception as e:
+            print(f"Exception during media fetch: {e}")
+    return None
+
+async def handle_image_message(media_id: str) -> str:
+    """
+    Handle an image message by fetching the image media, converting it to base64,
+    and returning the base64 string.
+
+    Args:
+        media_id (str): The ID of the image media to fetch.
+
+    Returns:
+        str: The base64 string of the image.
+    """
+    media_url = await fetch_media(media_id)
+    # print(media_url)
+    async with httpx.AsyncClient() as client:
+        headers = {"Authorization": f"Bearer {META_ACCESS_TOKEN}"}
+        response = await client.get(media_url, headers=headers)
+        response.raise_for_status()
+
+        # Convert image to base64
+        image = Image.open(BytesIO(response.content))
+        buffered = BytesIO()
+        image.save(buffered, format="JPEG")  # Save as JPEG
+        # image.save("./test.jpeg", format="JPEG")  # Optional save
+        base64_image = base64.b64encode(buffered.getvalue()).decode("utf-8")
+        
+        return base64_image
+
+async def handle_audio_message(media_id: str):
+    """
+    Handle an audio message by fetching the audio media, writing it to a temporary file,
+    and then using Groq to transcribe the audio to text.
+
+    Args:
+        media_id (str): The ID of the audio media to fetch.
+
+    Returns:
+        str: The transcribed text.
+    """
+    media_url = await fetch_media(media_id)
+    # print(media_url)
+    async with httpx.AsyncClient() as client:
+        headers = {"Authorization": f"Bearer {META_ACCESS_TOKEN}"}
+        response = await client.get(media_url, headers=headers)
+
+        response.raise_for_status()
+        audio_bytes = response.content
+        temp_audio_path = "temp_audio.m4a"
+        with open(temp_audio_path, "wb") as f:
+            f.write(audio_bytes)
+        return speech_to_text(temp_audio_path)
+
+async def send_audio_message(to: str, file_path: str):
+    """
+    Send an audio message to a WhatsApp user.
+
+    Args:
+        to (str): The phone number of the recipient.
+        file_path (str): The path to the audio file to be sent.
+
+    Returns:
+        None
+
+    Raises:
+        None
+    """
+    url = f"https://graph.facebook.com/v20.0/{PHONE_NUMBER_ID}/media"
+    with open(file_path, "rb") as f:
+        files = {"file": ("reply.mp3", f, "audio/mpeg")}
+        params = {
+            "messaging_product": "whatsapp",
+            "type": "audio",
+            "access_token": META_ACCESS_TOKEN
+        }
+        response = requests.post(url, params=params, files=files)
+
+    if response.status_code == 200:
+        media_id = response.json().get("id")
+        payload = {
+            "messaging_product": "whatsapp",
+            "to": to,
+            "type": "audio",
+            "audio": {"id": media_id}
+        }
+        headers = {
+            "Authorization": f"Bearer {META_ACCESS_TOKEN}",
+            "Content-Type": "application/json"
+        }
+        requests.post(WHATSAPP_API_URL, headers=headers, json=payload)
+    else:
+        print("Audio upload failed:", response.text)

+ 48 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/requirements.txt

@@ -0,0 +1,48 @@
+aiohappyeyeballs==2.6.1
+aiohttp==3.11.16
+aiosignal==1.3.2
+annotated-types==0.7.0
+anyio==4.9.0
+async-timeout==5.0.1
+attrs==25.3.0
+certifi==2025.1.31
+charset-normalizer==3.4.1
+click==8.1.8
+colorama==0.4.6
+distro==1.9.0
+eval_type_backport==0.2.2
+exceptiongroup==1.2.2
+fastapi==0.115.12
+filelock==3.18.0
+frozenlist==1.5.0
+groq==0.22.0
+h11==0.14.0
+httpcore==1.0.8
+httpx==0.28.1
+idna==3.10
+markdown-it-py==3.0.0
+mdurl==0.1.2
+multidict==6.4.3
+numpy==2.2.4
+openai
+pillow==11.2.1
+propcache==0.3.1
+pyarrow==19.0.1
+pydantic==2.11.3
+pydantic_core==2.33.1
+Pygments==2.19.1
+python-dotenv==1.1.0
+requests==2.32.3
+rich==13.9.4
+shellingham==1.5.4
+sniffio==1.3.1
+starlette==0.46.2
+tabulate==0.9.0
+together==1.5.5
+tqdm==4.67.1
+typer==0.15.2
+typing-inspection==0.4.0
+typing_extensions==4.13.2
+urllib3==2.4.0
+uvicorn==0.34.1
+yarl==1.19.0

+ 70 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/webhook_main.py

@@ -0,0 +1,70 @@
+from fastapi import FastAPI, Request, BackgroundTasks
+from fastapi.responses import JSONResponse
+from pydantic import BaseModel
+from webhook_utils import send_audio_message, llm_reply_to_text_v2, audio_conversion
+import os
+from dotenv import load_dotenv
+
+load_dotenv()
+app = FastAPI()
+
+ACCESS_TOKEN = os.getenv("ACCESS_TOKEN")
+AGENT_URL = os.getenv("AGENT_URL")
+GROQ_API_KEY = os.getenv("GROQ_API_KEY")
+class WhatsAppMessage(BaseModel):
+    object: str
+    entry: list
+
+
+@app.get("/webhook")
+async def verify_webhook(request: Request):
+    """Echo hub.challenge back so Meta can verify the webhook subscription."""
+    mode = request.query_params.get("hub.mode")
+    token = request.query_params.get("hub.verify_token")
+    challenge = request.query_params.get("hub.challenge")
+    # VERIFY_TOKEN must match the verify token you entered in the app dashboard
+    if mode == "subscribe" and token == os.getenv("VERIFY_TOKEN"):
+        return int(challenge)
+    return JSONResponse(content={"error": "Invalid verification token"}, status_code=403)
+
+
+
+
+
+@app.post("/webhook")
+async def webhook_handler(request: Request, background_tasks: BackgroundTasks):
+    data = await request.json()
+    message_data = WhatsAppMessage(**data)
+    
+    change = message_data.entry[0]["changes"][0]["value"]
+    print(change)
+    if 'messages' in change:
+        message = change["messages"][-1]
+        user_phone = message["from"]
+        print(message)
+        if "text" in message:
+            user_message = message["text"]["body"].lower()
+            print(user_message)
+            background_tasks.add_task(llm_reply_to_text_v2, user_message, user_phone,None,None)
+        elif "image" in message:
+            media_id = message["image"]["id"]
+            print(media_id)
+            caption = message["image"].get("caption", "")
+            # background_tasks.add_task(handle_image_message, media_id, user_phone, caption)
+            background_tasks.add_task(llm_reply_to_text_v2,caption,user_phone,media_id,'image')
+        elif message.get("audio"):
+            media_id = message["audio"]["id"]
+            print(media_id)
+            path = await audio_conversion("",media_id,'audio')
+            # Send final audio reply
+            print(user_phone)
+            await send_audio_message(user_phone, path)
+        return JSONResponse(content={"status": "ok"})
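The payload traversal the handler above performs can be isolated into a small pure function, which is easier to unit-test than the route itself (a hypothetical `extract_message` helper, not part of this repo): a webhook delivery nests the user message under `entry[0].changes[0].value.messages`.

```python
def extract_message(payload: dict):
    """Return (sender, kind, content) for a delivered message, else None."""
    value = payload["entry"][0]["changes"][0]["value"]
    if "messages" not in value:
        return None  # e.g. a status update, not a user message
    message = value["messages"][-1]
    sender = message["from"]
    for kind in ("text", "image", "audio"):
        if kind in message:
            # text carries a body; image/audio carry a media id
            content = message[kind].get("body") or message[kind].get("id")
            return sender, kind, content
    return None

sample = {
    "object": "whatsapp_business_account",
    "entry": [{"changes": [{"value": {"messages": [
        {"from": "15551234567", "text": {"body": "Hi"}}
    ]}}]}],
}
print(extract_message(sample))
```

Factoring the traversal out this way lets the route body stay a thin dispatcher over `(sender, kind, content)`.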

+ 116 - 0
end-to-end-use-cases/whatsapp_llama_4_bot/webhook_utils.py

@@ -0,0 +1,116 @@
+import os
+import base64
+import asyncio
+import requests
+import httpx
+from PIL import Image
+from dotenv import load_dotenv
+from io import BytesIO
+
+load_dotenv()
+
+META_ACCESS_TOKEN = os.getenv("META_ACCESS_TOKEN")
+WHATSAPP_API_URL = os.getenv("WHATSAPP_API_URL")
+TOGETHER_API_KEY = os.getenv("TOGETHER_API_KEY")
+MEDIA_URL = "https://graph.facebook.com/v20.0/{media_id}"
+BASE_URL = os.getenv("BASE_URL")
+PHONE_NUMBER_ID = os.getenv("PHONE_NUMBER_ID")
+GROQ_API_KEY = os.getenv("GROQ_API_KEY")
+
+def send_message(to: str, text: str):
+    if not text:
+        print("Error: Message text is empty.")
+        return
+
+    payload = {
+        "messaging_product": "whatsapp",
+        "to": to,
+        "type": "text",
+        "text": {"body": text}
+    }
+
+    headers = {
+        "Authorization": f"Bearer {META_ACCESS_TOKEN}",
+        "Content-Type": "application/json"
+    }
+
+    response = requests.post(WHATSAPP_API_URL, headers=headers, json=payload)
+    if response.status_code == 200:
+        print("Message sent")
+    else:
+        print(f"Send failed: {response.text}")
+
+
+
+async def send_message_async(user_phone: str, message: str):
+    loop = asyncio.get_running_loop()
+    await loop.run_in_executor(None, send_message, user_phone, message)
+
+
+
+        
+async def send_audio_message(to: str, file_path: str):
+    url = f"https://graph.facebook.com/v20.0/{PHONE_NUMBER_ID}/media"
+    with open(file_path, "rb") as f:
+        files = {"file": ("reply.mp3", f, "audio/mpeg")}
+        params = {
+            "messaging_product": "whatsapp",
+            "type": "audio",
+            "access_token": META_ACCESS_TOKEN
+        }
+        response = requests.post(url, params=params, files=files)
+
+    if response.status_code == 200:
+        media_id = response.json().get("id")
+        payload = {
+            "messaging_product": "whatsapp",
+            "to": to,
+            "type": "audio",
+            "audio": {"id": media_id}
+        }
+        headers = {
+            "Authorization": f"Bearer {META_ACCESS_TOKEN}",
+            "Content-Type": "application/json"
+        }
+        requests.post(WHATSAPP_API_URL, headers=headers, json=payload)
+    else:
+        print("Audio upload failed:", response.text)
+
+
+
+
+
+
+AGENT_URL = os.getenv("AGENT_URL")  # backend endpoint serving /llm-response (see .env)
+
+async def llm_reply_to_text_v2(user_input: str, user_phone: str, media_id: str = None, kind: str = None):
+    try:
+        headers = {
+            "accept": "application/json",
+            "Content-Type": "application/json",
+        }
+        json_data = {
+            "user_input": user_input,
+            "media_id": media_id,
+            "kind": kind,
+        }
+        async with httpx.AsyncClient() as client:
+            response = await client.post(AGENT_URL, json=json_data, headers=headers, timeout=60)
+            response_data = response.json()
+            if response.status_code == 200 and response_data["error"] is None:
+                message_content = response_data["response"]
+                if message_content:
+                    await send_message_async(user_phone, message_content)
+                else:
+                    print("Error: Empty message content from LLM API")
+                    await send_message_async(user_phone, "Received empty response from LLM API.")
+            else:
+                print("Error: Invalid LLM API response", response_data)
+                await send_message_async(user_phone, "Failed to process the message due to an internal server error.")
+
+    except Exception as e:
+        print("LLM error:", e)
+        await send_message_async(user_phone, "Sorry, something went wrong while generating a response.")

BIN
src/docs/img/WhatApp_Llama4_integration.jpeg