10 months ago · 7a60d319bf
--- a/end-to-end-use-cases/powerpoint-to-voiceover-transcript/.gitignore
+++ b/end-to-end-use-cases/powerpoint-to-voiceover-transcript/.gitignore
@@ -73,6 +73,7 @@ cover/
 
				 
			
 
				 # Jupyter Notebook
			
 
				 .ipynb_checkpoints
			
 
				+pptx_to_vo_workflow.ipynb
			
 
				 
			
 
				 # IPython
			
 
				 profile_default/
			
--- a/end-to-end-use-cases/powerpoint-to-voiceover-transcript/README.md
+++ b/end-to-end-use-cases/powerpoint-to-voiceover-transcript/README.md
@@ -1,14 +1,15 @@
 
				 # PowerPoint to Voiceover Transcript
			
 
				 
			
 
				-A production-ready tool that converts PowerPoint presentations into AI-generated voiceover transcripts using Meta's Llama vision models. Designed for creating professional narration content from slide decks.
			
 
				+A Llama 4-powered solution that converts PowerPoint presentations into text-to-speech-ready voiceover transcripts. Designed for creating professional narration content from slide decks.
			
 
				 
			
 
				 ## Overview
			
 
				 
			
 
				-This system extracts speaker notes and visual content from PowerPoint files, then uses advanced AI vision models to generate natural-sounding transcripts optimized for human voiceover or text-to-speech systems. The generated transcripts include proper pronunciation of technical terms, numbers, and model names.
			
 
				+This system extracts speaker notes and visual content from PowerPoint files, then uses the Llama 4 Maverick model to generate natural-sounding transcripts optimized for human voiceover or text-to-speech systems. The generated transcripts include proper pronunciation of technical terms, numbers, and model names.
			
 
				 
			
 
				 ### Key Features
			
 
				 
			
 
				-- **AI-Powered Analysis**: Uses Llama vision models to understand slide content and context
			
 
				+- **AI-Powered Analysis**: Uses Llama 4 Maverick to understand slide content and context
			
 
				+- **Narrative Continuity**: Advanced workflow maintains context across slides for smooth transitions
			
 
				 - **Speech Optimization**: Converts numbers, decimals, and technical terms to spoken form
			
 
				 - **Flexible Processing**: Supports both individual slides and batch processing
			
 
				 - **Cross-Platform**: Works on Windows, macOS, and Linux
			
@@ -82,11 +83,18 @@ This system extracts speaker notes and visual content from PowerPoint files, the
 
				 
			
 
				 ### Basic Usage
			
 
				 
			
 
				-Run the main workflow notebook:
			
 
				+#### Narrative Continuity Workflow
			
 
				+For presentations requiring smooth narrative flow and consistent terminology:
			
 
				 ```bash
			
 
				-jupyter notebook pptx_to_vo_transcript.ipynb
			
 
				+jupyter notebook narrative_continuity_workflow.ipynb
			
 
				 ```
			
 
				 
			
 
				+This workflow uses previous slide transcripts as context to maintain narrative continuity and ensure smooth transitions between slides. Features include:
			
 
				+- **Context-aware processing**: Uses 5 previous slides as context by default
			
 
				+- **Consistent terminology**: Maintains terminology consistency throughout the presentation
			
 
				+- **Smooth transitions**: Generates natural flow between slides
			
 
				+- **Enhanced output**: Includes narrative context analysis and relationship mapping
			
 
				+
			
 
				 Or use the Python API:
			
 
				 ```python
			
 
				 from src.core.pptx_processor import pptx_to_images_and_notes
			
@@ -107,23 +115,24 @@ transcripts.to_csv("transcripts.csv", index=False)
 
				 
			
 
				 ```
			
 
				 powerpoint-to-voiceover-transcript/
			
 
				-├── README.md                     # This file
			
 
				-├── config.yaml                   # Main configuration
			
 
				-├── pyproject.toml                # Dependencies and project metadata
			
 
				-├── uv.lock                       # uv dependency lock file
			
 
				-├── pptx_to_vo_transcript.ipynb   # Main workflow notebook
			
 
				-├── .env.example                  # Environment template
			
 
				-├── input/                        # Place your PPTX files here
			
 
				+├── README.md                          # This file
			
 
				+├── config.yaml                        # Main configuration
			
 
				+├── pyproject.toml                     # Dependencies and project metadata
			
 
				+├── uv.lock                            # uv dependency lock file
			
 
				+├── narrative_continuity_workflow.ipynb # Enhanced narrative-aware workflow
			
 
				+├── .env.example                       # Environment template
			
 
				+├── input/                             # Place your PPTX files here
			
 
				 └── src/
			
 
				     ├── config/
			
 
				-    │   └── settings.py           # Configuration management
			
 
				+    │   └── settings.py                # Configuration management
			
 
				     ├── core/
			
 
				-    │   ├── file_utils.py         # File system utilities
			
 
				-    │   ├── image_processing.py   # Image encoding for API
			
 
				-    │   ├── llama_client.py       # Llama API integration
			
 
				-    │   └── pptx_processor.py     # PPTX extraction and conversion
			
 
				+    │   ├── file_utils.py              # File system utilities
			
 
				+    │   ├── image_processing.py        # Image encoding for API
			
 
				+    │   ├── llama_client.py            # Llama API integration
			
 
				+    │   └── pptx_processor.py          # PPTX extraction and conversion
			
 
				     └── processors/
			
 
				-        └── transcript_generator.py # AI transcript generation
			
 
				+        ├── transcript_generator.py    # Standard AI transcript generation
			
 
				+        └── narrative_transcript_generator.py # Narrative-aware processing
			
 
				 ```
			
 
				 
			
 
				 ## Configuration
			
@@ -165,6 +174,13 @@ Main class for generating AI transcripts.
 
				 - `process_slides_dataframe(df, output_dir)` - Process all slides
			
 
				 - `process_single_slide(image_path, speaker_notes)` - Process one slide
			
 
				 
			
 
				+#### `NarrativeTranscriptProcessor(context_window_size=5)`
			
 
				+Enhanced class for narrative-aware transcript generation.
			
 
				+
			
 
				+**Methods:**
			
 
				+- `process_slides_dataframe_with_narrative(df, output_dir)` - Process with context
			
 
				+- `process_single_slide_with_context(slide_number, slide_title, image_path, speaker_notes)` - Process one slide using previously processed slides as context
			
 
				+
			
 
				 ### Speech Optimization
			
 
				 
			
 
				 The AI automatically converts technical content for natural speech:
			
@@ -192,12 +208,14 @@ See `pyproject.toml` for complete dependency list.
 
				 
			
 
				 ## Output
			
 
				 
			
 
				-The system generates:
			
 
				+### Narrative Continuity Workflow Output
			
 
				+Enhanced output includes:
			
 
				 
			
 
				-1. **Slide Images**: High-resolution PNG/JPEG files
			
 
				-2. **Notes DataFrame**: Structured data with slide metadata
			
 
				-3. **AI Transcripts**: Speech-optimized voiceover content
			
 
				-4. **CSV Export**: Complete results for further processing
			
 
				+1. **Narrative-Aware Transcripts**: Context-aware voiceover content with smooth transitions
			
 
				+2. **Context Analysis**: Information about how previous slides influenced each transcript
			
 
				+3. **Narrative Summary**: Overall analysis of presentation flow and consistency
			
 
				+4. **Multiple Formats**: CSV and JSON exports with context information
			
 
				+5. **Context Files**: Detailed narrative context data for each slide
			
 
				 
			
 
				 ## Troubleshooting
			
 
				 
			
@@ -219,7 +237,8 @@ The system generates:
 
				 - Make sure you have Python 3.12+ installed
			
 
				 - Try `uv python install 3.12` to install Python via uv
			
 
				 
			
 
				+**"Context window too large"**
			
 
				+- Reduce the `context_window_size` parameter in the narrative workflow
			
 
				+- The default is 5 slides; try 3 for shorter presentations
			
 
				 
			
 
				 ---
			
 
				-
			
 
				-## **Ready to convert your presentations to professional voiceover content!** 🎙️
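The README describes the narrative workflow as using the 5 previous slides as context by default, with `context_window_size` as the knob to reduce. A minimal sketch of that sliding-window selection follows, assuming the earlier transcripts are held in a plain list of strings. The helper name `select_context` is hypothetical and is not part of this change. The explicit guard is there because a bare `items[-0:]` slice in Python returns the whole list rather than an empty one.

```python
from typing import List


def select_context(previous: List[str], window_size: int = 5) -> List[str]:
    """Return at most `window_size` of the most recent transcripts.

    Hypothetical helper for illustration only; not part of the project API.
    """
    if window_size <= 0:
        # previous[-0:] is the same as previous[0:], i.e. the full list,
        # so a zero (or negative) window is handled explicitly.
        return []
    return previous[-window_size:]


# Eight already-processed slides; only the last three are kept as context.
transcripts = [f"transcript for slide {n}" for n in range(1, 9)]
print(select_context(transcripts, window_size=3))
```

When fewer slides have been processed than the window allows, the slice simply returns everything available, so the first slides of a deck need no special handling.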
			
--- a/end-to-end-use-cases/powerpoint-to-voiceover-transcript/narrative_continuity_workflow.ipynb
+++ b/end-to-end-use-cases/powerpoint-to-voiceover-transcript/narrative_continuity_workflow.ipynb
@@ -0,0 +1,355 @@
 
				+{
			
 
				+ "cells": [
			
 
				+  {
			
 
				+   "cell_type": "markdown",
			
 
				+   "id": "6c33ba3a",
			
 
				+   "metadata": {},
			
 
				+   "source": [
			
 
				+    "# PowerPoint to Narrative-Aware Voiceover Transcript Generator\n",
			
 
				+    "\n",
			
 
				+    "This notebook demonstrates the complete workflow for converting PowerPoint presentations into AI-generated voiceover transcripts with narrative continuity using Llama 4 Maverick through the Llama API.\n",
			
 
				+    "\n",
			
 
				+    "## Overview\n",
			
 
				+    "\n",
			
 
				+    "This enhanced workflow performs the following operations:\n",
			
 
				+    "\n",
			
 
				+    "1. **Content Extraction**: Pulls speaker notes and visual elements from PowerPoint slides\n",
			
 
				+    "2. **Image Conversion**: Transforms slides into high-quality images for AI analysis\n",
			
 
				+    "3. **Narrative-Aware Processing**: Uses previous slide transcripts as context for continuity\n",
			
 
				+    "4. **Transcript Generation**: Creates natural-sounding voiceover content with smooth transitions\n",
			
 
				+    "5. **Speech Optimization**: Converts numbers, technical terms, and abbreviations to spoken form\n",
			
 
				+    "6. **Results Export**: Saves transcripts and context information in multiple formats\n",
			
 
				+    "\n",
			
 
				+    "## Prerequisites\n",
			
 
				+    "\n",
			
 
				+    "Before running this notebook, ensure you have:\n",
			
 
				+    "- Created a `.env` file with your `LLAMA_API_KEY`\n",
			
 
				+    "- Updated `config.yaml` with your presentation file path\n",
			
 
				+    "---"
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "markdown",
			
 
				+   "id": "d8965447",
			
 
				+   "metadata": {},
			
 
				+   "source": [
			
 
				+    "## Setup and Configuration\n",
			
 
				+    "\n",
			
 
				+    "Import required libraries and load environment configuration."
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "code",
			
 
				+   "execution_count": null,
			
 
				+   "id": "21a962b2",
			
 
				+   "metadata": {},
			
 
				+   "outputs": [],
			
 
				+   "source": [
			
 
				+    "# Import required libraries\n",
			
 
				+    "import pandas as pd\n",
			
 
				+    "import os\n",
			
 
				+    "from pathlib import Path\n",
			
 
				+    "from dotenv import load_dotenv\n",
			
 
				+    "import matplotlib.pyplot as plt\n",
			
 
				+    "from IPython.display import display\n",
			
 
				+    "\n",
			
 
				+    "# Load environment variables from .env file\n",
			
 
				+    "load_dotenv()\n",
			
 
				+    "\n",
			
 
				+    "# Verify setup\n",
			
 
				+    "if os.getenv('LLAMA_API_KEY'):\n",
			
 
				+    "    print(\"SUCCESS: Environment loaded successfully!\")\n",
			
 
				+    "    print(\"SUCCESS: Llama API key found\")\n",
			
 
				+    "else:\n",
			
 
				+    "    print(\"WARNING: LLAMA_API_KEY not found in .env file\")\n",
			
 
				+    "    print(\"Please check your .env file and add your API key\")"
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "code",
			
 
				+   "execution_count": null,
			
 
				+   "id": "71c1c8bd",
			
 
				+   "metadata": {},
			
 
				+   "outputs": [],
			
 
				+   "source": [
			
 
				+    "# Import custom modules\n",
			
 
				+    "try:\n",
			
 
				+    "    from src.core.pptx_processor import extract_pptx_notes, pptx_to_images_and_notes\n",
			
 
				+    "    from src.processors.narrative_transcript_generator import (\n",
			
 
				+    "        NarrativeTranscriptProcessor,\n",
			
 
				+    "        process_slides_with_narrative\n",
			
 
				+    "    )\n",
			
 
				+    "    from src.config.settings import load_config, get_config\n",
			
 
				+    "\n",
			
 
				+    "    print(\"SUCCESS: All modules imported successfully!\")\n",
			
 
				+    "    print(\"- PPTX processor ready\")\n",
			
 
				+    "    print(\"- Narrative transcript generator ready\")\n",
			
 
				+    "    print(\"- Configuration manager ready\")\n",
			
 
				+    "\n",
			
 
				+    "except ImportError as e:\n",
			
 
				+    "    print(f\"ERROR: Import error: {e}\")\n",
			
 
				+    "    print(\"Make sure you're running from the project root directory\")"
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "code",
			
 
				+   "execution_count": null,
			
 
				+   "id": "53781172",
			
 
				+   "metadata": {},
			
 
				+   "outputs": [],
			
 
				+   "source": [
			
 
				+    "# Load and display configuration\n",
			
 
				+    "config = load_config()\n",
			
 
				+    "print(\"SUCCESS: Configuration loaded successfully!\")\n",
			
 
				+    "print(\"\\nCurrent Settings:\")\n",
			
 
				+    "print(f\"- Llama Model: {config['api']['llama_model']}\")\n",
			
 
				+    "print(f\"- Image DPI: {config['processing']['default_dpi']}\")\n",
			
 
				+    "print(f\"- Image Format: {config['processing']['default_format']}\")\n",
			
 
				+    "print(f\"- Context Window: 5 previous slides (default)\")"
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "code",
			
 
				+   "execution_count": null,
			
 
				+   "id": "9386e035",
			
 
				+   "metadata": {},
			
 
				+   "outputs": [],
			
 
				+   "source": [
			
 
				+    "# Configure file paths from config.yaml\n",
			
 
				+    "pptx_file = config['current_project']['pptx_file'] + config['current_project']['extension']\n",
			
 
				+    "output_dir = config['current_project']['output_dir']\n",
			
 
				+    "\n",
			
 
				+    "print(\"File Configuration:\")\n",
			
 
				+    "print(f\"- Input File: {pptx_file}\")\n",
			
 
				+    "print(f\"- Output Directory: {output_dir}\")\n",
			
 
				+    "\n",
			
 
				+    "# Verify input file exists\n",
			
 
				+    "if Path(pptx_file).exists():\n",
			
 
				+    "    file_size = Path(pptx_file).stat().st_size / 1024 / 1024\n",
			
 
				+    "    print(f\"- SUCCESS: Input file found ({file_size:.1f} MB)\")\n",
			
 
				+    "else:\n",
			
 
				+    "    print(f\"- ERROR: Input file not found: {pptx_file}\")\n",
			
 
				+    "    print(\"  Please update the 'pptx_file' path in config.yaml\")\n",
			
 
				+    "\n",
			
 
				+    "# Create output directory if needed\n",
			
 
				+    "Path(output_dir).mkdir(parents=True, exist_ok=True)\n",
			
 
				+    "print(f\"- SUCCESS: Output directory ready\")"
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "markdown",
			
 
				+   "id": "ea4851e6",
			
 
				+   "metadata": {},
			
 
				+   "source": [
			
 
				+    "---\n",
			
 
				+    "## Processing Pipeline\n",
			
 
				+    "\n",
			
 
				+    "Execute the main processing pipeline in three key steps."
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "markdown",
			
 
				+   "id": "0f098fdf",
			
 
				+   "metadata": {},
			
 
				+   "source": [
			
 
				+    "### Step 1: Extract Content and Convert to Images\n",
			
 
				+    "\n",
			
 
				+    "Extract speaker notes and slide text, then convert the presentation to high-quality images for AI analysis."
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "code",
			
 
				+   "execution_count": null,
			
 
				+   "id": "644ee94c",
			
 
				+   "metadata": {},
			
 
				+   "outputs": [],
			
 
				+   "source": [
			
 
				+    "print(\"PROCESSING: Converting PPTX to images and extracting notes...\")\n",
			
 
				+    "\n",
			
 
				+    "result = pptx_to_images_and_notes(\n",
			
 
				+    "    pptx_path=pptx_file,\n",
			
 
				+    "    output_dir=output_dir,\n",
			
 
				+    "    extract_notes=True\n",
			
 
				+    ")\n",
			
 
				+    "\n",
			
 
				+    "notes_df = result['notes_df']\n",
			
 
				+    "image_files = result['image_files']\n",
			
 
				+    "\n",
			
 
				+    "print(f\"\\nSUCCESS: Processing completed successfully!\")\n",
			
 
				+    "print(f\"- Processed {len(image_files)} slides\")\n",
			
 
				+    "print(f\"- Images saved to: {result['output_dir']}\")\n",
			
 
				+    "print(f\"- Found notes on {notes_df['has_notes'].sum()} slides\")\n",
			
 
				+    "print(f\"- DataFrame shape: {notes_df.shape}\")\n",
			
 
				+    "\n",
			
 
				+    "# Show sample data\n",
			
 
				+    "print(\"\\nSample Data (First 5 slides):\")\n",
			
 
				+    "display(notes_df[['slide_number', 'slide_title', 'has_notes', 'notes_word_count', 'slide_text_word_count']].head())"
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "markdown",
			
 
				+   "id": "1f95749d",
			
 
				+   "metadata": {},
			
 
				+   "source": [
			
 
				+    "### Step 2: Generate Narrative-Aware AI Transcripts\n",
			
 
				+    "\n",
			
 
				+    "Use the Llama vision model to analyze each slide image and generate natural-sounding voiceover transcripts with narrative continuity.\n",
			
 
				+    "\n",
			
 
				+    "This process:\n",
			
 
				+    "- Analyzes slide visual content using AI vision\n",
			
 
				+    "- Uses transcripts from previous slides as context\n",
			
 
				+    "- Combines slide content with speaker notes\n",
			
 
				+    "- Generates speech-optimized transcripts with smooth transitions\n",
			
 
				+    "- Maintains consistent terminology throughout the presentation\n",
			
 
				+    "- Converts numbers and technical terms to spoken form"
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "code",
			
 
				+   "execution_count": null,
			
 
				+   "id": "fe564b99",
			
 
				+   "metadata": {},
			
 
				+   "outputs": [],
			
 
				+   "source": [
			
 
				+    "print(\"PROCESSING: Starting narrative-aware AI transcript generation...\")\n",
			
 
				+    "print(f\"- Processing {len(notes_df)} slides\")\n",
			
 
				+    "print(f\"- Using model: {config['api']['llama_model']}\")\n",
			
 
				+    "print(f\"- Context window: 5 previous slides\")\n",
			
 
				+    "print(f\"- Using previous transcripts as context for narrative continuity\")\n",
			
 
				+    "print(\"- This may take several minutes...\")\n",
			
 
				+    "\n",
			
 
				+    "# Initialize processor and generate transcripts with narrative continuity\n",
			
 
				+    "processor = NarrativeTranscriptProcessor(context_window_size=5)\n",
			
 
				+    "processed_df = processor.process_slides_dataframe_with_narrative(\n",
			
 
				+    "    df=notes_df,\n",
			
 
				+    "    output_dir=output_dir,\n",
			
 
				+    "    save_context=True\n",
			
 
				+    ")\n",
			
 
				+    "\n",
			
 
				+    "print(f\"\\nSUCCESS: Narrative-aware transcript generation completed!\")\n",
			
 
				+    "print(f\"- Generated {len(processed_df)} transcripts\")\n",
			
 
				+    "print(f\"- Average length: {processed_df['ai_transcript'].str.len().mean():.0f} characters\")\n",
			
 
				+    "print(f\"- Total words: {processed_df['ai_transcript'].str.split().str.len().sum():,}\")\n",
			
 
				+    "print(f\"- Context information saved to: {Path(output_dir) / 'narrative_context'}\")"
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "markdown",
			
 
				+   "id": "5cff4b70",
			
 
				+   "metadata": {},
			
 
				+   "source": [
			
 
				+    "### Step 3: Save Results\n",
			
 
				+    "\n",
			
 
				+    "Save results in multiple formats for different use cases."
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "code",
			
 
				+   "execution_count": null,
			
 
				+   "id": "8463ac3a",
			
 
				+   "metadata": {},
			
 
				+   "outputs": [],
			
 
				+   "source": [
			
 
				+    "print(\"PROCESSING: Saving results in multiple formats...\")\n",
			
 
				+    "\n",
			
 
				+    "# Create output directory\n",
			
 
				+    "os.makedirs(output_dir, exist_ok=True)\n",
			
 
				+    "\n",
			
 
				+    "# Save complete results with all metadata\n",
			
 
				+    "output_file = Path(output_dir) / \"narrative_transcripts.csv\"\n",
			
 
				+    "processed_df.to_csv(output_file, index=False)\n",
			
 
				+    "print(f\"- SUCCESS: Complete results saved to {output_file}\")\n",
			
 
				+    "\n",
			
 
				+    "# Save transcript-only version for voiceover work\n",
			
 
				+    "transcript_only = processed_df[['slide_number', 'slide_title', 'ai_transcript', 'context_slides_used']]\n",
			
 
				+    "transcript_file = Path(output_dir) / \"narrative_transcripts_clean.csv\"\n",
			
 
				+    "transcript_only.to_csv(transcript_file, index=False)\n",
			
 
				+    "print(f\"- SUCCESS: Clean transcripts saved to {transcript_file}\")\n",
			
 
				+    "\n",
			
 
				+    "# Save as JSON for API integration\n",
			
 
				+    "json_file = Path(output_dir) / \"narrative_transcripts.json\"\n",
			
 
				+    "processed_df.to_json(json_file, orient='records', indent=2)\n",
			
 
				+    "print(f\"- SUCCESS: JSON format saved to {json_file}\")\n",
			
 
				+    "\n",
			
 
				+    "# Summary statistics\n",
			
 
				+    "total_words = processed_df['ai_transcript'].str.split().str.len().sum()\n",
			
 
				+    "reading_time = total_words / 150  # Assuming 150 words per minute\n",
			
 
				+    "\n",
			
 
				+    "print(f\"\\nExport Summary:\")\n",
			
 
				+    "print(f\"- Total slides processed: {len(processed_df)}\")\n",
			
 
				+    "print(f\"- Slides with speaker notes: {processed_df['has_notes'].sum()}\")\n",
			
 
				+    "print(f\"- Total transcript words: {total_words:,}\")\n",
			
 
				+    "print(f\"- Average transcript length: {processed_df['ai_transcript'].str.len().mean():.0f} characters\")\n",
			
 
				+    "print(f\"- Estimated reading time: {reading_time:.1f} minutes\")\n",
			
 
				+    "print(f\"- Average context slides per slide: {processed_df['context_slides_used'].mean():.1f}\")"
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "markdown",
			
 
				+   "id": "8728d2ac",
			
 
				+   "metadata": {},
			
 
				+   "source": [
			
 
				+    "---\n",
			
 
				+    "# Completion Summary\n",
			
 
				+    "\n",
			
 
				+    "## Successfully Generated:\n",
			
 
				+    "- **Narrative-Aware Transcripts**: Context-aware voiceover content with smooth transitions\n",
			
 
				+    "- **Consistent Terminology**: Maintained terminology consistency throughout presentation\n",
			
 
				+    "- **Multiple Formats**: CSV, JSON exports for different use cases\n",
			
 
				+    "- **Context Analysis**: Detailed information about narrative flow and relationships\n",
			
 
				+    "\n",
			
 
				+    "## Output Files:\n",
			
 
				+    "- `narrative_transcripts.csv` - Complete dataset with context information\n",
			
 
				+    "- `narrative_transcripts_clean.csv` - Clean transcripts for voiceover work\n",
			
 
				+    "- `narrative_transcripts.json` - JSON format for API integration\n",
			
 
				+    "- `narrative_context/slide_contexts.json` - Individual slide context data\n",
			
 
				+    "- `narrative_context/narrative_summary.json` - Overall narrative analysis\n",
			
 
				+    "- Individual slide images in PNG/JPEG format\n",
			
 
				+    "\n",
			
 
				+    "## Next Steps:\n",
			
 
				+    "1. **Review** generated transcripts for narrative flow and accuracy\n",
			
 
				+    "2. **Edit** any content that needs refinement\n",
			
 
				+    "3. **Create** voiceover recordings or use TTS systems\n",
			
 
				+    "4. **Integrate** JSON data into your video production workflow\n",
			
 
				+    "\n",
			
 
				+    "## Tips for Better Results:\n",
			
 
				+    "- **Rich Speaker Notes**: Slides with detailed notes generate better contextual transcripts\n",
			
 
				+    "- **Clear Visuals**: High-contrast slides with readable text work best\n",
			
 
				+    "- **Consistent Style**: Maintain consistent formatting across your presentation\n",
			
 
				+    "- **Context Window**: Adjust context window size (3-7 slides) based on presentation complexity\n",
			
 
				+    "- **Review Context**: Check the narrative_context files to understand how continuity was maintained\n",
			
 
				+    "\n",
			
 
				+    "---"
			
 
				+   ]
			
 
				+  },
			
 
				+  {
			
 
				+   "cell_type": "code",
			
 
				+   "execution_count": null,
			
 
				+   "id": "7122cdf6-667e-4ae4-8ce7-67cfc32577c8",
			
 
				+   "metadata": {},
			
 
				+   "outputs": [],
			
 
				+   "source": []
			
 
				+  }
			
 
				+ ],
			
 
				+ "metadata": {
			
 
				+  "kernelspec": {
			
 
				+   "display_name": "promptTesting",
			
 
				+   "language": "python",
			
 
				+   "name": "prompttesting"
			
 
				+  },
			
 
				+  "language_info": {
			
 
				+   "codemirror_mode": {
			
 
				+    "name": "ipython",
			
 
				+    "version": 3
			
 
				+   },
			
 
				+   "file_extension": ".py",
			
 
				+   "mimetype": "text/x-python",
			
 
				+   "name": "python",
			
 
				+   "nbconvert_exporter": "python",
			
 
				+   "pygments_lexer": "ipython3",
			
 
				+   "version": "3.13.2"
			
 
				+  }
			
 
				+ },
			
 
				+ "nbformat": 4,
			
 
				+ "nbformat_minor": 5
			
 
				+}
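The export cell in the notebook estimates narration time by dividing the total transcript word count by 150 words per minute. A minimal standard-library sketch of the same arithmetic follows, assuming the transcripts are available as plain strings. The function name `estimate_reading_minutes` and the sample text are illustrative only and do not appear in this change.

```python
from typing import Iterable


def estimate_reading_minutes(
    transcripts: Iterable[str], words_per_minute: int = 150
) -> float:
    """Estimate narration time in minutes from whitespace-separated word counts."""
    total_words = sum(len(text.split()) for text in transcripts)
    return total_words / words_per_minute


# Illustrative input; real transcripts would come from the 'ai_transcript' column.
sample = ["Welcome to the presentation.", "Today we cover three topics."]
print(f"{estimate_reading_minutes(sample):.2f} minutes")
```

The 150 words-per-minute rate is the figure the notebook assumes; a slower or faster narrator can be modeled by passing a different `words_per_minute`.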
			
--- a/end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/processors/narrative_transcript_generator.py
+++ b/end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/processors/narrative_transcript_generator.py
@@ -0,0 +1,254 @@
 
				+"""Simplified transcript generation processor with narrative continuity using previous slide transcripts."""
			
 
				+
			
 
				+import json
			
 
				+from pathlib import Path
			
 
				+from typing import Any, Dict, List, Optional, Union
			
 
				+
			
 
				+import pandas as pd
			
 
				+from tqdm import tqdm
			
 
				+
			
 
				+from ..config.settings import get_system_prompt
			
 
				+from ..core.llama_client import LlamaClient
			
 
				+
			
 
				+
			
 
				+class SlideContext:
			
 
				+    """Simple container for slide context information."""
			
 
				+
			
 
				+    def __init__(self, slide_number: int, title: str, transcript: str):
			
 
				+        self.slide_number = slide_number
			
 
				+        self.title = title
			
 
				+        self.transcript = transcript
			
 
				+
			
 
				+    def to_dict(self) -> Dict[str, Any]:
			
 
				+        return {
			
 
				+            "slide_number": int(self.slide_number),  # Convert to native Python int
			
 
				+            "title": str(self.title),  # Ensure it's a string
			
 
				+            "transcript": str(self.transcript),  # Ensure it's a string
			
 
				+        }
			
 
				+
			
 
				+
			
 
				+class NarrativeTranscriptProcessor:
			
 
				+    """Simplified processor for generating transcripts with narrative continuity using previous slide transcripts."""
			
 
				+
			
 
				+    def __init__(self, api_key: Optional[str] = None, context_window_size: int = 5):
			
 
				+        """
			
 
				+        Initialize narrative transcript processor.
			
 
				+
			
 
				+        Args:
			
 
				+            api_key: Llama API key. If None, will be loaded from config/environment.
			
 
				+            context_window_size: Number of previous slides to include in context (default: 5)
			
 
				+        """
			
 
				+        self.client = LlamaClient(api_key=api_key)
			
 
				+        self.context_window_size = context_window_size
			
 
				+        self.slide_contexts: List[SlideContext] = []
			
 
				+
			
 
				+    def _build_context_prompt(
			
 
				+        self, current_slide_number: int, slide_contexts: List[SlideContext]
			
 
				+    ) -> str:
			
 
				+        """
			
 
				+        Build enhanced system prompt with previous slide transcripts as context.
			
 
				+
			
 
				+        Args:
			
 
				+            current_slide_number: Number of the current slide being processed
			
 
				+            slide_contexts: List of previous slide contexts
			
 
				+
			
 
				+        Returns:
			
 
				+            Enhanced system prompt with context
			
 
				+        """
			
 
				+        base_prompt = get_system_prompt()
			
 
				+
			
 
				+        if not slide_contexts:
			
 
				+            return base_prompt
			
 
				+
			
 
				+        # Build context section
			
 
				+        context_section = "\n\n## PREVIOUS SLIDE CONTEXT\n\n"
			
 
				+        context_section += f"You are currently processing slide {current_slide_number} of this presentation. "
			
 
				+        context_section += "Here are the transcripts from the previous slides to maintain narrative continuity:\n\n"
			
 
				+
			
 
				+        # Add previous slides context (use last N slides based on context window)
			
 
				+        recent_contexts = slide_contexts[-self.context_window_size :] if self.context_window_size > 0 else []
			
 
				+
			
 
				+        for context in recent_contexts:
			
 
				+            context_section += (
			
 
				+                f'**Slide {context.slide_number} - "{context.title}":**\n'
			
 
				+            )
			
 
				+            context_section += f"{context.transcript}\n\n"
			
 
				+
			
 
				+        # Add continuity instructions
			
 
				+        continuity_instructions = """
			
 
				+## NARRATIVE CONTINUITY REQUIREMENTS
			
 
				+
			
 
				+When generating the transcript for this slide, ensure:
			
 
				+
			
 
				+1. **Smooth Transitions**: Reference previous concepts when appropriate (e.g., "Building on what we discussed about...", "As we saw in the previous section...")
			
 
				+
			
 
				+2. **Consistent Terminology**: Use the same terms and definitions established in previous slides
			
 
				+
			
 
				+3. **Logical Flow**: Ensure this slide's content logically follows from previous slides
			
 
				+
			
 
				+4. **Avoid Repetition**: Don't repeat information already covered unless it's for emphasis or summary
			
 
				+
			
 
				+5. **Forward References**: If this slide sets up future content, use appropriate language (e.g., "We'll explore this further...", "This leads us to...")
			
 
				+
			
 
				+6. **Contextual Awareness**: Understand where this slide fits in the overall presentation narrative
			
 
				+
			
 
				+"""
			
 
				+
			
 
				+        return base_prompt + context_section + continuity_instructions
			
 
				+
			
 
				+    def process_single_slide_with_context(
			
 
				+        self,
			
 
				+        slide_number: int,
			
 
				+        slide_title: str,
			
 
				+        image_path: Union[str, Path],
			
 
				+        speaker_notes: str = "",
			
 
				+    ) -> str:
			
 
				+        """
			
 
				+        Process a single slide with context from previous slides.
			
 
				+
			
 
				+        Args:
			
 
				+            slide_number: Number of the current slide
			
 
				+            slide_title: Title of the current slide
			
 
				+            image_path: Path to the slide image
			
 
				+            speaker_notes: Speaker notes for the slide
			
 
				+
			
 
				+        Returns:
			
 
				+            Generated transcript text with narrative continuity
			
 
				+        """
			
 
				+        # Build context-aware system prompt
			
 
				+        enhanced_prompt = self._build_context_prompt(slide_number, self.slide_contexts)
			
 
				+
			
 
				+        # Generate transcript with context
			
 
				+        transcript = self.client.generate_transcript(
			
 
				+            image_path=str(image_path),
			
 
				+            speaker_notes=speaker_notes,
			
 
				+            system_prompt=enhanced_prompt,
			
 
				+            stream=False,
			
 
				+        )
			
 
				+
			
 
				+        # Create and store slide context for future slides
			
 
				+        slide_context = SlideContext(
			
 
				+            slide_number=slide_number,
			
 
				+            title=slide_title,
			
 
				+            transcript=transcript,
			
 
				+        )
			
 
				+
			
 
				+        self.slide_contexts.append(slide_context)
			
 
				+
			
 
				+        return transcript
			
 
				+
			
 
				+    def process_slides_dataframe_with_narrative(
			
 
				+        self,
			
 
				+        df: pd.DataFrame,
			
 
				+        output_dir: Union[str, Path],
			
 
				+        save_context: bool = True,
			
 
				+    ) -> pd.DataFrame:
			
 
				+        """
			
 
				+        Process slides from a DataFrame with narrative continuity.
			
 
				+
			
 
				+        Args:
			
 
				+            df: DataFrame with slide information (from extract_pptx_notes)
			
 
				+            output_dir: Directory containing slide images (context files go in its narrative_context subdirectory)
			
 
				+            save_context: Whether to save context information to file
			
 
				+
			
 
				+        Returns:
			
 
				+            DataFrame with added 'ai_transcript' and 'context_slides_used' columns
			
 
				+        """
			
 
				+        output_dir = Path(output_dir)
			
 
				+        df_copy = df.copy()
			
 
				+
			
 
				+        print(f"Processing {len(df_copy)} slides with narrative continuity...")
			
 
				+        print(f"Using context window of {self.context_window_size} previous slides")
			
 
				+
			
 
				+        for i in tqdm(range(len(df_copy)), desc="Processing slides with context"):
			
 
				+            # Get data for current slide
			
 
				+            slide_row = df_copy.iloc[i]
			
 
				+            slide_number = slide_row["slide_number"]
			
 
				+            raw_title = slide_row.get("slide_title", "")
			
 
				+            slide_title = raw_title if pd.notna(raw_title) else ""
			
 
				+            slide_filename = slide_row["image_filename"]
			
 
				+            speaker_notes = (
			
 
				+                slide_row["speaker_notes"]
			
 
				+                if pd.notna(slide_row["speaker_notes"])
			
 
				+                else ""
			
 
				+            )
			
 
				+
			
 
				+            image_path = output_dir / slide_filename
			
 
				+
			
 
				+            # Generate transcript with narrative context
			
 
				+            transcript = self.process_single_slide_with_context(
			
 
				+                slide_number=slide_number,
			
 
				+                slide_title=slide_title,
			
 
				+                image_path=image_path,
			
 
				+                speaker_notes=speaker_notes,
			
 
				+            )
			
 
				+
			
 
				+            # Add to dataframe (i is positional, so map it to the index label for .loc)
			
 
				+            df_copy.loc[df_copy.index[i], "ai_transcript"] = transcript
			
 
				+            df_copy.loc[df_copy.index[i], "context_slides_used"] = min(
			
 
				+                len(self.slide_contexts) - 1, self.context_window_size
			
 
				+            )
			
 
				+
			
 
				+        # Save context information if requested
			
 
				+        if save_context:
			
 
				+            self._save_context_information(output_dir)
			
 
				+
			
 
				+        return df_copy
			
 
				+
			
 
				+    def _save_context_information(self, output_dir: Path) -> None:
			
 
				+        """Write slide contexts and a narrative summary as JSON under output_dir/narrative_context."""
			
 
				+        context_dir = output_dir / "narrative_context"
			
 
				+        context_dir.mkdir(parents=True, exist_ok=True)
			
 
				+
			
 
				+        # Save slide contexts
			
 
				+        contexts_data = [context.to_dict() for context in self.slide_contexts]
			
 
				+        with open(context_dir / "slide_contexts.json", "w") as f:
			
 
				+            json.dump(contexts_data, f, indent=2)
			
 
				+
			
 
				+        # Save simple summary
			
 
				+        summary = {
			
 
				+            "total_slides": len(self.slide_contexts),
			
 
				+            "context_window_size": self.context_window_size,
			
 
				+            "slide_progression": [
			
 
				+                {
			
 
				+                    "slide_number": int(
			
 
				+                        ctx.slide_number
			
 
				+                    ),  # Convert to native Python int
			
 
				+                    "title": str(ctx.title),  # Ensure it's a string
			
 
				+                }
			
 
				+                for ctx in self.slide_contexts
			
 
				+            ],
			
 
				+        }
			
 
				+
			
 
				+        with open(context_dir / "narrative_summary.json", "w") as f:
			
 
				+            json.dump(summary, f, indent=2)
			
 
				+
			
 
				+        print(f"Context information saved to: {context_dir}")
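The summary written by `_save_context_information` can be read back for inspection. The following is a minimal sketch, assuming the JSON layout shown in the diff (`total_slides`, `context_window_size`, `slide_progression`). The helpers `load_narrative_summary` and `slide_titles` are hypothetical names that are not part of this change, and the sample data is invented for the demonstration.

```python
import json
import tempfile
from pathlib import Path


def load_narrative_summary(output_dir):
    """Hypothetical helper: read narrative_summary.json from under output_dir."""
    path = Path(output_dir) / "narrative_context" / "narrative_summary.json"
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def slide_titles(summary):
    """Return slide titles ordered by slide number."""
    ordered = sorted(summary["slide_progression"], key=lambda e: e["slide_number"])
    return [entry["title"] for entry in ordered]


# Invented sample data in the same shape as the summary dict in the diff.
sample = {
    "total_slides": 2,
    "context_window_size": 5,
    "slide_progression": [
        {"slide_number": 2, "title": "Method"},
        {"slide_number": 1, "title": "Intro"},
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    # Recreate the directory layout used by _save_context_information.
    context_dir = Path(tmp) / "narrative_context"
    context_dir.mkdir(parents=True, exist_ok=True)
    (context_dir / "narrative_summary.json").write_text(
        json.dumps(sample, indent=2), encoding="utf-8"
    )
    loaded = load_narrative_summary(tmp)
    titles = slide_titles(loaded)
```

Sorting by `slide_number` is a defensive choice for the sketch; the method in the diff already writes entries in processing order.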
			
 
				+
			
 
				+
			
 
				+# Convenience function for backward compatibility
			
 
				+def process_slides_with_narrative(
			
 
				+    df: pd.DataFrame,
			
 
				+    output_dir: Union[str, Path] = "slide_images",
			
 
				+    api_key: Optional[str] = None,
			
 
				+    context_window_size: int = 5,
			
 
				+    save_context: bool = True,
			
 
				+) -> pd.DataFrame:
			
 
				+    """
			
 
				+    Process slides from a DataFrame to generate transcripts with narrative continuity.
			
 
				+
			
 
				+    Args:
			
 
				+        df: DataFrame with slide information (from extract_pptx_notes)
			
 
				+        output_dir: Directory containing slide images
			
 
				+        api_key: Llama API key. If None, will be loaded from config/environment.
			
 
				+        context_window_size: Number of previous slides to include in context (default: 5)
			
 
				+        save_context: Whether to save context information to files
			
 
				+
			
 
				+    Returns:
			
 
				+        DataFrame with added transcript and context columns
			
 
				+    """
			
 
				+    processor = NarrativeTranscriptProcessor(
			
 
				+        api_key=api_key, context_window_size=context_window_size
			
 
				+    )
			
 
				+    return processor.process_slides_dataframe_with_narrative(
			
 
				+        df, output_dir, save_context
			
 
				+    )
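The rolling context window behind `context_window_size` can be illustrated without the API client. This is a sketch under stated assumptions, not the code from this change: `SlideContext` here is a simplified stand-in, `build_context_section`, `narrate`, and `fake_generate` are hypothetical names, and the "Previous slides:" prompt format is invented because `_build_context_prompt` is not fully shown above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SlideContext:
    """Simplified stand-in for the SlideContext used in the diff."""
    slide_number: int
    title: str
    transcript: str


def build_context_section(history: List[SlideContext], window: int) -> str:
    """Describe at most `window` of the most recent slides (hypothetical format)."""
    if window <= 0 or not history:
        return ""
    recent = history[-window:]
    lines = [f"Slide {c.slide_number} ({c.title}): {c.transcript}" for c in recent]
    return "Previous slides:\n" + "\n".join(lines)


def narrate(
    slides: List[Tuple[int, str]],
    generate: Callable[[str, str], str],
    window: int = 5,
) -> List[str]:
    """Generate one transcript per slide, feeding earlier transcripts forward."""
    history: List[SlideContext] = []
    transcripts: List[str] = []
    for number, title in slides:
        context = build_context_section(history, window)
        transcript = generate(title, context)
        # Store the result so later slides can refer back to it.
        history.append(SlideContext(number, title, transcript))
        transcripts.append(transcript)
    return transcripts


def fake_generate(title: str, context: str) -> str:
    """Stub for the model call: reports how many context slides it received."""
    seen = len(context.splitlines()) - 1 if context else 0
    return f"{title} [context slides: {seen}]"


result = narrate(
    [(1, "Intro"), (2, "Method"), (3, "Results"), (4, "Summary")],
    fake_generate,
    window=2,
)
```

With `window=2`, the stub sees 0, 1, 2, and 2 context slides, which matches the `min(len(self.slide_contexts) - 1, self.context_window_size)` value recorded in `context_slides_used`.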