
improved with knowledge grounding

Yuce Dincer 2 months ago
parent
commit
2e4e5b723e
37 changed files with 5883 additions and 536 deletions
  1. 3 3
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/.env.example
  2. 4 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/.gitignore
  3. 453 183
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/README.md
  4. 63 2
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/config.yaml
  5. Binary
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/knowledge_base/.faiss_cache/chunks.pkl
  6. Binary
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/knowledge_base/.faiss_cache/faiss.index
  7. 13 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/knowledge_base/.faiss_cache/metadata.json
  8. 145 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/knowledge_base/llama diet.md
  9. 208 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/knowledge_base/llamas.md
  10. 985 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/knowledge_enhanced_workflow.ipynb
  11. 140 29
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/narrative_continuity_workflow.ipynb
  12. Binary
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/All About Llamas.pdf
  13. 19 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/knowledge_base_stats.json
  14. 44 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/knowledge_enhanced_narrative_transcripts.csv
  15. 172 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/knowledge_enhanced_narrative_transcripts.json
  16. 11 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/knowledge_enhanced_narrative_transcripts_clean.csv
  17. 10 10
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/narrative_context/slide_contexts.json
  18. 9 9
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/narrative_transcripts.csv
  19. 9 9
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/narrative_transcripts.json
  20. 9 9
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/narrative_transcripts_clean.csv
  21. Binary
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/slide-001.png
  22. Binary
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/slide-002.png
  23. Binary
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/slide-003.png
  24. Binary
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/slide-004.png
  25. Binary
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/slide-006.png
  26. Binary
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/slide-007.png
  27. 16 1
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/pyproject.toml
  28. 92 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/config/settings.py
  29. 2 2
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/core/__init__.py
  30. 282 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/core/groq_client.py
  31. 0 130
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/core/llama_client.py
  32. 29 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/knowledge/__init__.py
  33. 87 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/knowledge/context_manager.py
  34. 506 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/knowledge/faiss_knowledge.py
  35. 213 13
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/processors/unified_transcript_generator.py
  36. 194 0
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/utils/transcript_display.py
  37. 2165 136
      end-to-end-use-cases/powerpoint-to-voiceover-transcript/uv.lock

+ 3 - 3
end-to-end-use-cases/powerpoint-to-voiceover-transcript/.env.example

@@ -6,10 +6,10 @@
 # REQUIRED API CONFIGURATION
 # =============================================================================
 
-# Llama API Key (REQUIRED)
-# Get your API key from: https://www.llama-api.com/
+# GROQ API Key (REQUIRED)
 # This is essential for AI transcript generation
-LLAMA_API_KEY=your_llama_api_key_here
+# Required API Keys
+GROQ_API_KEY=your_groq_api_key_here
 
 # =============================================================================
 # SETUP INSTRUCTIONS

+ 4 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/.gitignore

@@ -185,3 +185,7 @@ credentials.json
 *.model
 *.bin
 *.safetensors
+.DS_Store
+**/.DS_Store
+
+*narrative_

+ 453 - 183
end-to-end-use-cases/powerpoint-to-voiceover-transcript/README.md

@@ -1,291 +1,561 @@
-# PowerPoint to Voiceover Transcript
+# PowerPoint to Knowledge-Enhanced Voiceover Transcript Generator
+
+> **AI-powered solution for converting PowerPoint presentations into professional, knowledge-enhanced voiceover transcripts using Groq's vision models**
 
-A Llama 4 powered solution that converts PowerPoint presentations into text-to-speech ready voiceover transcripts. Designed for creating professional narration content from slide decks.
 
 ## Overview
 
-This system extracts speaker notes and visual content from PowerPoint files, then uses the Llama 4 Maverick model to generate natural-sounding transcripts optimized for human voiceover or text-to-speech systems. The generated transcripts include proper pronunciation of technical terms, numbers, and model names.
+This system transforms PowerPoint presentations into natural-sounding voiceover transcripts optimized for human narration and text-to-speech systems. It combines AI-powered content analysis with domain-specific knowledge integration to produce professional-quality transcripts.
+
+### Key Capabilities
+
+- **Multi-Modal AI Processing**: Analyzes both visual slide content and speaker notes
+- **Knowledge Base Integration**: Enhances transcripts with domain-specific information
+- **Narrative Continuity**: Maintains smooth transitions and consistent terminology
+- **Speech Optimization**: Converts technical terms, numbers, and abbreviations to spoken form
+- **Flexible Processing Modes**: Standard, narrative-aware, and knowledge-enhanced options
+
+### Use Cases
 
-### Key Features
+- **Corporate Presentations**: Internal training, product demos, quarterly reviews
+- **Educational Content**: Course materials, conference talks, webinars
+- **Marketing Materials**: Product launches, sales presentations, customer demos
+- **Technical Documentation**: API walkthroughs, system architecture presentations
 
-- **AI-Powered Analysis**: Uses Llama 4 Maverick to understand slide content and context
-- **Unified Processing**: Single processor handles both standard and narrative-aware modes
-- **Narrative Continuity**: Optional context-aware processing maintains smooth transitions
-- **Speech Optimization**: Converts numbers, decimals, and technical terms to spoken form
-- **Visualization Tools**: Built-in utilities for displaying slide images in Jupyter notebooks
-- **Flexible Configuration**: Toggle between processing modes with simple flags
-- **Cross-Platform**: Works on Windows, macOS, and Linux
-- **Production Ready**: Comprehensive error handling, progress tracking, and retry logic
+## Features
+
+### Core Features
+- **AI-Powered Analysis**: Uses Groq's vision models for intelligent content understanding
+- **Knowledge Base Integration**: FAISS-powered semantic search through markdown knowledge files
+- **Narrative Continuity**: Context-aware processing with configurable sliding window
+- **Speech Optimization**: Automatic conversion of numbers, decimals, and technical terms
+- **Multi-Format Output**: CSV, JSON, and clean transcript exports
+- **Visualization Tools**: Built-in slide preview and analysis utilities
+
+### Advanced Features
+- **Unified Processing Pipeline**: Single processor handles all modes
+- **Graceful Degradation**: Continues processing even if components fail
+- **Performance Optimization**: In-memory vector storage with caching
+- **Cross-Platform Support**: Windows, macOS, and Linux compatibility
+- **Production Ready**: Comprehensive error handling and retry logic
 
 ## Quick Start
 
 ### Prerequisites
 
-- Python 3.12+
-- LibreOffice (for PPTX conversion)
-- Llama API key
+- **Python 3.12+**
+- **LibreOffice** (for PPTX conversion)
+- **Groq API Key**
 
 ### Installation
 
-#### Option 1: Using uv (Recommended - Faster)
-
-1. **Install uv (if not already installed):**
-   ```bash
-   # macOS/Linux
-   curl -LsSf https://astral.sh/uv/install.sh | sh
-
-   # Windows
-   powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
-
-   # Or via pip
-   pip install uv
-   ```
-
-2. **Clone and install dependencies:**
-   ```bash
-   git clone <repository-url>
-   cd powerpoint-to-voiceover-transcript
-   uv sync
-   ```
-
-3. **Activate the virtual environment:**
-   ```bash
-   source .venv/bin/activate  # macOS/Linux
-   # or
-   .venv\Scripts\activate     # Windows
-   ```
-
-#### Option 2: Using pip (Traditional)
-
-1. **Clone and install dependencies:**
-   ```bash
-   git clone https://github.com/meta-llama/llama-cookbook.git
-   cd powerpoint-to-voiceover-transcript
-   pip install -e .
-   ```
-
-2. **Install LibreOffice:**
-   - **macOS**: `brew install --cask libreoffice`
-   - **Ubuntu**: `sudo apt-get install libreoffice`
-   - **Windows**: Download from [libreoffice.org](https://www.libreoffice.org/download/)
-
-3. **Set up environment:**
-   ```bash
-   cp .env.example .env
-   # Edit .env and add your LLAMA_API_KEY
-   ```
-
-4. **Configure your presentation:**
-   ```bash
-   # Edit config.yaml - update the pptx_file path
-   current_project:
-     pptx_file: "input/your_presentation_name"
-     extension: ".pptx"
-   ```
+#### Option 1: Using uv (Recommended)
+
+```bash
+# Install uv if not already installed
+curl -LsSf https://astral.sh/uv/install.sh | sh
+
+# Clone and setup project
+git clone <repository-url>
+cd powerpoint-to-voiceover-transcript
+uv sync
+
+# Activate environment
+source .venv/bin/activate  # macOS/Linux
+# or .venv\Scripts\activate  # Windows
+```
+
+#### Option 2: Using pip
+
+```bash
+git clone <repository-url>
+cd powerpoint-to-voiceover-transcript
+pip install -e .
+```
+
+### Environment Setup
+
+```bash
+# Copy environment template
+cp .env.example .env
+
+# Edit .env and add your API key
+echo "GROQ_API_KEY=your_api_key_here" >> .env
+
+# Configure your presentation in config.yaml
+# Update the pptx_file path to your presentation
+```
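+
+To confirm the key is visible to Python before running anything heavy, a minimal check (mirroring what the workflow notebooks do with `python-dotenv`) looks like this:
+
+```python
+import os
+
+from dotenv import load_dotenv
+
+load_dotenv()  # reads .env from the project root
+if not os.getenv("GROQ_API_KEY"):
+    raise RuntimeError("GROQ_API_KEY is missing - add it to .env before running the workflows")
+```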
 
 ### Basic Usage
 
-#### Narrative Continuity Workflow
-For presentations requiring smooth narrative flow and consistent terminology:
+#### Using Jupyter Notebooks (Recommended)
 ```bash
+# Standard workflow
 jupyter notebook narrative_continuity_workflow.ipynb
-```
 
-This workflow uses previous slide transcripts as context to maintain narrative continuity and ensure smooth transitions between slides. Features include:
-- **Context-aware processing**: Uses 5 previous slides as context by default
-- **Consistent terminology**: Maintains terminology consistency throughout the presentation
-- **Smooth transitions**: Generates natural flow between slides
-- **Enhanced output**: Includes narrative context analysis and relationship mapping
+# Knowledge-enhanced workflow
+jupyter notebook knowledge_enhanced_workflow.ipynb
+```
 
-Or use the Python API:
+#### Standard Processing
 ```python
 from src.core.pptx_processor import pptx_to_images_and_notes
 from src.processors.unified_transcript_generator import UnifiedTranscriptProcessor
 
-# Convert PPTX and extract notes
+# Extract content from PowerPoint
 result = pptx_to_images_and_notes("presentation.pptx", "output/")
 
 # Generate transcripts
-processor = UnifiedTranscriptProcessor()
+processor = UnifiedTranscriptProcessor(use_narrative=False)
 transcripts = processor.process_slides_dataframe(result['notes_df'], "output/")
 
 # Save results
 transcripts.to_csv("transcripts.csv", index=False)
 ```
 
-## Project Structure
+#### Knowledge-Enhanced Narrative Processing
+```python
+# Enable both narrative continuity and knowledge integration
+processor = UnifiedTranscriptProcessor(
+    use_narrative=True,
+    context_window_size=5,
+    enable_knowledge=True
+)
+
+transcripts = processor.process_slides_dataframe(
+    result['notes_df'],
+    "output/",
+    save_context=True
+)
+```
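+
+The returned DataFrame's exact column layout is whatever `process_slides_dataframe` produces, so the export below is only a sketch; the file names follow the sample outputs included in this commit.
+
+```python
+# Persist results alongside the sample outputs in output/
+transcripts.to_csv("output/knowledge_enhanced_narrative_transcripts.csv", index=False)
+transcripts.to_json(
+    "output/knowledge_enhanced_narrative_transcripts.json", orient="records", indent=2
+)
+```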
+
+
+## System Architecture
+
+### Core Processing Pipeline
+
+The system follows a modular 3-stage pipeline:
+
+```
+┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
+│   PowerPoint    │───▶│     Content     │───▶│    Knowledge    │
+│      File       │    │    Extraction   │    │    Retrieval    │
+└─────────────────┘    └─────────────────┘    └─────────────────┘
+                                                        │
+┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
+│     Output      │◀───│    Transcript   │◀───│  LLM Processing │
+│      Files      │    │    Generation   │    │  Vision & Text  │
+└─────────────────┘    └─────────────────┘    └─────────────────┘
+```
+
+#### Stage 1: Content Extraction (`pptx_processor.py`)
+- Extracts speaker notes and slide text using `python-pptx`
+- Converts PPTX → PDF → Images via LibreOffice and PyMuPDF
+- Generates structured DataFrame with slide metadata
+- Supports configurable DPI (default: 200) and formats (PNG/JPEG)
+
+#### Stage 2: Knowledge Retrieval (`faiss_knowledge.py`)
+- Loads and chunks markdown files from knowledge base
+- Generates embeddings using sentence-transformers
+- Performs semantic search for relevant knowledge chunks
+- Integrates knowledge with slide content and speaker notes
+
+#### Stage 3: AI Processing (`groq_client.py` + `unified_transcript_generator.py`)
+- Integrates with Groq's vision models via `groq` client
+- Base64 encodes images for vision model processing
+- Applies narrative continuity with sliding context window
+- Handles API retries and comprehensive error management
+
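+The three stages map directly onto the public API shown in Basic Usage above; condensed into one sketch (file paths are illustrative):
+
+```python
+from src.core.pptx_processor import pptx_to_images_and_notes
+from src.processors.unified_transcript_generator import UnifiedTranscriptProcessor
+
+# Stage 1: extract speaker notes and render slide images
+result = pptx_to_images_and_notes("input/All About Llamas.pptx", "output/")
+
+# Stages 2-3: knowledge retrieval + vision/LLM transcript generation
+processor = UnifiedTranscriptProcessor(
+    use_narrative=True, context_window_size=5, enable_knowledge=True
+)
+transcripts = processor.process_slides_dataframe(
+    result["notes_df"], "output/", save_context=True
+)
+```
+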
+### Project Structure
 
 ```
 powerpoint-to-voiceover-transcript/
-├── README.md                          # This file
-├── config.yaml                        # Main configuration
-├── pyproject.toml                     # Dependencies and project metadata
-├── uv.lock                            # uv dependency lock file
-├── narrative_continuity_workflow.ipynb # Narrative-aware workflow
+├── README.md                          # This comprehensive guide
+├── config.yaml                       # Main configuration
+├── pyproject.toml                     # Dependencies and metadata
 ├── .env.example                       # Environment template
-├── input/                             # Place your PPTX files here
-├── output/                            # Generated images and transcripts
+├── knowledge_enhanced_workflow.ipynb            # Advanced workflow
+├── narrative_continuity_workflow.ipynb          # Standard workflow
+├── input/                             # PowerPoint files
+├── output/                            # Generated content
+├── knowledge_base/                    # Domain knowledge files
 └── src/
     ├── config/
     │   └── settings.py                # Configuration management
     ├── core/
     │   ├── file_utils.py              # File system utilities
     │   ├── image_processing.py        # Image encoding for API
-    │   ├── llama_client.py            # Llama API integration
+    │   ├── groq_client.py             # Groq API integration
     │   └── pptx_processor.py          # PPTX extraction and conversion
+    ├── knowledge/
+    │   ├── faiss_knowledge.py         # FAISS knowledge base management
+    │   └── context_manager.py         # Context integration
     ├── processors/
-    │   └── unified_transcript_generator.py # Unified processor (standard + narrative)
+    │   └── unified_transcript_generator.py  # Main processing engine
     └── utils/
-        └── visualization.py           # Slide image display utilities
+        ├── transcript_display.py      # Transcript display helpers
+        └── visualization.py           # Slide display utilities
+```
+
+
+### Setup Guide
+
+#### 1. Enable Knowledge Base
+
+Edit `config.yaml`:
+```yaml
+knowledge:
+  enabled: true
+  knowledge_base_dir: "knowledge_base"
+```
+
+#### 2. Create Knowledge Base Structure
+
+```bash
+mkdir knowledge_base
+cd knowledge_base
+
+# Create domain-specific files
+touch company_overview.md
+touch technical_glossary.md
+touch product_specifications.md
+touch presentation_guidelines.md
+```
+#### 3. Add Knowledge Base Content
+For the purposes of the cookbook, we're using local markdown files as the knowledge base. You can use any format you prefer, as long as it can be loaded and processed by the system.
+
+### Processing Workflow
+
+1. **Content Analysis**: System analyzes slide content and speaker notes
+2. **Semantic Search**: Finds relevant knowledge chunks using embedding similarity
+3. **Context Building**: Combines knowledge with narrative context (if enabled)
+4. **Prompt Enhancement**: Integrates context into system prompt or user message
+5. **Transcript Generation**: AI generates enhanced transcript with domain knowledge
+
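+Steps 2-3 boil down to embedding the slide text and ranking knowledge chunks by similarity. The snippet below is an illustrative sketch of that idea using the configured model and defaults (`all-MiniLM-L6-v2`, `top_k: 5`, `similarity_threshold: 0.3`); it is not the project's `FAISSKnowledgeManager`, just the underlying technique:
+
+```python
+import faiss
+import numpy as np
+from sentence_transformers import SentenceTransformer
+
+model = SentenceTransformer("all-MiniLM-L6-v2")
+chunks = ["Llamas are pseudoruminants ...", "Llama fiber is lanolin-free ..."]  # markdown chunks
+
+# Build a flat inner-product index over normalized embeddings (cosine similarity)
+embeddings = model.encode(chunks, normalize_embeddings=True)
+index = faiss.IndexFlatIP(embeddings.shape[1])
+index.add(np.asarray(embeddings, dtype="float32"))
+
+# Retrieve the chunks most relevant to a slide's text + speaker notes
+query = model.encode(["slide text and speaker notes"], normalize_embeddings=True)
+scores, ids = index.search(np.asarray(query, dtype="float32"), k=5)
+relevant = [chunks[i] for i, s in zip(ids[0], scores[0]) if i != -1 and s >= 0.3]
+```
+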
+### Configuration Options
+
+```yaml
+knowledge:
+  # Core settings
+  enabled: true
+  knowledge_base_dir: "knowledge_base"
+
+  # Embedding model configuration
+  embedding:
+    model_name: "all-MiniLM-L6-v2"  # Lightweight, fast model
+    device: "cpu"                   # Use "cuda" if GPU available
+    batch_size: 32
+    max_seq_length: 512
+
+  # Search parameters
+  search:
+    top_k: 5                        # Number of chunks to retrieve
+    similarity_threshold: 0.3       # Minimum similarity score (0.0-1.0)
+    enable_keyword_fallback: true   # Fallback to keyword search
+    max_chunk_size: 1000           # Maximum characters per chunk
+    chunk_overlap: 200             # Overlap between chunks
+
+  # Context integration
+  context:
+    strategy: "combined"            # "knowledge_only", "narrative_priority", "combined"
+    max_context_length: 8000       # Maximum total context length
+    knowledge_weight: 0.3          # Knowledge influence (0.0-1.0)
+    integration_method: "system_prompt"  # "system_prompt" or "user_message"
+
+  # Performance optimization
+  performance:
+    enable_caching: true           # Cache embeddings and search results
+    cache_dir: "cache/knowledge"   # Cache directory
+    cache_expiry_hours: 24         # Cache expiration (0 = never)
+    max_memory_mb: 512             # Maximum memory for embeddings
+    lazy_loading: true             # Load embeddings on demand
+
+  # Reliability settings
+  fallback:
+    graceful_degradation: true     # Continue if knowledge base fails
+    use_keyword_fallback: true     # Use keyword matching as fallback
+    log_errors_only: true          # Log errors but don't fail process
+```
+
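+These values are read at runtime through `src/config/settings.py`; a small sketch using the same accessors the workflow notebooks import:
+
+```python
+from src.config.settings import load_config, is_knowledge_enabled
+
+config = load_config()
+if is_knowledge_enabled():
+    search_cfg = config.get("knowledge", {}).get("search", {})
+    print(
+        f"Retrieving up to {search_cfg.get('top_k', 5)} chunks "
+        f"at similarity >= {search_cfg.get('similarity_threshold', 0.3)}"
+    )
+```
+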
+### Integration Strategies
+
+#### Knowledge Only
+```yaml
+context:
+  strategy: "knowledge_only"
 ```
+**Best for**: Technical documentation, product specifications, reference materials
+
+#### Narrative Priority
+```yaml
+context:
+  strategy: "narrative_priority"
+  knowledge_weight: 0.2
+```
+**Best for**: Storytelling presentations, educational sequences, marketing narratives
+
+#### Combined (Recommended)
+```yaml
+context:
+  strategy: "combined"
+  knowledge_weight: 0.3
+```
+**Best for**: Most presentations, mixed content types, general use cases
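+
+To make the `knowledge_weight` and `max_context_length` knobs concrete, here is a purely hypothetical helper (the real logic lives in `src/knowledge/context_manager.py` and may differ) that budgets the prompt context between narrative and knowledge text:
+
+```python
+def build_context(narrative: str, knowledge: str, strategy: str = "combined",
+                  knowledge_weight: float = 0.3, max_context_length: int = 8000) -> str:
+    """Illustrative only: split the context budget between narrative and knowledge."""
+    if strategy == "knowledge_only":
+        return knowledge[:max_context_length]
+    if strategy == "narrative_priority":
+        knowledge_weight = min(knowledge_weight, 0.2)  # keep narrative dominant
+    knowledge_budget = int(max_context_length * knowledge_weight)
+    narrative_budget = max_context_length - knowledge_budget
+    parts = [narrative[:narrative_budget], knowledge[:knowledge_budget]]
+    return "\n\n".join(p for p in parts if p)
+```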
 
 ## Configuration
 
-The system uses `config.yaml` for settings:
+### Main Configuration File (`config.yaml`)
 
 ```yaml
 # API Configuration
 api:
   llama_model: "Llama-4-Maverick-17B-128E-Instruct-FP8"
   max_retries: 3
+  retry_delay: 1
+  rate_limit_delay: 1
 
-# Processing Settings
+# Processing Configuration
 processing:
   default_dpi: 200
+  supported_formats: ["png", "jpeg", "jpg"]
   default_format: "png"
   batch_size: 5
 
-# Your Project
+# File Paths
+paths:
+  default_output_dir: "slide_images"
+  cache_dir: "cache"
+  logs_dir: "logs"
+  temp_dir: "temp"
+
+# Current Project Settings
 current_project:
   pptx_file: "input/your_presentation"
   extension: ".pptx"
   output_dir: "output/"
+
+# Knowledge Base Configuration (see Knowledge Base section for details)
+knowledge:
+  enabled: true
+  knowledge_base_dir: "knowledge_base"
+  # ... additional knowledge settings
+
+# Logging Configuration
+logging:
+  level: "INFO"
+  format: "%(asctime)s - %(levelname)s - %(message)s"
+  file_enabled: true
+  console_enabled: true
 ```
 
-## API Reference
+### Environment Variables (`.env`)
+
+```bash
+# Required
+GROQ_API_KEY=your_groq_api_key_here
 
-### Core Functions
+# Optional
+LOG_LEVEL=INFO
+CACHE_ENABLED=true
+```
 
-#### `pptx_to_images_and_notes(pptx_path, output_dir)`
-Converts PowerPoint to images and extracts speaker notes.
+## Processing Modes
+
+#### Standard Mode
+```python
+processor = UnifiedTranscriptProcessor(
+    use_narrative=False,
+    enable_knowledge=False
+)
+```
+- **Use when**: Simple presentations, time-sensitive processing
+- **Benefits**: Fastest processing, no dependencies
+- **Limitations**: No context awareness, basic quality
 
-**Returns:** Dictionary with `image_files`, `notes_df`, and `output_dir`
+#### Knowledge-Enhanced Mode
+```python
+processor = UnifiedTranscriptProcessor(
+    use_narrative=False,
+    enable_knowledge=True
+)
+```
+- **Use when**: Technical presentations requiring domain expertise
+- **Benefits**: Enhanced accuracy, domain-specific terminology
+- **Limitations**: No narrative flow between slides
 
-#### `UnifiedTranscriptProcessor(use_narrative=True, context_window_size=5)`
-Main class for generating AI transcripts with configurable processing modes.
+#### Narrative Mode
+```python
+processor = UnifiedTranscriptProcessor(
+    use_narrative=True,
+    context_window_size=5,
+    enable_knowledge=False
+)
+```
+- **Use when**: Educational content, storytelling presentations
+- **Benefits**: Smooth transitions, consistent terminology
+- **Limitations**: No external knowledge integration
 
-**Parameters:**
-- `use_narrative` (bool): Enable narrative continuity mode (default: True)
-- `context_window_size` (int): Number of previous slides to use as context (default: 5)
+#### Full Enhancement Mode (Recommended)
+```python
+processor = UnifiedTranscriptProcessor(
+    use_narrative=True,
+    context_window_size=5,
+    enable_knowledge=True
+)
+```
+- **Use when**: Professional presentations requiring highest quality
+- **Benefits**: Maximum quality, context awareness, domain expertise
+- **Limitations**: Slower processing, requires knowledge base setup
 
-**Methods:**
-- `process_slides_dataframe(df, output_dir, save_context=True)` - Process all slides
-- `process_single_slide(image_path, speaker_notes, slide_number, slide_title)` - Process one slide
+## Deployment
 
-### Processing Modes
+### Development Environment
 
-#### Standard Mode (`use_narrative=False`)
-- **Best for**: Simple presentations, quick processing, independent slides
-- **Features**: Fast execution, no context dependencies
-- **Use cases**: Training materials, product demos, standalone slides
+```bash
+# Clone repository
+git clone <repository-url>
+cd powerpoint-to-voiceover-transcript
 
-#### Narrative Mode (`use_narrative=True`)
-- **Best for**: Story-driven presentations, complex topics, educational content
-- **Features**: Context awareness, smooth transitions, terminology consistency
-- **Use cases**: Conference talks, educational courses, marketing presentations
+# Setup with uv (recommended)
+uv sync
+source .venv/bin/activate
 
-### Visualization Utilities
+# Or setup with pip
+pip install -e .
 
-#### `display_slide_grid(image_files, max_cols=3, figsize_per_image=(4, 3))`
-Display slide images in a grid layout for Jupyter notebooks.
+# Install system dependencies
+# macOS: brew install --cask libreoffice
+# Ubuntu: sudo apt-get install libreoffice
+# Windows: Download from libreoffice.org
+```
 
-**Parameters:**
-- `image_files` (List): List of image file paths
-- `max_cols` (int): Maximum columns in grid (default: 3)
-- `figsize_per_image` (Tuple): Size of each image as (width, height) (default: (4, 3))
+### Performance Optimization
 
-**Example:**
-```python
-from src.utils.visualization import display_slide_grid, display_slide_preview
+#### Memory Management
+```yaml
+knowledge:
+  performance:
+    max_memory_mb: 1024        # Adjust based on available RAM
+    lazy_loading: true         # Load embeddings on demand
+    enable_caching: true       # Cache for repeated processing
+```
 
-# Display first 6 slides in a 3-column grid
-display_slide_grid(image_files[:6], max_cols=3, figsize_per_image=(4, 3))
+#### Processing Optimization
+```yaml
+processing:
+  batch_size: 10             # Process slides in batches
+  default_dpi: 150           # Lower DPI for faster processing
 
-# Or use the convenience function
-display_slide_preview(image_files, num_slides=6, max_cols=3)
+api:
+  max_retries: 5             # Increase retries for production
+  retry_delay: 2             # Longer delays for stability
 ```
 
-#### `display_slide_preview(image_files, num_slides=6, max_cols=3, figsize_per_image=(4, 3))`
-Display a preview of the first N slide images with automatic grid layout.
+## Troubleshooting
 
+### Common Issues and Solutions
 
-### Speech Optimization
+#### Installation Issues
 
-The AI automatically converts technical content for natural speech:
+**"LibreOffice not found"**
+```bash
+# macOS
+brew install --cask libreoffice
 
-- **Decimals**: `3.2` → "three dot two"
-- **Model names**: `LLaMA-3.2` → "LLaMA three dot two"
-- **Abbreviations**: `LLM` → "L L M"
-- **Large numbers**: `70B` → "seventy billion"
+# Ubuntu/Debian
+sudo apt-get install libreoffice
 
-Add your own rules in the system prompt.
+# Windows
+# Download from https://www.libreoffice.org/download/
+```
 
-## Requirements
+**"uv sync fails"**
+```bash
+# Ensure Python 3.12+ is available
+uv python install 3.12
+uv sync --python 3.12
+```
 
-### System Dependencies
-- **LibreOffice**: Required for PPTX to PDF conversion
-- **Python 3.12+**: Core runtime
+**"sentence_transformers not found"**
+```bash
+# Install with uv
+uv add sentence-transformers
 
-### Python Dependencies
-- `pandas>=2.3.1` - Data processing
-- `python-pptx>=1.0.2` - PowerPoint file handling
-- `pymupdf>=1.24.0` - PDF to image conversion
-- `llama-api-client>=0.1.0` - AI model access
-- `pillow>=11.3.0` - Image processing
-- `pyyaml>=6.0.0` - Configuration management
-- `matplotlib>=3.5.0` - Visualization utilities
+# Or with pip
+pip install sentence-transformers
 
-See `pyproject.toml` for complete dependency list.
+# Restart Jupyter kernel after installation
+```
 
-## Output
+#### Runtime Issues
 
-### Narrative Continuity Workflow Output
-Enhanced output includes:
+**"API key not found"**
+```bash
+# Check .env file exists and contains key
+cat .env | grep GROQ_API_KEY
 
-1. **Narrative-Aware Transcripts**: Context-aware voiceover content with smooth transitions
-2. **Context Analysis**: Information about how previous slides influenced each transcript
-3. **Narrative Summary**: Overall analysis of presentation flow and consistency
-4. **Multiple Formats**: CSV, JSON exports with context information
-5. **Context Files**: Detailed narrative context data for each slide
-6. **Visual Preview**: Grid display of slide images for verification
+# Or set environment variable directly
+export GROQ_API_KEY=your_key_here
+```
 
-## Troubleshooting
+**"Permission denied on output directory"**
+```bash
+# Ensure write permissions
+chmod 755 output/
+mkdir -p output/
+```
 
-### Common Issues
+**"Knowledge base not loading"**
+```bash
+# Check directory exists and contains .md files
+ls -la knowledge_base/
+ls knowledge_base/*.md
 
-**"LibreOffice not found"**
-- Install LibreOffice or update paths in `config.yaml`
+# Verify configuration
+grep -A 5 "knowledge:" config.yaml
+```
 
-**"API key not found"**
-- Set `LLAMA_API_KEY` in your `.env` file
+#### Performance Issues
+
+**"Processing too slow"**
+```yaml
+# Reduce context window size
+context_window_size: 3
 
-**"Permission denied"**
-- Ensure write permissions to output directories
+# Lower image quality
+processing:
+  default_dpi: 150
 
-**"Invalid image format"**
-- Use supported formats: `png`, `jpeg`, `jpg`
+# Disable knowledge base temporarily
+knowledge:
+  enabled: false
+```
 
-**"uv sync fails"**
-- Make sure you have Python 3.12+ installed
-- Try `uv python install 3.12` to install Python via uv
+**"Memory usage too high"**
+```yaml
+knowledge:
+  performance:
+    max_memory_mb: 256
+    lazy_loading: true
+  search:
+    top_k: 3
+    max_chunk_size: 500
+```
 
-**"Context window too large"**
-- Reduce `context_window_size` parameter in narrative workflow
-- Default is 5 slides, try 3 for shorter presentations
+#### Quality Issues
 
-**"Images not displaying in notebook"**
-- Ensure matplotlib is installed: `pip install matplotlib`
-- Check that image files exist in the output directory
-- Try restarting the Jupyter kernel
+**"Poor transcript quality"**
+```yaml
+# Increase knowledge retrieval
+knowledge:
+  search:
+    top_k: 7
+    similarity_threshold: 0.2
+
+# Increase context window
+context_window_size: 7
+```
 
----
+**"Inconsistent terminology"**
+- Ensure narrative mode is enabled: `use_narrative=True`
+- Add domain-specific terms to knowledge base
+- Increase knowledge weight: `knowledge_weight: 0.4`

+ 63 - 2
end-to-end-use-cases/powerpoint-to-voiceover-transcript/config.yaml

@@ -2,7 +2,7 @@
 
 # API Configuration
 api:
-  llama_model: "Llama-4-Maverick-17B-128E-Instruct-FP8" # This notebook uses Llama API to access the model
+  groq_model: "meta-llama/llama-4-maverick-17b-128e-instruct"
   max_retries: 3
   retry_delay: 1
   rate_limit_delay: 1
@@ -16,7 +16,7 @@ processing:
 
 # File Paths
 paths:
-  default_output_dir: "slide_images"
+  default_output_dir: "output/"
   cache_dir: "cache"
   logs_dir: "logs"
   temp_dir: "temp"
@@ -60,6 +60,67 @@ image_quality:
   jpeg_optimize: true
   png_compression: 6
 
+# Knowledge Base Configuration
+knowledge:
+  # Enable/disable knowledge base integration
+  enabled: true  # Set to true to enable knowledge base features
+
+  # Knowledge base directory path (relative to project root)
+  knowledge_base_dir: "knowledge_base"
+
+  # Vector store configuration (FAISS)
+  vector_store:
+    type: "faiss"                  # Vector database type
+    index_type: "flat"             # "flat", "ivf", "hnsw"
+    use_gpu: false                 # Enable GPU acceleration (requires faiss-gpu)
+    cache_enabled: true            # Enable persistent caching
+    rebuild_on_changes: true       # Auto-rebuild when files change
+
+  # Embedding model configuration
+  embedding:
+    model_name: "all-MiniLM-L6-v2" # Lightweight, fast model
+    device: "cpu"                  # Use "cuda" if GPU available
+    batch_size: 32
+    max_seq_length: 512
+
+  # Search configuration
+  search:
+    top_k: 5                      # Number of knowledge chunks to retrieve
+    similarity_threshold: 0.3     # Minimum similarity threshold (0.0 to 1.0)
+    enable_keyword_fallback: true # Enable fallback keyword search if similarity search fails
+    max_chunk_size: 1000          # Maximum characters per knowledge chunk
+    chunk_overlap: 200            # Overlap between chunks (characters)
+
+  # Context integration settings
+  context:
+    # Strategy for combining knowledge with narrative context
+    strategy: "combined"          # Options: "knowledge_only", "narrative_priority", "combined"
+    max_context_length: 8000      # Maximum total context length (characters)
+    knowledge_weight: 0.3         # Knowledge context weight (0.0 to 1.0, higher = more knowledge influence)
+    integration_method: "system_prompt" # Integration method: "system_prompt" or "user_message"
+
+  # Performance and reliability settings
+  performance:
+    # Enable caching of embeddings and search results
+    enable_caching: true
+    # Cache directory (relative to project root)
+    cache_dir: "cache/knowledge"
+    # Cache expiration time in hours (0 = never expire)
+    cache_expiry_hours: 24
+    # Maximum memory usage for embeddings (MB)
+    max_memory_mb: 512
+    # Enable lazy loading of embeddings
+    lazy_loading: true
+
+  # Fallback options for reliability
+  fallback:
+    # Continue processing if knowledge base fails to load
+    graceful_degradation: true
+    # Use simple keyword matching if embedding model fails
+    use_keyword_fallback: true
+    # Log errors but don't fail the entire process
+    log_errors_only: true
+
 # Example System Prompt - Replace with your own, although this one is pretty good.
 system_prompt: |
   You are a speech-aware GenAI expert who specializes in generating natural-sounding transcripts for human narration and text-to-speech systems.

Binary
end-to-end-use-cases/powerpoint-to-voiceover-transcript/knowledge_base/.faiss_cache/chunks.pkl


Binary
end-to-end-use-cases/powerpoint-to-voiceover-transcript/knowledge_base/.faiss_cache/faiss.index


+ 13 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/knowledge_base/.faiss_cache/metadata.json

@@ -0,0 +1,13 @@
+{
+  "knowledge_base_hash": "85da6d7febd5f515bfe2bc45a3fa76c1",
+  "index_type": "flat",
+  "embedding_model": "all-MiniLM-L6-v2",
+  "total_chunks": 19,
+  "created_at": "2025-08-08T12:24:57.718282",
+  "stats": {
+    "total_searches": 0,
+    "cache_hits": 0,
+    "index_builds": 1,
+    "last_updated": "2025-08-08T12:24:57.695935"
+  }
+}

+ 145 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/knowledge_base/llama diet.md

@@ -0,0 +1,145 @@
+# 🦙 Llama Diet and Digestion
+Llamas (Lama glama), as members of the camelid family, are highly adapted herbivores with efficient digestive systems that enable them to survive in nutrient-sparse environments such as the high Andes. Their diet is composed primarily of fibrous plant material, and their specialized digestive system is evolved to extract maximum nutrients from minimal input.
+
+## 🌱 Natural Diet
+In their native Andean habitat, llamas are browsers and grazers that consume a diverse range of plant matter. Their diet varies depending on altitude, season, and availability.
+
+Primary Components:
+Grasses: The bulk of their diet, especially puna grasses in the Andes.
+
+Shrubs and herbs: Including low-lying woody plants and succulents.
+
+Forbs: Broad-leaved herbaceous plants.
+
+Lichens and mosses: Occasionally consumed in high-altitude regions.
+
+In Captivity or Managed Settings:
+Grass hay: The main source of fiber (e.g., timothy, orchard grass, brome).
+
+Legume hay (e.g., alfalfa): High in protein but given in moderation due to excess calcium and energy.
+
+Grain concentrates: Provided sparingly, typically during pregnancy, lactation, or recovery from illness.
+
+Fresh pasture: When available, llamas graze like sheep or goats.
+
+Supplements: Mineral blocks or loose minerals to support trace element intake (especially selenium, copper, and zinc in deficient regions).
+
+Llamas are efficient foragers, able to sustain themselves on marginal land where other livestock might fail. They prefer to browse selectively and will often avoid contaminated or spoiled feed.
+
+## 🧠 Feeding Behavior
+Llamas exhibit intelligent and selective feeding behavior:
+
+Diurnal feeding: Most active during early morning and late afternoon.
+
+Slow, deliberate grazers: They nibble plants rather than tearing or uprooting them.
+
+Low water requirement: Llamas can go long periods without drinking, deriving moisture from forage.
+
+They also have a strong memory for feeding areas and can adapt quickly to rotational grazing practices, which makes them relatively low-maintenance compared to cattle or horses.
+
+## 🧪 Digestive System
+Llamas are pseudoruminants, meaning they have a three-compartment stomach (as opposed to the four compartments found in true ruminants like cows). These compartments are:
+
+C1 (Compartment 1):
+
+Analogous to the rumen.
+
+The largest compartment.
+
+Hosts a diverse microbial population (bacteria, protozoa, fungi) that ferments fibrous plant material.
+
+Responsible for volatile fatty acid (VFA) production, a major energy source.
+
+C2 (Compartment 2):
+
+Functions in further fermentation and nutrient absorption.
+
+Works closely with C1 to maximize microbial breakdown of cellulose.
+
+C3 (Compartment 3):
+
+Equivalent to the abomasum in true ruminants (the “true stomach”).
+
+Secretes digestive enzymes (HCl, pepsin) for acidic digestion of microbial proteins and residual carbohydrates.
+
+The distal end of C3 is highly acidic and prone to ulcers if under stress or dietary imbalance.
+
+Key Features:
+Regurgitation and remastication: Like ruminants, llamas chew cud to further break down fibrous material.
+
+Microbial symbiosis: Microbes digest cellulose and hemicellulose into VFAs like acetate, propionate, and butyrate.
+
+Long retention time: Slow digestion allows for high fiber digestibility (>50% in some cases).
+
+Efficient nitrogen recycling: Llamas are able to conserve nitrogen via urea recycling into the digestive tract.
+
+## 💩 Waste Output and Nutrient Cycling
+Llamas produce small, dry fecal pellets that are:
+
+Low in moisture
+
+Rich in partially digested fiber
+
+Valuable as fertilizer ("llama beans") due to slow nitrogen release and low odor
+
+They often defecate in communal dung piles, a behavior that:
+
+Minimizes parasite transmission
+
+Helps with territory marking
+
+Makes pasture cleanup easier
+
+## ⚠️ Dietary Issues and Management
+While llamas are hardy, their diet must be managed to avoid health issues:
+
+Common Problems:
+| Issue | Cause | Prevention |
+| --- | --- | --- |
+| Obesity | Overfeeding grain, lush pasture | Monitor body condition, restrict energy-dense feeds |
+| Protein Deficiency | Low-quality forage | Supplement with legume hay or protein concentrates |
+| Mineral Deficiency | Selenium, copper, or zinc lack | Provide species-specific mineral supplements |
+| Ulcers in C3 | Stress, abrupt dietary change | Ensure consistent feeding, reduce stress, avoid NSAIDs |
+| Bloat (rare) | Excess legumes or lush pasture | Limit high-risk feeds, encourage slow transition |
+
+Special Diets:
+Pregnant/lactating females: Require higher protein and energy levels.
+
+Working llamas: May need additional energy (carbohydrates) for endurance.
+
+Older llamas: Benefit from easy-to-chew forage and soaked hay cubes.
+
+## 🧮 Nutritional Requirements
+Approximate daily requirements (adult llama, maintenance level):
+
+| Nutrient | Requirement |
+| --- | --- |
+| Dry matter intake | 1.8–2.5% of body weight |
+| Crude protein | 8–10% (12–16% for growth or lactation) |
+| Calcium:Phosphorus ratio | 1.5–2:1 |
+| Salt | 0.25–0.5 oz/day |
+| Water | 2–5 gallons/day (varies by temperature and diet) |
+
+These values should be adjusted for workload, age, reproductive status, and environment.
+
+## 🧬 Evolutionary Adaptations
+Llamas evolved in high-altitude, arid environments, leading to:
+
+Low metabolic requirements
+
+High fiber digestion efficiency
+
+Ability to thrive on poor forage
+
+Adaptation to wide dietary variability
+
+These traits make llamas an eco-efficient livestock option in areas where conventional livestock may not be viable.
+
+## 📚 References
+Fowler, M.E. (1998). Medicine and Surgery of South American Camelids.
+
+Van Saun, R.J. (2006). Nutritional Requirements and Feeding of Llamas and Alpacas.
+
+NRC (2007). Nutrient Requirements of Small Ruminants.
+
+Camelid Nutrition Council: www.camelidnutrition.org
+
+Oregon State University Extension Service

+ 208 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/knowledge_base/llamas.md

@@ -0,0 +1,208 @@
+# Llamas (*Lama glama*)
+
+Llamas are domesticated South American camelids, widely used as meat and pack animals by Andean cultures since the Pre-Columbian era. They are members of the biological family Camelidae, which includes camels, alpacas, guanacos, and vicuñas. Llamas are closely related to alpacas but are larger and typically used for different purposes.
+
+---
+
+## Etymology
+
+The name "llama" (pronounced \[ˈja.ma] in Spanish) is derived from the native Quechua word "lama" or "llama". The word was adopted by Spanish settlers and integrated into English and other languages.
+
+---
+
+## Taxonomy
+
+* **Kingdom**: Animalia
+* **Phylum**: Chordata
+* **Class**: Mammalia
+* **Order**: Artiodactyla
+* **Family**: Camelidae
+* **Genus**: *Lama*
+* **Species**: *Lama glama*
+
+Llamas are part of the genus *Lama*, which also includes alpacas (*Lama pacos*), guanacos (*Lama guanicoe*), and the extinct species *Lama owenii*. They are believed to be domesticated from wild guanacos around 4,000 to 5,000 years ago in the Andean highlands.
+
+---
+
+## Physical Description
+
+| Feature  | Description                                                                                      |
+| -------- | ------------------------------------------------------------------------------------------------ |
+| Height   | 1.7 to 1.8 meters (5.5 to 6 ft) at the head                                                      |
+| Weight   | 130 to 200 kg (290 to 440 lb)                                                                    |
+| Coat     | Long, soft wool in various natural colors: white, black, brown, gray, and patterned combinations |
+| Lifespan | 15 to 25 years                                                                                   |
+| Ears     | Long and banana-shaped, curving inward                                                           |
+| Feet     | Two-toed with soft pads for gripping rocky terrain                                               |
+
+---
+
+## Distribution and Habitat
+
+Llamas are native to the **Andes Mountains** in South America. Today, they are primarily found in:
+
+* **Peru**
+* **Bolivia**
+* **Ecuador**
+* **Chile**
+* **Argentina**
+
+Due to their adaptability, llamas have also been introduced to North America, Europe, Australia, and New Zealand. They are well-suited to high-altitude environments but can thrive in various climates with appropriate care.
+
+---
+
+## Domestication and Historical Use
+
+Llamas were domesticated in the Andes around 4,000–5,000 years ago. They played a central role in the development of early Andean civilizations, including the Inca Empire. Uses included:
+
+* **Transport**: Llamas were the primary pack animals, capable of carrying 25–30% of their body weight.
+* **Fiber**: Their wool was woven into textiles, a crucial cultural and economic component of Andean society.
+* **Meat and leather**: Used for sustenance and clothing.
+* **Manure**: Used as fertilizer and fuel in high-altitude regions.
+
+Llama caravans were a vital part of the Incan road and trade networks.
+
+---
+
+## Behavior and Social Structure
+
+Llamas are social herd animals and form strong social bonds. They are generally gentle, curious, and intelligent. Key behavioral traits:
+
+* **Herd hierarchy**: Llamas live in hierarchical groups with an alpha male.
+* **Spitting**: Used primarily as a social signal among llamas, especially to establish dominance. Rarely directed at humans.
+* **Communication**: Includes humming, clucking, and alarm calls.
+* **Training**: Easily trainable; can learn simple commands and are used in therapy, trekking, and shows.
+
+---
+
+## Diet and Digestion
+
+Llamas are herbivorous grazers. Their natural diet includes:
+
+* Grasses
+* Shrubs
+* Lichens and mosses (in mountainous areas)
+* Hay and supplemental grains (in captivity)
+
+They have a **three-compartment stomach** that allows efficient digestion of roughage and fibrous plants. They chew cud like cattle.
+
+---
+
+## Reproduction and Lifespan
+
+* **Mating system**: Induced ovulators; breeding is polygynous in the wild.
+* **Gestation**: Approximately 11.5 months
+* **Offspring**: One baby, called a **cria**
+* **Weaning**: Around 4–6 months
+
+Llamas reach maturity at around 2–3 years of age. Under proper care, llamas can live up to 25 years.
+
+---
+
+## Llama Fiber and Its Uses
+
+Llama fiber is highly valued for its warmth, softness, and durability. It differs from alpaca fiber, which is finer and softer.
+
+### Characteristics of Llama Fiber
+
+* Hollow, insulating fibers
+* Lanolin-free (hypoallergenic)
+* Coarse guard hairs are typically removed in processing
+
+### Common Products
+
+* Blankets
+* Ropes
+* Rugs
+* Outerwear (ponchos, coats)
+
+---
+
+## Economic and Cultural Importance
+
+Llamas continue to serve a key economic role in rural Andean communities. They are used in:
+
+* **Agricultural labor**
+* **Eco-tourism and trekking**
+* **Cultural ceremonies and festivals**
+* **Wool and meat production**
+
+In modern contexts, llamas are used in North America and Europe for:
+
+* **Companionship and therapy**
+* **Guard animals for sheep**
+* **4H and agricultural education programs**
+
+---
+
+## Health and Care
+
+Llamas are hardy animals but require basic veterinary care, including:
+
+* **Vaccinations** (e.g., CDT: Clostridium perfringens types C and D and tetanus)
+* **Regular deworming**
+* **Shearing and toenail trimming**
+* **Adequate shelter and nutrition**
+
+They are generally disease-resistant but may be prone to parasites and heat stress in non-native environments.
+
+---
+
+## Llamas vs. Alpacas
+
+| Feature     | Llama (*Lama glama*) | Alpaca (*Lama pacos*) |
+| ----------- | -------------------- | --------------------- |
+| Size        | Larger (290–440 lb)  | Smaller (120–145 lb)  |
+| Ears        | Long, banana-shaped  | Short, spear-shaped   |
+| Fiber       | Coarser              | Softer, finer         |
+| Use         | Pack animal          | Fiber production      |
+| Temperament | More independent     | More docile           |
+
+---
+
+## Conservation Status
+
+Llamas are **not endangered**. They are widely bred and maintained both in South America and globally. However, genetic diversity is a concern in some isolated populations.
+
+Efforts exist to preserve native Andean breeds and to prevent crossbreeding that could lead to loss of local adaptations.
+
+---
+
+## Llamas in Popular Culture
+
+Llamas have become widely recognized in popular media. Examples include:
+
+* **Books**: *Llama Llama* children's series by Anna Dewdney
+* **Films**: *The Emperor's New Groove* (features a human transformed into a llama)
+* **Memes and merchandise**: Known for their expressive faces and quirky charm
+* **Emojis**: 🦙
+
+They are often used as symbols of uniqueness, calmness, and endurance.
+
+---
+
+## See Also
+
+* **Alpaca** (*Lama pacos*)
+* **Guanaco** (*Lama guanicoe*)
+* **Vicuña** (*Vicugna vicugna*)
+* **Inca Empire**
+* **South American camelids**
+
+---
+
+## References
+
+1. Fowler, M. E. (2010). *Medicine and Surgery of Camelids*. Wiley-Blackwell.
+2. Wheeler, J. C. (1995). Evolution and present situation of the South American Camelidae. *Biological Journal of the Linnean Society*, 54(3), 271-295.
+3. National Geographic. "Llamas: Profile and Facts."
+4. Smithsonian National Zoo. "Llama Profile."
+5. International Llama Association. [https://www.internationalllama.org/](https://www.internationalllama.org/)
+
+---
+
+## External Links
+
+* [International Llama Registry](https://www.lamaregistry.com/)
+* [North American Camelid Studies Program](https://www.camelidstudies.org/)
+* [American Llama Show Association](https://www.americanllamashows.com/)

+ 985 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/knowledge_enhanced_workflow.ipynb

@@ -0,0 +1,985 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "0e4aad87-ddd4-4b5e-a83f-63a75bd89f38",
+   "metadata": {},
+   "source": [
+    "# PowerPoint to Knowledge-Grounded & Narrative-Aware Voiceover Transcript Generator\n",
+    "\n",
+    "This cookbook demonstrates the complete workflow for converting PowerPoint presentations into AI-generated voiceover transcripts with retrieval augmentation and narrative continuity features, powered by Llama 4 Maverick's vision capabilities through the Llama API.\n",
+    "\n",
+    "## Overview\n",
+    "\n",
+    "This workflow performs the following operations:\n",
+    "\n",
+    "1. **Content Extraction**: Pulls speaker notes and visual elements from PowerPoint slides\n",
+    "2. **Knowledge Base Integration**: Leverages external knowledge sources to enhance transcript quality (For the purposes of this cookbook, the knowledge_base folder)\n",
+    "3. **Image Conversion**: Transforms slides into high-quality images for analysis by Llama 4 Maverick.\n",
+    "4. **Context-Aware Generation**: Creates natural-sounding voiceover content with narrative continuity and knowledge-based insights\n",
+    "    - **Speech Optimization**: Converts numbers, technical terms, and abbreviations to spoken form\n",
+    "6. **Results Export**: Saves transcripts, context information, and knowledge usage statistics in multiple formats\n",
+    "\n",
+    "## Key Features\n",
+    "\n",
+    "- **Knowledge Base Integration**: Automatically retrieves relevant information from markdown knowledge files\n",
+    "- **Unified Processor**: Single class handles both standard and narrative-aware processing with knowledge enhancement\n",
+    "- **Configurable Context**: Adjustable context window for narrative continuity and knowledge retrieval\n",
+    "- **Mode Selection**: Toggle between standard and narrative processing with optional knowledge integration\n",
+    "- **Performance Optimization**: Caching and lazy loading for efficient knowledge retrieval\n",
+    "\n",
+    "## Prerequisites\n",
+    "\n",
+    "Before running this notebook, ensure you have:\n",
+    "- Created a `.env` file with your `LLAMA_API_KEY`\n",
+    "- Updated `config.yaml` with your presentation file path\n",
+    "- Set up your knowledge base directory with relevant markdown files (This cookbook only supports markdown format at the moment)\n",
+    "- Enabled knowledge base features in `config.yaml` (set `knowledge.enabled: true`)\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b3367845-76ad-4493-a312-f80f00fad029",
+   "metadata": {},
+   "source": [
+    "\n",
+    "## Setup and Configuration\n",
+    "\n",
+    "Import required libraries and load environment configuration."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 50,
+   "id": "37249034-75bf-41bd-b640-eb6345435f47",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Import required libraries\n",
+    "import pandas as pd\n",
+    "import os\n",
+    "from pathlib import Path\n",
+    "from dotenv import load_dotenv\n",
+    "import matplotlib.pyplot as plt\n",
+    "from IPython.display import display"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 51,
+   "id": "0aedb2c5-5762-43ae-826b-fdb45ff642f5",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "SUCCESS: Environment loaded successfully!\n",
+      "SUCCESS: GROQ API key found\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Load environment variables from .env file\n",
+    "load_dotenv()\n",
+    "\n",
+    "# Verify setup\n",
+    "if os.getenv('GROQ_API_KEY'):\n",
+    "    print(\"SUCCESS: Environment loaded successfully!\")\n",
+    "    print(\"SUCCESS: GROQ API key found\")\n",
+    "else:\n",
+    "    print(\"WARNING: GROQ_API_KEY not found in .env file\")\n",
+    "    print(\"Please check your .env file and add your API key\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 52,
+   "id": "0563bb13-9dbd-4a29-9b3b-f565befd2001",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "SUCCESS: All modules imported successfully!\n",
+      "- PPTX processor ready\n",
+      "- Unified transcript generator ready\n",
+      "- Configuration manager ready\n",
+      "- Visualization generator ready\n",
+      "- FAISS knowledge base components ready\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Import custom modules\n",
+    "try:\n",
+    "    from src.core.pptx_processor import extract_pptx_notes, pptx_to_images_and_notes\n",
+    "    from src.processors.unified_transcript_generator import (\n",
+    "        UnifiedTranscriptProcessor,\n",
+    "        process_slides,\n",
+    "        process_slides_with_narrative\n",
+    "    )\n",
+    "    from src.config.settings import load_config, get_config, is_knowledge_enabled\n",
+    "    from src.utils.visualization import display_slide_grid, display_slide_preview\n",
+    "\n",
+    "    print(\"SUCCESS: All modules imported successfully!\")\n",
+    "    print(\"- PPTX processor ready\")\n",
+    "    print(\"- Unified transcript generator ready\")\n",
+    "    print(\"- Configuration manager ready\")\n",
+    "    print(\"- Visualization generator ready\")\n",
+    "\n",
+    "    # Try to import knowledge base modules\n",
+    "    knowledge_available = False\n",
+    "    try:\n",
+    "        from src.knowledge.faiss_knowledge import FAISSKnowledgeManager\n",
+    "        from src.knowledge.context_manager import ContextManager\n",
+    "        knowledge_available = True\n",
+    "        print(\"- FAISS knowledge base components ready\")\n",
+    "    except ImportError as e:\n",
+    "        print(f\"- WARNING: Knowledge base components not available: {e}\")\n",
+    "\n",
+    "except ImportError as e:\n",
+    "    print(f\"ERROR: Import error: {e}\")\n",
+    "    print(\"Make sure you're running from the project root directory\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 53,
+   "id": "cafe366c-3ec6-47c7-8e70-ed69e89ae137",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "SUCCESS: Configuration loaded successfully!\n",
+      "\n",
+      "Current Settings:\n",
+      "- Llama Model: meta-llama/llama-4-maverick-17b-128e-instruct\n",
+      "- Image DPI: 200\n",
+      "- Image Format: png\n",
+      "- Context Window: 5 previous slides (default)\n",
+      "- Knowledge Base: ENABLED\n",
+      "  - Knowledge Directory: knowledge_base\n",
+      "  - Context Strategy: combined\n",
+      "  - Knowledge Weight: 0.3\n",
+      "  - Embedding Model: all-MiniLM-L6-v2\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Load configuration\n",
+    "config = load_config()\n",
+    "print(\"\\nSUCCESS: Configuration loaded successfully!\")\n",
+    "print(\"\\nCurrent Settings:\")\n",
+    "print(f\"- Llama Model: {config['api']['groq_model']}\")\n",
+    "print(f\"- Image DPI: {config['processing']['default_dpi']}\")\n",
+    "print(f\"- Image Format: {config['processing']['default_format']}\")\n",
+    "print(f\"- Context Window: 5 previous slides (default)\")\n",
+    "\n",
+    "# Display knowledge base configuration\n",
+    "knowledge_enabled = is_knowledge_enabled()\n",
+    "print(f\"- Knowledge Base: {'ENABLED' if knowledge_enabled else 'DISABLED'}\")\n",
+    "\n",
+    "if knowledge_enabled:\n",
+    "    knowledge_config = config.get('knowledge', {})\n",
+    "    print(f\"  - Knowledge Directory: {knowledge_config.get('knowledge_base_dir', 'knowledge_base')}\")\n",
+    "    print(f\"  - Context Strategy: {knowledge_config.get('context', {}).get('strategy', 'combined')}\")\n",
+    "    print(f\"  - Knowledge Weight: {knowledge_config.get('context', {}).get('knowledge_weight', 0.3)}\")\n",
+    "    print(f\"  - Embedding Model: {knowledge_config.get('embedding', {}).get('model_name', 'all-MiniLM-L6-v2')}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dd800f7d-3ae5-4291-89d4-32d5cfca6cc7",
+   "metadata": {},
+   "source": [
+    "#### Don't forget to update the config file with your pptx file name!\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 54,
+   "id": "58642e4d-cb6f-4e6f-8543-c1290a0e258d",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "File Configuration:\n",
+      "- Input File: input/All About Llamas.pptx\n",
+      "- Output Directory: output/\n",
+      "- SUCCESS: Input file found (10.8 MB)\n",
+      "- SUCCESS: Output directory ready\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Configure file paths from config.yaml\n",
+    "pptx_file = config['current_project']['pptx_file'] + config['current_project']['extension']\n",
+    "output_dir = config['current_project']['output_dir']\n",
+    "\n",
+    "print(\"File Configuration:\")\n",
+    "print(f\"- Input File: {pptx_file}\")\n",
+    "print(f\"- Output Directory: {output_dir}\")\n",
+    "\n",
+    "# Verify input file exists\n",
+    "if Path(pptx_file).exists():\n",
+    "    file_size = Path(pptx_file).stat().st_size / 1024 / 1024\n",
+    "    print(f\"- SUCCESS: Input file found ({file_size:.1f} MB)\")\n",
+    "else:\n",
+    "    print(f\"- ERROR: Input file not found: {pptx_file}\")\n",
+    "    print(\"  Please update the 'pptx_file' path in config.yaml\")\n",
+    "\n",
+    "# Create output directory if needed\n",
+    "Path(output_dir).mkdir(parents=True, exist_ok=True)\n",
+    "print(f\"- SUCCESS: Output directory ready\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "09cf9962-a9f0-4362-a72b-7c11f50772bb",
+   "metadata": {},
+   "source": [
+    "## Knowledge Base Setup and Validation\n",
+    "\n",
+    "Set up and validate the knowledge base if enabled in configuration.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 55,
+   "id": "e7666fa8-a4a4-4e7d-bf5d-e34ca992f9b0",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def setup_knowledge_base(config):\n",
+    "    \"\"\"Setup and validate knowledge base if enabled.\"\"\"\n",
+    "    knowledge_enabled = is_knowledge_enabled()\n",
+    "\n",
+    "    if not knowledge_enabled:\n",
+    "        print(\"Knowledge base is disabled in configuration\")\n",
+    "        return None, None\n",
+    "\n",
+    "    if not knowledge_available:\n",
+    "        print(\"WARNING: Knowledge base is enabled but components are not available\")\n",
+    "        return None, None\n",
+    "\n",
+    "    print(\"Setting up knowledge base...\")\n",
+    "\n",
+    "    knowledge_config = config.get('knowledge', {})\n",
+    "    knowledge_base_dir = Path(knowledge_config.get('knowledge_base_dir', 'knowledge_base'))\n",
+    "\n",
+    "    # Check if knowledge base directory exists and has content\n",
+    "    if not knowledge_base_dir.exists():\n",
+    "        print(f\"Creating knowledge base directory: {knowledge_base_dir}\")\n",
+    "        knowledge_base_dir.mkdir(parents=True, exist_ok=True)\n",
+    "\n",
+    "        # Create sample knowledge base files for demonstration\n",
+    "        create_sample_knowledge_base(knowledge_base_dir)\n",
+    "\n",
+    "    # List existing knowledge files\n",
+    "    md_files = list(knowledge_base_dir.rglob(\"*.md\"))\n",
+    "\n",
+    "    print(f\"Knowledge Base Status:\")\n",
+    "    print(f\"- Directory: {knowledge_base_dir}\")\n",
+    "    print(f\"- Markdown files found: {len(md_files)}\")\n",
+    "\n",
+    "    if md_files:\n",
+    "        print(\"- Available knowledge files:\")\n",
+    "        for md_file in md_files:\n",
+    "            file_size = md_file.stat().st_size\n",
+    "            print(f\"  - {md_file.name} ({file_size} bytes)\")\n",
+    "    else:\n",
+    "        print(\"- No knowledge files found\")\n",
+    "        print(\"- Creating sample knowledge base for demonstration...\")\n",
+    "        create_sample_knowledge_base(knowledge_base_dir)\n",
+    "        md_files = list(knowledge_base_dir.rglob(\"*.md\"))\n",
+    "        print(f\"- Created {len(md_files)} sample knowledge files\")\n",
+    "\n",
+    "    # Initialize knowledge manager\n",
+    "    try:\n",
+    "        # Get FAISS configuration from config\n",
+    "        vector_config = knowledge_config.get('vector_store', {})\n",
+    "        embedding_config = knowledge_config.get('embedding', {})\n",
+    "\n",
+    "        # Initialize FAISS knowledge manager with configuration\n",
+    "        knowledge_manager = FAISSKnowledgeManager(\n",
+    "            knowledge_base_dir=str(knowledge_base_dir),\n",
+    "            index_type=vector_config.get('index_type', 'flat'),\n",
+    "            embedding_model=embedding_config.get('model_name', 'all-MiniLM-L6-v2'),\n",
+    "            use_gpu=vector_config.get('use_gpu', False)\n",
+    "        )\n",
+    "        knowledge_manager.initialize()\n",
+    "\n",
+    "        context_manager = ContextManager()\n",
+    "\n",
+    "        # Display knowledge base statistics\n",
+    "        stats = knowledge_manager.get_stats()\n",
+    "        print(f\"- Knowledge chunks loaded: {stats['total_chunks']}\")\n",
+    "        print(f\"- Index type: {stats['index_type']}\")\n",
+    "        print(f\"- Embedding model: {stats['embedding_model']}\")\n",
+    "        print(f\"- Model loaded: {stats['model_loaded']}\")\n",
+    "        print(f\"- Index loaded: {stats['index_loaded']}\")\n",
+    "\n",
+    "        return knowledge_manager, context_manager\n",
+    "\n",
+    "    except Exception as e:\n",
+    "        print(f\"ERROR: Failed to initialize knowledge base: {e}\")\n",
+    "        import traceback\n",
+    "        traceback.print_exc()\n",
+    "        return None, None\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 56,
+   "id": "91f8fd6d-c142-4eb8-a72d-6640a7423af8",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Setting up knowledge base...\n",
+      "Knowledge Base Status:\n",
+      "- Directory: knowledge_base\n",
+      "- Markdown files found: 2\n",
+      "- Available knowledge files:\n",
+      "  - llama diet.md (5762 bytes)\n",
+      "  - llamas.md (7567 bytes)\n",
+      "- Knowledge chunks loaded: 19\n",
+      "- Index type: flat\n",
+      "- Embedding model: all-MiniLM-L6-v2\n",
+      "- Model loaded: True\n",
+      "- Index loaded: True\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Setup knowledge base\n",
+    "knowledge_manager, context_manager = setup_knowledge_base(config)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "85c830ee-c91f-452b-987e-1652efeb326a",
+   "metadata": {},
+   "source": [
+    "## Processing Mode Configuration\n",
+    "\n",
+    "Choose your processing mode and configure the processor with knowledge integration.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 57,
+   "id": "290d9c7e-19db-44e0-b9c3-8973674b1010",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Processing Mode Configuration:\n",
+      "- Mode: NARRATIVE CONTINUITY\n",
+      "- Context Window: 5 previous slides\n",
+      "- Knowledge Integration: ENABLED\n",
+      "  - Knowledge chunks available: 19\n",
+      "  - Search strategy: combined\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Configure processing mode with knowledge integration\n",
+    "\n",
+    "USE_NARRATIVE = True  # Set to False for standard processing, True for narrative continuity\n",
+    "CONTEXT_WINDOW_SIZE = 5  # Number of previous slides to use as context (only used when USE_NARRATIVE=True)\n",
+    "ENABLE_KNOWLEDGE = True  # Set to False to disable knowledge base integration\n",
+    "\n",
+    "print(\"Processing Mode Configuration:\")\n",
+    "if USE_NARRATIVE:\n",
+    "    print(f\"- Mode: NARRATIVE CONTINUITY\")\n",
+    "    print(f\"- Context Window: {CONTEXT_WINDOW_SIZE} previous slides\")\n",
+    "else:\n",
+    "    print(f\"- Mode: STANDARD PROCESSING\")\n",
+    "    print(f\"- Features: Independent slide processing, faster execution\")\n",
+    "\n",
+    "print(f\"- Knowledge Integration: {'ENABLED' if ENABLE_KNOWLEDGE else 'DISABLED'}\")\n",
+    "\n",
+    "if ENABLE_KNOWLEDGE and knowledge_manager:\n",
+    "    print(f\"  - Knowledge chunks available: {knowledge_manager.get_stats()['total_chunks']}\")\n",
+    "    print(f\"  - Search strategy: {config.get('knowledge', {}).get('context', {}).get('strategy', 'combined')}\")\n",
+    "\n",
+    "# Initialize the unified processor with knowledge integration\n",
+    "processor = UnifiedTranscriptProcessor(\n",
+    "    use_narrative=USE_NARRATIVE,\n",
+    "    context_window_size=CONTEXT_WINDOW_SIZE,\n",
+    "    enable_knowledge=ENABLE_KNOWLEDGE\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2cd7bd6d-364a-4350-9f38-b988323fcdae",
+   "metadata": {},
+   "source": [
+    "## Processing Pipeline\n",
+    "\n",
+    "Execute the main processing pipeline in three key steps.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1ce1e223-faf0-4ab3-996d-a451bed30fc9",
+   "metadata": {},
+   "source": [
+    "### Step 1: Extract Content and Convert to Images\n",
+    "\n",
+    "Extract speaker notes and slide text, then convert the presentation to high-quality images for AI analysis.\n",
+    "\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "db3ad12e-03d8-45cb-9999-b167d2ab93c5",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "PROCESSING: Converting PPTX to images and extracting notes...\n",
+      "Processing: All About Llamas.pptx\n",
+      "Extracting speaker notes...\n",
+      "Found notes on 10 of 10 slides\n",
+      "Notes df saved to: /Users/yucedincer/Desktop/Projects/llama-cookbook/end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/All About Llamas_notes.csv\n",
+      "Converting to PDF...\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
+      "To disable this warning, you can either:\n",
+      "\t- Avoid using `tokenizers` before the fork if possible\n",
+      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(\"PROCESSING: Converting PPTX to images and extracting notes...\")\n",
+    "\n",
+    "result = pptx_to_images_and_notes(\n",
+    "    pptx_path=pptx_file,\n",
+    "    output_dir=output_dir,\n",
+    "    extract_notes=True\n",
+    ")\n",
+    "\n",
+    "notes_df = result['notes_df']\n",
+    "image_files = result['image_files']\n",
+    "\n",
+    "print(f\"\\nSUCCESS: Processing completed successfully!\")\n",
+    "print(f\"- Processed {len(image_files)} slides\")\n",
+    "print(f\"- Images saved to: {result['output_dir']}\")\n",
+    "print(f\"- Found notes on {notes_df['has_notes'].sum()} slides\")\n",
+    "print(f\"- DataFrame shape: {notes_df.shape}\")\n",
+    "\n",
+    "# Show sample data\n",
+    "print(\"\\nSample Data (First 5 slides):\")\n",
+    "display(notes_df[['slide_number', 'slide_title', 'has_notes', 'notes_word_count', 'slide_text_word_count']].head())\n",
+    "\n",
+    "# Preview only the first 6 slide images\n",
+    "display_slide_preview(image_files, num_slides=6, max_cols=3)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "bf5e8a23-c046-45f5-a7cd-14baa70854c2",
+   "metadata": {},
+   "source": [
+    "### Step 2: Generate Knowledge-Enhanced Narrative-Aware AI Transcripts\n",
+    "\n",
+    "Use the Llama vision model to analyze each slide image and generate natural-sounding voiceover transcripts with both narrative continuity and knowledge base enhancement.\n",
+    "\n",
+    "This enhanced process:\n",
+    "- Analyzes slide visual content using AI vision\n",
+    "- Retrieves relevant information from the knowledge base\n",
+    "- Uses transcripts from previous slides as context\n",
+    "- Combines slide content, speaker notes, and knowledge insights\n",
+    "- Generates speech-optimized transcripts with smooth transitions and enhanced accuracy\n",
+    "- Maintains consistent terminology throughout the presentation\n",
+    "- Converts numbers and technical terms to spoken form\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2c56a543-4ad7-4276-99d2-0be5c198782c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(\"PROCESSING: Starting AI transcript generation with knowledge-enhanced unified processor...\")\n",
+    "print(f\"- Processing {len(notes_df)} slides\")\n",
+    "print(f\"- Using model: {config['api']['groq_model']}\")\n",
+    "print(f\"- Mode: {'Narrative Continuity' if USE_NARRATIVE else 'Standard Processing'}\")\n",
+    "print(f\"- Knowledge Integration: {'ENABLED' if ENABLE_KNOWLEDGE else 'DISABLED'}\")\n",
+    "\n",
+    "if USE_NARRATIVE:\n",
+    "    print(f\"- Context window: {CONTEXT_WINDOW_SIZE} previous slides\")\n",
+    "    print(f\"- Using previous transcripts as context for narrative continuity\")\n",
+    "\n",
+    "if ENABLE_KNOWLEDGE and knowledge_manager:\n",
+    "    print(f\"- Knowledge base: {knowledge_manager.get_stats()['total_chunks']} chunks available\")\n",
+    "    print(f\"- Search strategy: {config.get('knowledge', {}).get('context', {}).get('strategy', 'combined')}\")\n",
+    "\n",
+    "print(\"- This may take several minutes...\")\n",
+    "\n",
+    "# Generate transcripts using the knowledge-enhanced unified processor\n",
+    "processed_df = processor.process_slides_dataframe(\n",
+    "    df=notes_df,\n",
+    "    output_dir=output_dir,\n",
+    "    save_context=True  # Only saves context if USE_NARRATIVE=True\n",
+    ")\n",
+    "\n",
+    "print(f\"\\nSUCCESS: Transcript generation completed!\")\n",
+    "print(f\"- Generated {len(processed_df)} transcripts\")\n",
+    "print(f\"- Average length: {processed_df['ai_transcript'].str.len().mean():.0f} characters\")\n",
+    "print(f\"- Total words: {processed_df['ai_transcript'].str.split().str.len().sum():,}\")\n",
+    "\n",
+    "if USE_NARRATIVE:\n",
+    "    print(f\"- Context information saved to: {output_dir}narrative_context/\")\n",
+    "    print(f\"- Average context slides used: {processed_df['context_slides_used'].mean():.1f}\")\n",
+    "\n",
+    "if ENABLE_KNOWLEDGE and knowledge_manager:\n",
+    "    print(f\"- Knowledge base integration: Active during processing\")\n",
+    "    print(f\"- Enhanced transcripts with domain-specific information\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2cd0590b-66af-4653-a3e1-5d4eb9a845af",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Show first 5 transcripts with detailed knowledge information\n",
+    "\n",
+    "from src.utils.transcript_display import show_transcripts_with_knowledge\n",
+    "show_transcripts_with_knowledge(processed_df, knowledge_manager, num_slides=5)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b6aec6d2-f001-46a7-bf5d-2e29318d5f82",
+   "metadata": {},
+   "source": [
+    "### Step 3: Save Results\n",
+    "\n",
+    "Save results in multiple formats with knowledge integration metadata.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ff2f8de2-121b-4e98-a426-80c37cb19da1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(\"PROCESSING: Saving knowledge-enhanced results in multiple formats...\")\n",
+    "\n",
+    "# Create output directory\n",
+    "os.makedirs(output_dir, exist_ok=True)\n",
+    "\n",
+    "# Determine file prefix based on processing mode and knowledge integration\n",
+    "mode_prefix = \"narrative\" if USE_NARRATIVE else \"standard\"\n",
+    "knowledge_prefix = \"knowledge_enhanced\" if ENABLE_KNOWLEDGE else \"standard\"\n",
+    "file_prefix = f\"{knowledge_prefix}_{mode_prefix}\"\n",
+    "\n",
+    "# Save complete results with all metadata\n",
+    "output_file = f\"{output_dir}{file_prefix}_transcripts.csv\"\n",
+    "processed_df.to_csv(output_file, index=False)\n",
+    "print(f\"- SUCCESS: Complete results saved to {output_file}\")\n",
+    "\n",
+    "# Save transcript-only version for voiceover work\n",
+    "if USE_NARRATIVE:\n",
+    "    transcript_only = processed_df[['slide_number', 'slide_title', 'ai_transcript', 'context_slides_used']]\n",
+    "else:\n",
+    "    transcript_only = processed_df[['slide_number', 'slide_title', 'ai_transcript']]\n",
+    "\n",
+    "transcript_file = f\"{output_dir}{file_prefix}_transcripts_clean.csv\"\n",
+    "transcript_only.to_csv(transcript_file, index=False)\n",
+    "print(f\"- SUCCESS: Clean transcripts saved to {transcript_file}\")\n",
+    "\n",
+    "# Save as JSON for API integration\n",
+    "json_file = f\"{output_dir}{file_prefix}_transcripts.json\"\n",
+    "processed_df.to_json(json_file, orient='records', indent=2)\n",
+    "print(f\"- SUCCESS: JSON format saved to {json_file}\")\n",
+    "\n",
+    "# Save knowledge base statistics if available\n",
+    "if ENABLE_KNOWLEDGE and knowledge_manager:\n",
+    "    knowledge_stats_file = f\"{output_dir}knowledge_base_stats.json\"\n",
+    "    stats = knowledge_manager.get_stats()\n",
+    "\n",
+    "    import json\n",
+    "    with open(knowledge_stats_file, 'w') as f:\n",
+    "        json.dump(stats, f, indent=2)\n",
+    "    print(f\"- SUCCESS: Knowledge base statistics saved to {knowledge_stats_file}\")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4b2e1671-9495-45bb-9ac1-a02a83037eb5",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "# Completion Summary\n",
+    "\n",
+    "## Successfully Generated:\n",
+    "- **Knowledge-Enhanced Processing**: Integrated external knowledge base with transcript generation\n",
+    "- **Unified Processing**: Single processor handles standard, narrative, and knowledge-enhanced modes\n",
+    "- **Flexible Configuration**: Easy switching between processing modes and knowledge integration\n",
+    "- **Speech-Optimized Transcripts**: Natural-sounding voiceover content enhanced with domain knowledge\n",
+    "- **Multiple Formats**: CSV, JSON exports for different use cases\n",
+    "- **Context Analysis**: Detailed information about narrative flow and knowledge usage\n",
+    "- **Performance Optimization**: Efficient knowledge retrieval with caching and lazy loading\n",
+    "\n",
+    "## Output Files:\n",
+    "- `[knowledge_mode]_[narrative_mode]_transcripts.csv` - Complete dataset with metadata\n",
+    "- `[knowledge_mode]_[narrative_mode]_transcripts_clean.csv` - Clean transcripts for voiceover work\n",
+    "- `[knowledge_mode]_[narrative_mode]_transcripts.json` - JSON format for API integration\n",
+    "- `knowledge_base_stats.json` - Knowledge base usage statistics\n",
+    "- `narrative_context/` - Context analysis files (narrative mode only)\n",
+    "- Individual slide images in PNG/JPEG format\n",
+    "\n",
+    "## Processing Modes:\n",
+    "\n",
+    "### Standard Mode (`USE_NARRATIVE = False`, `ENABLE_KNOWLEDGE = False`)\n",
+    "- **Best for**: Simple presentations, quick processing, independent slides\n",
+    "- **Features**: Fast execution, no context dependencies\n",
+    "- **Use cases**: Training materials, product demos, standalone slides\n",
+    "\n",
+    "### Knowledge-Enhanced Standard Mode (`USE_NARRATIVE = False`, `ENABLE_KNOWLEDGE = True`)\n",
+    "- **Best for**: Technical presentations requiring domain expertise\n",
+    "- **Features**: Domain knowledge integration, improved accuracy\n",
+    "- **Use cases**: Technical documentation, educational materials, expert presentations\n",
+    "\n",
+    "### Narrative Mode (`USE_NARRATIVE = True`, `ENABLE_KNOWLEDGE = False`)\n",
+    "- **Best for**: Story-driven presentations, complex topics, educational content\n",
+    "- **Features**: Context awareness, smooth transitions, terminology consistency\n",
+    "- **Use cases**: Conference talks, educational courses, marketing presentations\n",
+    "\n",
+    "### Knowledge-Enhanced Narrative Mode (`USE_NARRATIVE = True`, `ENABLE_KNOWLEDGE = True`)\n",
+    "- **Best for**: Complex educational content requiring both continuity and expertise\n",
+    "- **Features**: Full context awareness, domain knowledge, smooth transitions, enhanced accuracy\n",
+    "- **Use cases**: Advanced training, academic presentations, expert-level educational content\n",
+    "\n",
+    "## Knowledge Base Features:\n",
+    "\n",
+    "### Automatic Knowledge Retrieval\n",
+    "- **Semantic Search**: Uses embedding models to find relevant knowledge chunks\n",
+    "- **Context Integration**: Seamlessly blends knowledge with slide content and speaker notes\n",
+    "- **Fallback Mechanisms**: Graceful degradation if knowledge components fail\n",
+    "\n",
+    "### Performance Optimization\n",
+    "- **Caching**: Stores embeddings and search results for faster processing\n",
+    "- **Lazy Loading**: Loads knowledge components only when needed\n",
+    "- **Memory Management**: Efficient memory usage with configurable limits\n",
+    "\n",
+    "### Configuration Options\n",
+    "- **Search Strategy**: Choose between knowledge-only, narrative-priority, or combined approaches\n",
+    "- **Knowledge Weight**: Adjust the influence of knowledge base content\n",
+    "- **Similarity Threshold**: Control the relevance threshold for knowledge retrieval\n",
+    "\n",
+    "## Next Steps:\n",
+    "1. **Review** generated transcripts for accuracy, flow, and knowledge integration quality\n",
+    "2. **Customize** knowledge base with domain-specific content for your presentations\n",
+    "3. **Tune** knowledge integration parameters for optimal results\n",
+    "4. **Edit** any content that needs refinement\n",
+    "5. **Create** voiceover recordings or use TTS systems\n",
+    "6. **Integrate** JSON data into your video production workflow\n",
+    "7. **Experiment** with different processing modes and knowledge settings\n",
+    "\n",
+    "## Tips for Better Results:\n",
+    "\n",
+    "### Knowledge Base Optimization\n",
+    "- **Rich Content**: Include comprehensive, well-structured markdown files in your knowledge base\n",
+    "- **Relevant Topics**: Ensure knowledge base content aligns with your presentation topics\n",
+    "- **Clear Structure**: Use proper markdown headers and sections for better chunk extraction\n",
+    "- **Regular Updates**: Keep knowledge base content current and accurate\n",
+    "\n",
+    "### Processing Mode Selection\n",
+    "- **Simple Presentations**: Use standard mode for quick, independent slide processing\n",
+    "- **Technical Content**: Enable knowledge integration for domain-specific accuracy\n",
+    "- **Story-Driven Content**: Use narrative mode for presentations with logical flow\n",
+    "- **Complex Educational Material**: Combine both narrative and knowledge features\n",
+    "\n",
+    "### Configuration Tuning\n",
+    "- **Context Window**: Adjust context window size (3-7 slides) based on presentation complexity\n",
+    "- **Knowledge Weight**: Fine-tune knowledge influence (0.1-0.5) based on content needs\n",
+    "- **Search Parameters**: Adjust similarity threshold and top-k values for optimal knowledge retrieval\n",
+    "- **Consistent Style**: Maintain consistent formatting across your presentation\n",
+    "\n",
+    "### Performance Considerations\n",
+    "- **Memory Usage**: Monitor knowledge base memory consumption for large knowledge bases\n",
+    "- **Processing Time**: Knowledge integration adds processing time but improves quality\n",
+    "- **Caching**: Enable caching for repeated processing of the same presentations\n",
+    "- **Batch Processing**: Process multiple presentations efficiently with shared knowledge base\n",
+    "\n",
+    "---\n",
+    "\n",
+    "## Advanced Features\n",
+    "\n",
+    "### Custom Knowledge Base Creation\n",
+    "Create domain-specific knowledge bases by:\n",
+    "1. **Organizing Content**: Structure markdown files by topic, domain, or presentation type\n",
+    "2. **Using Headers**: Employ clear markdown headers for better chunk extraction\n",
+    "3. **Including Examples**: Add concrete examples and case studies\n",
+    "4. **Maintaining Quality**: Ensure accuracy and relevance of knowledge content\n",
+    "\n",
+    "### Integration with Existing Workflows\n",
+    "- **API Integration**: Use JSON output for seamless integration with video production tools\n",
+    "- **Batch Processing**: Process multiple presentations with shared knowledge bases\n",
+    "- **Custom Prompts**: Modify system prompts for specific use cases or audiences\n",
+    "- **Quality Assurance**: Implement review workflows for generated transcripts\n",
+    "\n",
+    "### Troubleshooting Common Issues\n",
+    "- **Knowledge Base Not Loading**: Check file paths and permissions\n",
+    "- **Poor Knowledge Retrieval**: Adjust similarity thresholds and search parameters\n",
+    "- **Memory Issues**: Reduce knowledge base size or enable lazy loading\n",
+    "- **Processing Errors**: Enable graceful degradation for robust processing\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "pptxTTS",
+   "language": "python",
+   "name": "pptxtts"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.13.2"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}

File diff too large to display
+ 140 - 29
end-to-end-use-cases/powerpoint-to-voiceover-transcript/narrative_continuity_workflow.ipynb


Binary
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/All About Llamas.pdf


+ 19 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/knowledge_base_stats.json

@@ -0,0 +1,19 @@
+{
+  "total_chunks": 19,
+  "index_type": "flat",
+  "embedding_model": "all-MiniLM-L6-v2",
+  "model_loaded": true,
+  "index_loaded": true,
+  "use_gpu": false,
+  "cache_dir": "knowledge_base/.faiss_cache",
+  "knowledge_base_dir": "knowledge_base",
+  "index_size": 19,
+  "dimension": 384,
+  "is_trained": true,
+  "total_searches": 5,
+  "cache_hits": 0,
+  "index_builds": 1,
+  "last_updated": "2025-08-08T12:24:57.695935",
+  "content_size_mb": 0.012198448181152344,
+  "avg_chunk_size": 673.2105263157895
+}

File diff too large to display
+ 44 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/knowledge_enhanced_narrative_transcripts.csv


File diff too large to display
+ 172 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/knowledge_enhanced_narrative_transcripts.json


File diff too large to display
+ 11 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/knowledge_enhanced_narrative_transcripts_clean.csv


File diff too large to display
+ 10 - 10
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/narrative_context/slide_contexts.json


File diff too large to display
+ 9 - 9
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/narrative_transcripts.csv


File diff too large to display
+ 9 - 9
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/narrative_transcripts.json


File diff too large to display
+ 9 - 9
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/narrative_transcripts_clean.csv


Binary
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/slide-001.png


Binary
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/slide-002.png


Binary
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/slide-003.png


Binary
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/slide-004.png


Binary
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/slide-006.png


Binary
end-to-end-use-cases/powerpoint-to-voiceover-transcript/output/slide-007.png


+ 16 - 1
end-to-end-use-cases/powerpoint-to-voiceover-transcript/pyproject.toml

@@ -11,7 +11,22 @@ dependencies = [
     "numpy>=2.3.2",
     "python-dotenv>=1.0.0",
     "tqdm>=4.66.0",
-    "llama-api-client>=0.1.0",
     "pyyaml>=6.0.0",
     "matplotlib>=3.10.5",
+    # Knowledge base integration dependencies
+    "sentence-transformers>=2.2.0,<3.0.0",
+    "jupyter>=1.1.1",
+    "ipykernel>=6.30.1",
+    "seaborn>=0.13.2",
+    "faiss-cpu>=1.11.0.post1",
+    "openpyxl>=3.1.5",
+    "groq>=0.31.0",
+]
+
+[project.optional-dependencies]
+dev = [
+    "pytest>=7.0.0",
+    "memory-profiler>=0.60.0",
+    "pytest-benchmark>=4.0.0",
+    "pytest-mock>=3.10.0",
 ]

+ 92 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/config/settings.py

@@ -145,3 +145,95 @@ def get_image_quality_config() -> Dict[str, Any]:
     """
     config = get_config()
     return config.get('image_quality', {})
+
+
+def get_knowledge_config() -> Dict[str, Any]:
+    """
+    Get knowledge base configuration settings.
+
+    Returns:
+        Dictionary containing knowledge base settings
+    """
+    config = get_config()
+    return config.get('knowledge', {})
+
+
+def is_knowledge_enabled() -> bool:
+    """
+    Check if knowledge base integration is enabled.
+
+    Returns:
+        True if knowledge base is enabled, False otherwise
+    """
+    knowledge_config = get_knowledge_config()
+    return knowledge_config.get('enabled', False)
+
+
+def validate_knowledge_config() -> None:
+    """
+    Validate knowledge base configuration parameters.
+
+    Raises:
+        KnowledgeConfigurationError: If the knowledge configuration is invalid
+    """
+    from ..knowledge.exceptions import KnowledgeConfigurationError
+
+    if not is_knowledge_enabled():
+        return
+
+    knowledge_config = get_knowledge_config()
+
+    # Validate required fields
+    required_fields = ['knowledge_base_dir', 'embedding', 'search', 'context']
+    for field in required_fields:
+        if field not in knowledge_config:
+            raise KnowledgeConfigurationError(
+                f"Missing required knowledge configuration field: {field}",
+                config_key=field
+            )
+
+    # Validate embedding config
+    embedding_config = knowledge_config.get('embedding', {})
+    if 'model_name' not in embedding_config:
+        raise KnowledgeConfigurationError(
+            "Missing embedding model_name in knowledge configuration",
+            config_key='embedding.model_name'
+        )
+
+    # Validate search config
+    search_config = knowledge_config.get('search', {})
+    top_k = search_config.get('top_k', 5)
+    if not isinstance(top_k, int) or top_k <= 0:
+        raise KnowledgeConfigurationError(
+            f"Invalid top_k value: {top_k}. Must be a positive integer.",
+            config_key='search.top_k',
+            config_value=top_k
+        )
+
+    similarity_threshold = search_config.get('similarity_threshold', 0.3)
+    if not isinstance(similarity_threshold, (int, float)) or not 0.0 <= similarity_threshold <= 1.0:
+        raise KnowledgeConfigurationError(
+            f"Invalid similarity_threshold: {similarity_threshold}. Must be between 0.0 and 1.0.",
+            config_key='search.similarity_threshold',
+            config_value=similarity_threshold
+        )
+
+    # Validate context config
+    context_config = knowledge_config.get('context', {})
+    strategy = context_config.get('strategy', 'combined')
+    valid_strategies = ['knowledge_only', 'narrative_priority', 'combined']
+    if strategy not in valid_strategies:
+        raise KnowledgeConfigurationError(
+            f"Invalid context strategy: {strategy}. Must be one of {valid_strategies}.",
+            config_key='context.strategy',
+            config_value=strategy
+        )
+
+    integration_method = context_config.get('integration_method', 'system_prompt')
+    valid_methods = ['system_prompt', 'user_message']
+    if integration_method not in valid_methods:
+        raise KnowledgeConfigurationError(
+            f"Invalid integration method: {integration_method}. Must be one of {valid_methods}.",
+            config_key='context.integration_method',
+            config_value=integration_method
+        )

+ 2 - 2
end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/core/__init__.py

@@ -2,13 +2,13 @@
 
 from .image_processing import encode_image
 from .pptx_processor import extract_pptx_notes, pptx_to_images_and_notes
-from .llama_client import LlamaClient
+from .groq_client import GroqClient
 from .file_utils import check_libreoffice
 
 __all__ = [
     "encode_image",
     "extract_pptx_notes",
     "pptx_to_images_and_notes",
-    "LlamaClient",
+    "GroqClient",
     "check_libreoffice"
 ]

+ 282 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/core/groq_client.py

@@ -0,0 +1,282 @@
+"""Groq API client wrapper for PPTX to Transcript."""
+
+from typing import Optional, Any, Union
+import os
+from .image_processing import encode_image
+from ..config.settings import get_api_config, get_system_prompt, is_knowledge_enabled, get_knowledge_config
+
+try:
+    from groq import Groq
+    GROQ_AVAILABLE = True
+except ImportError:
+    GROQ_AVAILABLE = False
+
+
+class GroqClient:
+    """Wrapper for Groq API client with configuration management."""
+
+    def __init__(self, api_key: Optional[str] = None, model: Optional[str] = None):
+        """
+        Initialize Groq client.
+
+        Args:
+            api_key: API key for Groq. If None, will be loaded from environment.
+            model: Model to use. If None, uses default vision model.
+        """
+        if not GROQ_AVAILABLE:
+            raise ImportError("groq package is required. Install with: pip install groq")
+
+        if api_key is None:
+            api_key = os.getenv('GROQ_API_KEY')
+
+        if not api_key:
+            raise ValueError("Groq API key not found. Set GROQ_API_KEY environment variable or provide api_key parameter.")
+
+        self.client = Groq(api_key=api_key)
+
+        # Default Groq-hosted Llama vision model; matches the groq_model set in config.yaml
+        # (update as new models become available)
+        self.model = model or "meta-llama/llama-4-maverick-17b-128e-instruct"
+
+        # Groq API configuration
+        self.max_tokens = 4096
+        self.temperature = 0.1
+
+    def generate_transcript(
+        self,
+        image_path: str,
+        speaker_notes: str = "",
+        system_prompt: Optional[str] = None,
+        context_bundle: Optional[Any] = None,
+        stream: bool = False
+    ) -> Union[str, Any]:
+        """
+        Generate transcript from slide image and speaker notes with optional context.
+
+        Args:
+            image_path: Path to the slide image
+            speaker_notes: Speaker notes for the slide
+            system_prompt: Custom system prompt. If None, uses default from config.
+            context_bundle: ContextBundle for knowledge integration
+            stream: Whether to stream the response
+
+        Returns:
+            Generated transcript text if not streaming, otherwise the response object
+        """
+        if system_prompt is None:
+            system_prompt = get_system_prompt()
+
+        # Enhance with context if available
+        if context_bundle is not None and is_knowledge_enabled():
+            system_prompt, user_message_prefix = self._integrate_context(
+                system_prompt, context_bundle
+            )
+        else:
+            user_message_prefix = ""
+
+        encoded_image = encode_image(image_path)
+
+        # Build user message with optional context prefix
+        user_text = f"{user_message_prefix}Speaker Notes: {speaker_notes}".strip()
+
+        try:
+            response = self.client.chat.completions.create(
+                model=self.model,
+                messages=[
+                    {"role": "system", "content": system_prompt},
+                    {
+                        "role": "user",
+                        "content": [
+                            {
+                                "type": "text",
+                                "text": user_text,
+                            },
+                            {
+                                "type": "image_url",
+                                "image_url": {
+                                    "url": f"data:image/png;base64,{encoded_image}",
+                                },
+                            },
+                        ],
+                    },
+                ],
+                max_tokens=self.max_tokens,
+                temperature=self.temperature,
+                stream=stream,
+            )
+
+            if stream:
+                return response
+            else:
+                return response.choices[0].message.content
+
+        except Exception as e:
+            raise Exception(f"Groq API error: {str(e)}")
+
+    def _integrate_context(self, system_prompt: str, context_bundle: Any) -> tuple[str, str]:
+        """
+        Integrate context bundle into system prompt or user message.
+
+        Args:
+            system_prompt: Original system prompt
+            context_bundle: ContextBundle with context information
+
+        Returns:
+            Tuple of (enhanced_system_prompt, user_message_prefix)
+        """
+        try:
+            # Import here to avoid circular imports
+            from ..knowledge.context_manager import ContextManager
+
+            context_manager = ContextManager()
+            knowledge_config = get_knowledge_config()
+            integration_method = knowledge_config.get('context', {}).get('integration_method', 'system_prompt')
+
+            # Get formatted context
+            context_data = context_manager.get_context_for_integration(
+                context_bundle, integration_method
+            )
+
+            if not context_data:
+                return system_prompt, ""
+
+            if integration_method == "system_prompt":
+                return self._enhance_system_prompt(system_prompt, context_data), ""
+            elif integration_method == "user_message":
+                return system_prompt, self._enhance_user_message(context_data)
+            else:
+                return system_prompt, ""
+
+        except Exception as e:
+            # Graceful degradation - log error but continue without context
+            import logging
+            logger = logging.getLogger(__name__)
+            logger.warning(f"Failed to integrate context: {e}")
+            return system_prompt, ""
+
+    def _enhance_system_prompt(self, system_prompt: str, context_data: dict) -> str:
+        """
+        Enhance system prompt with context information.
+
+        Args:
+            system_prompt: Original system prompt
+            context_data: Context data from ContextManager
+
+        Returns:
+            Enhanced system prompt
+        """
+        context_addition = context_data.get('context_addition', '')
+        integration_point = context_data.get('integration_point', 'before_instructions')
+
+        if not context_addition:
+            return system_prompt
+
+        if integration_point == 'before_instructions':
+            # Add context before the main instructions
+            enhanced_prompt = f"{context_addition}\n\n{system_prompt}"
+        else:
+            # Default: append at the end
+            enhanced_prompt = f"{system_prompt}\n\n{context_addition}"
+
+        return enhanced_prompt
+
+    def _enhance_user_message(self, context_data: dict) -> str:
+        """
+        Create user message prefix with context information.
+
+        Args:
+            context_data: Context data from ContextManager
+
+        Returns:
+            User message prefix
+        """
+        context_addition = context_data.get('context_addition', '')
+
+        if context_addition:
+            return f"{context_addition}\n\n"
+
+        return ""
+
+    def run(
+        self,
+        image_path: str,
+        system_prompt: str,
+        user_prompt: str,
+        stream: bool = False
+    ) -> Union[str, Any]:
+        """
+        Legacy method for backward compatibility with notebook code.
+
+        Args:
+            image_path: Path to the image file
+            system_prompt: System prompt for the chat completion
+            user_prompt: User prompt (speaker notes)
+            stream: Whether to stream the response
+
+        Returns:
+            Response from the chat completion, or None when stream=True (streamed chunks are printed as they arrive)
+        """
+        encoded_image = encode_image(image_path)
+
+        try:
+            response = self.client.chat.completions.create(
+                model=self.model,
+                messages=[
+                    {"role": "system", "content": system_prompt},
+                    {
+                        "role": "user",
+                        "content": [
+                            {
+                                "type": "text",
+                                "text": f"Speaker Notes: {user_prompt}",
+                            },
+                            {
+                                "type": "image_url",
+                                "image_url": {
+                                    "url": f"data:image/png;base64,{encoded_image}",
+                                },
+                            },
+                        ],
+                    },
+                ],
+                max_tokens=self.max_tokens,
+                temperature=self.temperature,
+                stream=stream,
+            )
+
+            if stream:
+                for chunk in response:
+                    if chunk.choices[0].delta.content:
+                        print(chunk.choices[0].delta.content, end="", flush=True)
+            else:
+                return response
+
+        except Exception as e:
+            raise Exception(f"Groq API error: {str(e)}")
+
+    def list_models(self) -> list:
+        """
+        List available models from Groq.
+
+        Returns:
+            List of available models
+        """
+        try:
+            models = self.client.models.list()
+            return [model.id for model in models.data]
+        except Exception as e:
+            print(f"Error listing models: {e}")
+            return []
+
+    def get_model_info(self) -> dict:
+        """
+        Get information about the current model.
+
+        Returns:
+            Dictionary with model information
+        """
+        return {
+            'model': self.model,
+            'max_tokens': self.max_tokens,
+            'temperature': self.temperature,
+            'provider': 'Groq'
+        }

+ 0 - 130
end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/core/llama_client.py

@@ -1,130 +0,0 @@
-"""Llama API client wrapper for PPTX to Transcript."""
-
-from typing import Optional, Any, Union
-from llama_api_client import LlamaAPIClient
-from .image_processing import encode_image
-from ..config.settings import get_api_config, get_system_prompt
-
-
-class LlamaClient:
-    """Wrapper for Llama API client with configuration management."""
-
-    def __init__(self, api_key: Optional[str] = None):
-        """
-        Initialize Llama client.
-
-        Args:
-            api_key: API key for Llama. If None, will be loaded from config/environment.
-        """
-        api_config = get_api_config()
-
-        if api_key is None:
-            api_key = api_config.get('llama_api_key')
-
-        if not api_key:
-            raise ValueError("Llama API key not found. Set LLAMA_API_KEY environment variable or provide api_key parameter.")
-
-        self.client = LlamaAPIClient(api_key=api_key)
-        self.model = api_config.get('llama_model', 'Llama-4-Maverick-17B-128E-Instruct-FP8')
-
-    def generate_transcript(
-        self,
-        image_path: str,
-        speaker_notes: str = "",
-        system_prompt: Optional[str] = None,
-        stream: bool = False
-    ) -> Union[str, Any]:
-        """
-        Generate transcript from slide image and speaker notes.
-
-        Args:
-            image_path: Path to the slide image
-            speaker_notes: Speaker notes for the slide
-            system_prompt: Custom system prompt. If None, uses default from config.
-            stream: Whether to stream the response
-
-        Returns:
-            Generated transcript text if not streaming, otherwise the response object
-        """
-        if system_prompt is None:
-            system_prompt = get_system_prompt()
-
-        encoded_image = encode_image(image_path)
-
-        response = self.client.chat.completions.create(
-            model=self.model,
-            messages=[
-                {"role": "system", "content": system_prompt},
-                {
-                    "role": "user",
-                    "content": [
-                        {
-                            "type": "text",
-                            "text": f"Speaker Notes: {speaker_notes}",
-                        },
-                        {
-                            "type": "image_url",
-                            "image_url": {
-                                "url": f"data:image/png;base64,{encoded_image}",
-                            },
-                        },
-                    ],
-                },
-            ],
-            stream=stream,
-        )
-
-        if stream:
-            return response
-        else:
-            return response.completion_message.content.text
-
-    def run(
-        self,
-        image_path: str,
-        system_prompt: str,
-        user_prompt: str,
-        stream: bool = False
-    ) -> Union[str, Any]:
-        """
-        Legacy method for backward compatibility with notebook code.
-
-        Args:
-            image_path: Path to the image file
-            system_prompt: System prompt for the chat completion
-            user_prompt: User prompt (speaker notes)
-            stream: Whether to stream the response
-
-        Returns:
-            Response from the chat completion
-        """
-        encoded_image = encode_image(image_path)
-
-        response = self.client.chat.completions.create(
-            model=self.model,
-            messages=[
-                {"role": "system", "content": system_prompt},
-                {
-                    "role": "user",
-                    "content": [
-                        {
-                            "type": "text",
-                            "text": f"Speaker Notes: {user_prompt}",
-                        },
-                        {
-                            "type": "image_url",
-                            "image_url": {
-                                "url": f"data:image/png;base64,{encoded_image}",
-                            },
-                        },
-                    ],
-                },
-            ],
-            stream=stream,
-        )
-
-        if stream:
-            for chunk in response:
-                print(chunk.event.delta.text, end="", flush=True)
-        else:
-            return response

+ 29 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/knowledge/__init__.py

@@ -0,0 +1,29 @@
+"""
+Knowledge base package for PowerPoint to Voiceover Transcript Generator.
+
+This package provides knowledge base integration capabilities using FAISS
+for efficient vector search and retrieval of domain-specific information.
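+
+Example (a minimal sketch; assumes a knowledge_base/ directory of markdown
+files, as used elsewhere in this project):
+
+    from src.knowledge import FAISSKnowledgeManager, ContextManager
+
+    manager = FAISSKnowledgeManager(knowledge_base_dir="knowledge_base")
+    manager.initialize()
+    print(manager.get_stats()["total_chunks"])
+    bundle = ContextManager().create_context_bundle(narrative_context="Transcript of the previous slide.")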
+"""
+
+# Import main classes for easy access
+try:
+    from .faiss_knowledge import FAISSKnowledgeManager, KnowledgeChunk
+    from .context_manager import ContextManager, ContextBundle
+
+    # Backward compatibility
+    MarkdownKnowledgeManager = FAISSKnowledgeManager
+
+    __all__ = [
+        'FAISSKnowledgeManager',
+        'MarkdownKnowledgeManager',  # Backward compatibility alias
+        'KnowledgeChunk',
+        'ContextManager',
+        'ContextBundle'
+    ]
+
+except ImportError as e:
+    # Graceful degradation if dependencies are missing
+    import warnings
+    warnings.warn(f"Knowledge base components not fully available: {e}")
+
+    __all__ = []

+ 87 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/knowledge/context_manager.py

@@ -0,0 +1,87 @@
+from typing import List, Dict, Any, Optional
+from dataclasses import dataclass
+
+@dataclass
+class ContextBundle:
+    knowledge_context: str = ""
+    narrative_context: str = ""
+    combined_context: str = ""
+
+class ContextManager:
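+    """Builds and formats ContextBundle objects that merge knowledge-base search results with narrative (previous-slide) context for prompt integration."""
+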
+    def __init__(self):
+        pass
+
+    def create_context_bundle(self, knowledge_chunks=None, narrative_context="", previous_slides=None) -> ContextBundle:
+        """Create a context bundle from knowledge chunks and narrative context"""
+        knowledge_context = ""
+
+        # Ensure narrative_context is not None
+        narrative_context = narrative_context or ""
+
+        if knowledge_chunks:
+            knowledge_context = "\n\n".join([
+                f"From {chunk.file_path}: {chunk.content[:200]}..."
+                for chunk in knowledge_chunks
+            ])
+
+        combined_context = f"{narrative_context}\n\n{knowledge_context}".strip()
+
+        return ContextBundle(
+            knowledge_context=knowledge_context,
+            narrative_context=narrative_context,
+            combined_context=combined_context
+        )
+
+    def get_context_stats(self, context_bundle: ContextBundle) -> Dict[str, Any]:
+        """Get statistics about the context bundle"""
+        return {
+            'knowledge_length': len(context_bundle.knowledge_context),
+            'narrative_length': len(context_bundle.narrative_context),
+            'combined_length': len(context_bundle.combined_context),
+            'total_words': len(context_bundle.combined_context.split())
+        }
+
+    def get_context_for_integration(self, context_bundle: ContextBundle, integration_method: str = "system_prompt") -> Dict[str, Any]:
+        """
+        Get context data formatted for integration into prompts.
+
+        Args:
+            context_bundle: The context bundle to format
+            integration_method: How to integrate context ("system_prompt" or "user_message")
+
+        Returns:
+            Dictionary with context_addition and integration_point
+        """
+        if not context_bundle:
+            return {}
+
+        # Safely get context strings, handling None values
+        combined_context = getattr(context_bundle, 'combined_context', '') or ''
+        knowledge_context = getattr(context_bundle, 'knowledge_context', '') or ''
+        narrative_context = getattr(context_bundle, 'narrative_context', '') or ''
+
+        if not combined_context.strip():
+            return {}
+
+        # Format the context for integration
+        context_parts = []
+
+        if knowledge_context.strip():
+            context_parts.append("## RELEVANT KNOWLEDGE\n\n" + knowledge_context)
+
+        if narrative_context.strip():
+            context_parts.append("## PREVIOUS SLIDES CONTEXT\n\n" + narrative_context)
+
+        if not context_parts:
+            return {}
+
+        context_addition = "\n\n".join(context_parts)
+
+        # Add instructions for using the context
+        if integration_method == "system_prompt":
+            context_addition += "\n\n## CONTEXT USAGE INSTRUCTIONS\n\n"
+            context_addition += "Use the above knowledge and context information to enhance your transcript generation. "
+            context_addition += "Incorporate relevant facts and maintain consistency with previous slides when appropriate."
+
+        return {
+            'context_addition': context_addition,
+            'integration_point': 'before_instructions' if integration_method == "system_prompt" else 'prefix'
+        }

+ 506 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/knowledge/faiss_knowledge.py

@@ -0,0 +1,506 @@
+"""
+FAISS-based Knowledge Manager for enhanced vector search and storage.
+Replaces the numpy-based approach with a production-ready vector database.
+"""
+
+import faiss
+import pickle
+import hashlib
+import logging
+from pathlib import Path
+from typing import List, Dict, Any, Optional, Tuple
+from dataclasses import dataclass, field
+import json
+from datetime import datetime
+
+try:
+    from sentence_transformers import SentenceTransformer
+    import numpy as np
+    SENTENCE_TRANSFORMERS_AVAILABLE = True
+except ImportError:
+    SENTENCE_TRANSFORMERS_AVAILABLE = False
+
+logger = logging.getLogger(__name__)
+
+@dataclass
+class KnowledgeChunk:
+    """Enhanced knowledge chunk with FAISS indexing support"""
+    content: str
+    file_path: str
+    section: Optional[str] = None
+    metadata: Optional[Dict[str, Any]] = None
+    chunk_id: int = 0
+    embedding_hash: Optional[str] = None
+    created_at: Optional[str] = None
+
+    def __post_init__(self):
+        if self.created_at is None:
+            self.created_at = datetime.now().isoformat()
+        if self.metadata is None:
+            self.metadata = {}
+
+class FAISSKnowledgeManager:
+    """
+    Production-ready knowledge manager using FAISS for vector search.
+
+    Features:
+    - Multiple index types (Flat, IVF, HNSW)
+    - Persistent caching with automatic invalidation
+    - Incremental document updates
+    - Memory-efficient storage
+    - GPU acceleration support
+    """
+
+    def __init__(
+        self,
+        knowledge_base_dir: str,
+        index_type: str = "flat",
+        embedding_model: str = "all-MiniLM-L6-v2",
+        use_gpu: bool = False
+    ):
+        self.knowledge_base_dir = Path(knowledge_base_dir)
+        self.chunks = []
+        self.index = None
+        self.model = None
+        self.index_type = index_type.lower()
+        self.embedding_model = embedding_model
+        self.use_gpu = use_gpu
+
+        # Cache and metadata
+        self.cache_dir = self.knowledge_base_dir / ".faiss_cache"
+        self.cache_dir.mkdir(exist_ok=True)
+        self.metadata_file = self.cache_dir / "metadata.json"
+
+        # Performance tracking
+        self.stats = {
+            'total_searches': 0,
+            'cache_hits': 0,
+            'index_builds': 0,
+            'last_updated': None
+        }
+
+    def initialize(self) -> None:
+        """Initialize the FAISS knowledge manager"""
+        if not SENTENCE_TRANSFORMERS_AVAILABLE:
+            raise ImportError("sentence_transformers and numpy are required for knowledge base functionality")
+
+        logger.info(f"Initializing FAISS Knowledge Manager with {self.index_type} index")
+
+        # Load embedding model
+        self.model = SentenceTransformer(self.embedding_model)
+        logger.info(f"Loaded embedding model: {self.embedding_model}")
+
+        # Try to load cached index
+        if self._should_rebuild_index():
+            logger.info("Building new FAISS index...")
+            self._build_index()
+            self._save_index()
+        else:
+            logger.info("Loading cached FAISS index...")
+            if not self._load_cached_index():
+                logger.warning("Failed to load cached index, rebuilding...")
+                self._build_index()
+                self._save_index()
+
+    def _should_rebuild_index(self) -> bool:
+        """Check if index needs to be rebuilt based on file changes"""
+        if not self.metadata_file.exists():
+            return True
+
+        try:
+            with open(self.metadata_file, 'r') as f:
+                metadata = json.load(f)
+
+            # Check if knowledge base files have changed
+            current_hash = self._get_knowledge_base_hash()
+            stored_hash = metadata.get('knowledge_base_hash')
+
+            return current_hash != stored_hash
+
+        except Exception as e:
+            logger.warning(f"Error reading metadata: {e}")
+            return True
+
+    def _get_knowledge_base_hash(self) -> str:
+        """Generate hash of all knowledge base files for change detection"""
+        md_files = sorted(self.knowledge_base_dir.rglob("*.md"))
+        hash_content = ""
+
+        for md_file in md_files:
+            try:
+                with open(md_file, 'r', encoding='utf-8') as f:
+                    content = f.read()
+                # md5 digest instead of built-in hash(), which is randomized per process
+                hash_content += f"{md_file.name}:{len(content)}:{hashlib.md5(content.encode()).hexdigest()}"
+            except Exception as e:
+                logger.warning(f"Error reading {md_file}: {e}")
+
+        return hashlib.md5(hash_content.encode()).hexdigest()
+
+    def _build_index(self) -> None:
+        """Build FAISS index from knowledge base"""
+        self._load_knowledge_base()
+
+        if not self.chunks:
+            logger.warning("No knowledge chunks found")
+            return
+
+        logger.info(f"Processing {len(self.chunks)} knowledge chunks...")
+
+        # Generate embeddings with progress tracking
+        texts = [chunk.content for chunk in self.chunks]
+        embeddings = self.model.encode(
+            texts,
+            show_progress_bar=True,
+            batch_size=32,
+            convert_to_numpy=True
+        )
+
+        # Normalize embeddings up front so inner-product search equals cosine similarity
+        # (the same vectors are used for training and for adding to the index)
+        embeddings = embeddings.astype('float32')
+        faiss.normalize_L2(embeddings)
+
+        # Create FAISS index based on type
+        dimension = embeddings.shape[1]
+
+        if self.index_type == "flat":
+            # Exact search - best for small to medium datasets
+            self.index = faiss.IndexFlatIP(dimension)
+
+        elif self.index_type == "ivf":
+            # Inverted File index - good balance of speed and accuracy
+            nlist = min(100, max(10, len(self.chunks) // 10))
+            quantizer = faiss.IndexFlatIP(dimension)
+            self.index = faiss.IndexIVFFlat(
+                quantizer, dimension, nlist, faiss.METRIC_INNER_PRODUCT
+            )
+
+            # Train the index on the normalized embeddings
+            logger.info(f"Training IVF index with {nlist} clusters...")
+            self.index.train(embeddings)
+
+        elif self.index_type == "hnsw":
+            # Hierarchical Navigable Small World - very fast approximate search
+            # Inner-product metric keeps scores comparable with the other index types
+            self.index = faiss.IndexHNSWFlat(dimension, 32, faiss.METRIC_INNER_PRODUCT)
+            self.index.hnsw.efConstruction = 200
+            self.index.hnsw.efSearch = 50
+
+        else:
+            raise ValueError(f"Unsupported index type: {self.index_type}")
+
+        # Add the normalized embeddings to the index
+        self.index.add(embeddings)
+
+        # Move to GPU if requested and available (after adding data)
+        if self.use_gpu and faiss.get_num_gpus() > 0:
+            logger.info("Moving index to GPU...")
+            gpu_res = faiss.StandardGpuResources()
+            self.index = faiss.index_cpu_to_gpu(gpu_res, 0, self.index)
+
+        # Update statistics
+        self.stats['index_builds'] += 1
+        self.stats['last_updated'] = datetime.now().isoformat()
+
+        logger.info(f"Built {self.index_type.upper()} index with {len(self.chunks)} chunks")
+
+    def _load_knowledge_base(self) -> None:
+        """Load and chunk markdown files from knowledge base"""
+        md_files = list(self.knowledge_base_dir.rglob("*.md"))
+        self.chunks = []
+        chunk_id = 0
+
+        logger.info(f"Loading {len(md_files)} markdown files...")
+
+        for md_file in md_files:
+            try:
+                with open(md_file, 'r', encoding='utf-8') as f:
+                    content = f.read()
+
+                # Enhanced chunking by sections with better parsing
+                chunks = self._chunk_content(content, str(md_file))
+
+                for chunk_content, section_title in chunks:
+                    if chunk_content.strip():
+                        chunk = KnowledgeChunk(
+                            content=chunk_content.strip(),
+                            file_path=str(md_file),
+                            section=section_title,
+                            chunk_id=chunk_id,
+                            metadata={
+                                'file_size': len(content),
+                                'chunk_size': len(chunk_content),
+                                'source_file': md_file.name
+                            }
+                        )
+                        self.chunks.append(chunk)
+                        chunk_id += 1
+
+            except Exception as e:
+                logger.error(f"Error processing {md_file}: {e}")
+
+        logger.info(f"Created {len(self.chunks)} knowledge chunks from {len(md_files)} files")
+
+    def _chunk_content(self, content: str, file_path: str) -> List[Tuple[str, Optional[str]]]:
+        """Enhanced content chunking with better section detection"""
+        chunks = []
+
+        # Split by main headers (# and ##)
+        sections = content.split('\n## ')
+
+        for i, section in enumerate(sections):
+            if not section.strip():
+                continue
+
+            if i == 0:
+                # First section might not have ## prefix
+                if section.startswith('# '):
+                    # Extract title and content
+                    lines = section.split('\n', 1)
+                    title = lines[0].replace('# ', '').strip()
+                    content_part = lines[1] if len(lines) > 1 else ""
+                    chunks.append((content_part, title))
+                else:
+                    chunks.append((section, None))
+            else:
+                # Subsequent sections
+                lines = section.split('\n', 1)
+                section_title = lines[0].strip()
+                section_content = lines[1] if len(lines) > 1 else ""
+
+                # Further split large sections by ### if needed
+                if len(section_content) > 2000:  # Large section threshold
+                    subsections = section_content.split('\n### ')
+                    for j, subsection in enumerate(subsections):
+                        if subsection.strip():
+                            if j == 0:
+                                chunks.append((subsection, section_title))
+                            else:
+                                sub_lines = subsection.split('\n', 1)
+                                sub_title = f"{section_title} - {sub_lines[0].strip()}"
+                                sub_content = sub_lines[1] if len(sub_lines) > 1 else ""
+                                chunks.append((sub_content, sub_title))
+                else:
+                    chunks.append((section_content, section_title))
+
+        return chunks
+
+    def search(
+        self,
+        query: str,
+        top_k: int = 5,
+        similarity_threshold: float = 0.3
+    ) -> List[KnowledgeChunk]:
+        """Search for relevant knowledge chunks using FAISS"""
+        if not self.chunks or self.index is None:
+            logger.warning("No index available for search")
+            return []
+
+        self.stats['total_searches'] += 1
+
+        try:
+            # Encode and normalize the query; normalize_L2 works in place, so keep a
+            # single float32 array instead of normalizing a throwaway copy
+            query_embedding = self.model.encode([query], convert_to_numpy=True).astype('float32')
+            faiss.normalize_L2(query_embedding)
+
+            # Search FAISS index
+            scores, indices = self.index.search(query_embedding, top_k)
+
+            # Filter by similarity threshold and return chunks
+            results = []
+            for score, idx in zip(scores[0], indices[0]):
+                if score >= similarity_threshold and 0 <= idx < len(self.chunks):
+                    chunk = self.chunks[idx]
+                    # Add search score to metadata
+                    chunk.metadata['search_score'] = float(score)
+                    results.append(chunk)
+
+            logger.debug(f"Search query: '{query}' returned {len(results)} results")
+            return results
+
+        except Exception as e:
+            logger.error(f"Search error: {e}")
+            return []
+
+    def add_document(self, file_path: str, content: str) -> bool:
+        """Add new document to existing index"""
+        try:
+            # Process new content into chunks
+            new_chunks = []
+            chunks = self._chunk_content(content, file_path)
+
+            start_id = len(self.chunks)
+            for i, (chunk_content, section_title) in enumerate(chunks):
+                if chunk_content.strip():
+                    chunk = KnowledgeChunk(
+                        content=chunk_content.strip(),
+                        file_path=file_path,
+                        section=section_title,
+                        chunk_id=start_id + i,
+                        metadata={
+                            'file_size': len(content),
+                            'chunk_size': len(chunk_content),
+                            'source_file': Path(file_path).name
+                        }
+                    )
+                    new_chunks.append(chunk)
+
+            if not new_chunks:
+                logger.warning(f"No chunks created from {file_path}")
+                return False
+
+            # Generate embeddings for new chunks
+            texts = [chunk.content for chunk in new_chunks]
+            embeddings = self.model.encode(texts, convert_to_numpy=True)
+
+            # Normalize embeddings before adding to index
+            embeddings_normalized = embeddings.astype('float32').copy()
+            faiss.normalize_L2(embeddings_normalized)
+
+            # Add to index
+            self.index.add(embeddings_normalized)
+
+            # Add to chunks list
+            self.chunks.extend(new_chunks)
+
+            # Save updated index
+            self._save_index()
+
+            logger.info(f"Added {len(new_chunks)} chunks from {file_path}")
+            return True
+
+        except Exception as e:
+            logger.error(f"Error adding document {file_path}: {e}")
+            return False
+
+    def _save_index(self) -> None:
+        """Save FAISS index and chunks to cache"""
+        if self.index is None:
+            return
+
+        try:
+            index_path = self.cache_dir / "faiss.index"
+            chunks_path = self.cache_dir / "chunks.pkl"
+
+            # Save FAISS index (move to CPU first if on GPU)
+            index_to_save = self.index
+            if self.use_gpu and faiss.get_num_gpus() > 0:
+                index_to_save = faiss.index_gpu_to_cpu(self.index)
+
+            faiss.write_index(index_to_save, str(index_path))
+
+            # Save chunks
+            with open(chunks_path, 'wb') as f:
+                pickle.dump(self.chunks, f)
+
+            # Save metadata
+            metadata = {
+                'knowledge_base_hash': self._get_knowledge_base_hash(),
+                'index_type': self.index_type,
+                'embedding_model': self.embedding_model,
+                'total_chunks': len(self.chunks),
+                'created_at': datetime.now().isoformat(),
+                'stats': self.stats
+            }
+
+            with open(self.metadata_file, 'w') as f:
+                json.dump(metadata, f, indent=2)
+
+            logger.info(f"Saved FAISS index with {len(self.chunks)} chunks")
+
+        except Exception as e:
+            logger.error(f"Error saving index: {e}")
+
+    def _load_cached_index(self) -> bool:
+        """Load cached FAISS index and chunks"""
+        index_path = self.cache_dir / "faiss.index"
+        chunks_path = self.cache_dir / "chunks.pkl"
+
+        if not (index_path.exists() and chunks_path.exists()):
+            return False
+
+        try:
+            # Load FAISS index
+            self.index = faiss.read_index(str(index_path))
+
+            # Move to GPU if requested
+            if self.use_gpu and faiss.get_num_gpus() > 0:
+                logger.info("Moving cached index to GPU...")
+                self.index = faiss.index_cpu_to_gpu(faiss.StandardGpuResources(), 0, self.index)
+
+            # Load chunks
+            with open(chunks_path, 'rb') as f:
+                self.chunks = pickle.load(f)
+
+            # Load metadata and stats
+            if self.metadata_file.exists():
+                with open(self.metadata_file, 'r') as f:
+                    metadata = json.load(f)
+                    self.stats.update(metadata.get('stats', {}))
+
+            logger.info(f"Loaded cached index with {len(self.chunks)} chunks")
+            return True
+
+        except Exception as e:
+            logger.error(f"Failed to load cached index: {e}")
+            return False
+
+    def rebuild_index(self) -> None:
+        """Force rebuild of the index"""
+        logger.info("Force rebuilding FAISS index...")
+        self._build_index()
+        self._save_index()
+
+    def clear_cache(self) -> None:
+        """Clear all cached data"""
+        try:
+            import shutil
+            if self.cache_dir.exists():
+                shutil.rmtree(self.cache_dir)
+                self.cache_dir.mkdir(exist_ok=True)
+            logger.info("Cleared FAISS cache")
+        except Exception as e:
+            logger.error(f"Error clearing cache: {e}")
+
+    def get_stats(self) -> Dict[str, Any]:
+        """Get comprehensive knowledge base statistics"""
+        stats = {
+            'total_chunks': len(self.chunks),
+            'index_type': self.index_type,
+            'embedding_model': self.embedding_model,
+            'model_loaded': self.model is not None,
+            'index_loaded': self.index is not None,
+            'use_gpu': self.use_gpu,
+            'cache_dir': str(self.cache_dir),
+            'knowledge_base_dir': str(self.knowledge_base_dir),
+        }
+
+        # Add FAISS-specific stats
+        if self.index:
+            stats.update({
+                'index_size': self.index.ntotal,
+                'dimension': self.index.d,
+                'is_trained': getattr(self.index, 'is_trained', True)
+            })
+
+        # Add performance stats
+        stats.update(self.stats)
+
+        # Memory usage estimation
+        if self.chunks:
+            total_content_size = sum(len(chunk.content) for chunk in self.chunks)
+            stats['content_size_mb'] = total_content_size / (1024 * 1024)
+            stats['avg_chunk_size'] = total_content_size / len(self.chunks)
+
+        return stats
+
+    def get_chunk_by_id(self, chunk_id: int) -> Optional[KnowledgeChunk]:
+        """Get specific chunk by ID"""
+        for chunk in self.chunks:
+            if chunk.chunk_id == chunk_id:
+                return chunk
+        return None
+
+    def search_by_file(self, file_path: str) -> List[KnowledgeChunk]:
+        """Get all chunks from a specific file"""
+        return [chunk for chunk in self.chunks if chunk.file_path == file_path]
+
+
+# Backward compatibility alias
+MarkdownKnowledgeManager = FAISSKnowledgeManager

+ 213 - 13
end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/processors/unified_transcript_generator.py

@@ -1,14 +1,18 @@
 """Unified transcript generation processor with optional narrative continuity."""
 
 import json
+import logging
 from pathlib import Path
 from typing import Any, Dict, List, Optional, Union
 
 import pandas as pd
 from tqdm import tqdm
 
-from ..config.settings import get_processing_config, get_system_prompt
-from ..core.llama_client import LlamaClient
+from ..config.settings import get_processing_config, get_system_prompt, is_knowledge_enabled
+from ..core.groq_client import GroqClient
+
+
+logger = logging.getLogger(__name__)
 
 
 class SlideContext:
@@ -35,6 +39,8 @@ class UnifiedTranscriptProcessor:
         api_key: Optional[str] = None,
         use_narrative: bool = True,
         context_window_size: int = 5,
+        knowledge_base_dir: Optional[str] = None,
+        enable_knowledge: Optional[bool] = None,
     ):
         """
         Initialize unified transcript processor.
@@ -43,13 +49,170 @@ class UnifiedTranscriptProcessor:
-            api_key: Llama API key. If None, will be loaded from config/environment.
+            api_key: Groq API key. If None, will be loaded from config/environment.
             use_narrative: Whether to use narrative continuity (default: True)
             context_window_size: Number of previous slides to include in context when use_narrative=True (default: 5)
+            knowledge_base_dir: Path to knowledge base directory (optional)
+            enable_knowledge: Override knowledge base enable setting (optional)
         """
-        self.client = LlamaClient(api_key=api_key)
+        self.client = GroqClient(api_key=api_key)
         self.processing_config = get_processing_config()
         self.use_narrative = use_narrative
         self.context_window_size = context_window_size
         self.slide_contexts: List[SlideContext] = []
 
+        # Knowledge base integration
+        self.knowledge_manager = None
+        self.context_manager = None
+        self.enable_knowledge = enable_knowledge if enable_knowledge is not None else is_knowledge_enabled()
+
+        if self.enable_knowledge:
+            self._initialize_knowledge_components(knowledge_base_dir)
+
+    def _initialize_knowledge_components(self, knowledge_base_dir: Optional[str] = None) -> None:
+        """Initialize knowledge base components with error handling."""
+        try:
+            from ..knowledge.faiss_knowledge import FAISSKnowledgeManager
+            from ..knowledge.context_manager import ContextManager
+            from ..config.settings import load_config
+
+            # Get configuration
+            config = load_config()
+            knowledge_config = config.get('knowledge', {})
+
+            # Get knowledge base directory from config if not provided
+            if knowledge_base_dir is None:
+                knowledge_base_dir = knowledge_config.get('knowledge_base_dir', 'knowledge_base')
+
+            # Ensure we have a valid path
+            if not knowledge_base_dir:
+                raise ValueError("Knowledge base directory not specified")
+
+            # Get FAISS configuration
+            vector_config = knowledge_config.get('vector_store', {})
+            embedding_config = knowledge_config.get('embedding', {})
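+
+            # Expected shape of the loaded config (a sketch inferred from the keys
+            # read here; config.yaml in this project holds the authoritative values):
+            #   {'knowledge': {'knowledge_base_dir': 'knowledge_base',
+            #                  'vector_store': {'index_type': 'flat', 'use_gpu': False},
+            #                  'embedding': {'model_name': 'all-MiniLM-L6-v2'}}}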
+
+            # Initialize FAISS knowledge manager with configuration
+            self.knowledge_manager = FAISSKnowledgeManager(
+                knowledge_base_dir=knowledge_base_dir,
+                index_type=vector_config.get('index_type', 'flat'),
+                embedding_model=embedding_config.get('model_name', 'all-MiniLM-L6-v2'),
+                use_gpu=vector_config.get('use_gpu', False)
+            )
+            self.knowledge_manager.initialize()
+
+            # Initialize context manager
+            self.context_manager = ContextManager()
+
+            logger.info("FAISS knowledge base components initialized successfully")
+
+        except Exception as e:
+            logger.warning(f"Failed to initialize knowledge base: {e}")
+            # Graceful degradation - disable knowledge features
+            self.enable_knowledge = False
+            self.knowledge_manager = None
+            self.context_manager = None
+
+    def _retrieve_knowledge_chunks(self, slide_content: str, speaker_notes: str) -> tuple[List[Any], Dict[str, Any]]:
+        """
+        Retrieve relevant knowledge chunks for the current slide.
+
+        Args:
+            slide_content: Content extracted from slide image (if available)
+            speaker_notes: Speaker notes for the slide
+
+        Returns:
+            Tuple of (knowledge chunks, knowledge metadata)
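+
+        Example metadata (illustrative values only):
+
+            {'search_query': 'All About Llamas ...', 'chunks_found': 2,
+             'knowledge_sources': ['llamas.md'], 'knowledge_sections': ['Diet'],
+             'search_scores': [0.71, 0.54]}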
+        """
+        if not self.enable_knowledge or not self.knowledge_manager:
+            return [], {}
+
+        try:
+            # Combine slide content and speaker notes for search query
+            search_query = f"{slide_content} {speaker_notes}".strip()
+
+            if not search_query:
+                return [], {}
+
+            # Search for relevant chunks
+            chunks = self.knowledge_manager.search(search_query)
+
+            # Create metadata about knowledge usage
+            knowledge_metadata = {
+                'search_query': search_query[:200] + '...' if len(search_query) > 200 else search_query,
+                'chunks_found': len(chunks),
+                'knowledge_sources': [],
+                'knowledge_sections': [],
+                'search_scores': []
+            }
+
+            # Extract metadata from chunks
+            for chunk in chunks:
+                if hasattr(chunk, 'file_path'):
+                    source_file = Path(chunk.file_path).name
+                    if source_file not in knowledge_metadata['knowledge_sources']:
+                        knowledge_metadata['knowledge_sources'].append(source_file)
+
+                if hasattr(chunk, 'section') and chunk.section:
+                    if chunk.section not in knowledge_metadata['knowledge_sections']:
+                        knowledge_metadata['knowledge_sections'].append(chunk.section)
+
+                if hasattr(chunk, 'metadata') and chunk.metadata and 'search_score' in chunk.metadata:
+                    knowledge_metadata['search_scores'].append(round(chunk.metadata['search_score'], 3))
+
+            logger.debug(f"Retrieved {len(chunks)} knowledge chunks for query: {search_query[:100]}...")
+
+            return chunks, knowledge_metadata
+
+        except Exception as e:
+            logger.warning(f"Failed to retrieve knowledge chunks: {e}")
+            return [], {}
+
+    def _build_context_bundle(
+        self,
+        knowledge_chunks: List[Any],
+        slide_number: Optional[int] = None
+    ) -> Optional[Any]:
+        """
+        Build context bundle combining knowledge and narrative contexts.
+
+        Args:
+            knowledge_chunks: Retrieved knowledge chunks
+            slide_number: Current slide number for narrative context
+
+        Returns:
+            ContextBundle or None if no context available
+        """
+        if not self.enable_knowledge or not self.context_manager:
+            return None
+
+        try:
+            # Get narrative context if using narrative mode
+            narrative_context = None
+            previous_slides = None
+
+            if self.use_narrative and slide_number is not None and self.slide_contexts:
+                # Build narrative context from previous slides
+                recent_contexts = self.slide_contexts[-self.context_window_size:]
+                narrative_parts = []
+
+                for context in recent_contexts:
+                    narrative_parts.append(
+                        f"Slide {context.slide_number} - {context.title}: {context.transcript[:200]}..."
+                    )
+
+                narrative_context = "\n".join(narrative_parts)
+                previous_slides = [ctx.to_dict() for ctx in recent_contexts]
+
+            # Create context bundle
+            context_bundle = self.context_manager.create_context_bundle(
+                knowledge_chunks=knowledge_chunks,
+                narrative_context=narrative_context,
+                previous_slides=previous_slides
+            )
+
+            return context_bundle
+
+        except Exception as e:
+            logger.warning(f"Failed to build context bundle: {e}")
+            return None
+
     def _build_context_prompt(
         self, current_slide_number: int, slide_contexts: List[SlideContext]
     ) -> str:
@@ -112,9 +275,9 @@ When generating the transcript for this slide, ensure:
         system_prompt: Optional[str] = None,
         slide_number: Optional[int] = None,
         slide_title: str = "",
-    ) -> str:
+    ) -> tuple[str, Dict[str, Any]]:
         """
-        Process a single slide to generate transcript.
+        Process a single slide to generate transcript with optional knowledge integration.
 
         Args:
             image_path: Path to the slide image
@@ -124,19 +287,33 @@ When generating the transcript for this slide, ensure:
             slide_title: Title of the current slide (used for narrative continuity)
 
         Returns:
-            Generated transcript text
+            Tuple of (generated transcript text, knowledge metadata)
         """
+        # Retrieve knowledge chunks if knowledge base is enabled
+        knowledge_chunks = []
+        knowledge_metadata = {}
+        if self.enable_knowledge:
+            # Use slide title and speaker notes as search context
+            slide_content = slide_title  # Could be enhanced with OCR in the future
+            knowledge_chunks, knowledge_metadata = self._retrieve_knowledge_chunks(slide_content, speaker_notes)
+
+        # Build context bundle
+        context_bundle = None
+        if knowledge_chunks or (self.use_narrative and slide_number is not None):
+            context_bundle = self._build_context_bundle(knowledge_chunks, slide_number)
+
         if self.use_narrative and slide_number is not None:
-            # Use narrative-aware processing
+            # Use narrative-aware processing with optional knowledge integration
             enhanced_prompt = self._build_context_prompt(
                 slide_number, self.slide_contexts
             )
 
-            # Generate transcript with context
+            # Generate transcript with context bundle
             transcript = self.client.generate_transcript(
                 image_path=str(image_path),
                 speaker_notes=speaker_notes,
                 system_prompt=enhanced_prompt,
+                context_bundle=context_bundle,
                 stream=False,
             )
 
@@ -148,15 +325,17 @@ When generating the transcript for this slide, ensure:
             )
             self.slide_contexts.append(slide_context)
 
-            return transcript
+            return transcript, knowledge_metadata
         else:
-            # Use standard processing
-            return self.client.generate_transcript(
+            # Use standard processing with optional knowledge integration
+            transcript = self.client.generate_transcript(
                 image_path=str(image_path),
                 speaker_notes=speaker_notes,
                 system_prompt=system_prompt,
+                context_bundle=context_bundle,
                 stream=False,
             )
+            return transcript, knowledge_metadata
 
     def process_slides_dataframe(
         self,
@@ -200,9 +379,12 @@ When generating the transcript for this slide, ensure:
 
             image_path = output_dir / slide_filename
 
+            # Initialize knowledge metadata for this slide
+            knowledge_metadata = {}
+
             # Generate transcript
             if self.use_narrative:
-                transcript = self.process_single_slide(
+                transcript, knowledge_metadata = self.process_single_slide(
                     image_path=image_path,
                     speaker_notes=speaker_notes,
                     system_prompt=system_prompt,
@@ -214,7 +396,7 @@ When generating the transcript for this slide, ensure:
                     len(self.slide_contexts) - 1, self.context_window_size
                 )
             else:
-                transcript = self.process_single_slide(
+                transcript, knowledge_metadata = self.process_single_slide(
                     image_path=image_path,
                     speaker_notes=speaker_notes,
                     system_prompt=system_prompt,
@@ -223,6 +405,24 @@ When generating the transcript for this slide, ensure:
             # Add transcript to dataframe
             df_copy.loc[i, "ai_transcript"] = transcript
 
+            # Add knowledge metadata if available
+            if knowledge_metadata:
+                df_copy.loc[i, "knowledge_chunks_used"] = knowledge_metadata.get('chunks_found', 0)
+                df_copy.loc[i, "knowledge_sources"] = ', '.join(knowledge_metadata.get('knowledge_sources', []))
+                df_copy.loc[i, "knowledge_sections"] = ', '.join(knowledge_metadata.get('knowledge_sections', []))
+                df_copy.loc[i, "knowledge_search_query"] = knowledge_metadata.get('search_query', '')
+                if knowledge_metadata.get('search_scores'):
+                    df_copy.loc[i, "avg_knowledge_score"] = sum(knowledge_metadata['search_scores']) / len(knowledge_metadata['search_scores'])
+                else:
+                    df_copy.loc[i, "avg_knowledge_score"] = 0.0
+            else:
+                # Initialize knowledge metadata columns with default values
+                df_copy.loc[i, "knowledge_chunks_used"] = 0
+                df_copy.loc[i, "knowledge_sources"] = ""
+                df_copy.loc[i, "knowledge_sections"] = ""
+                df_copy.loc[i, "knowledge_search_query"] = ""
+                df_copy.loc[i, "avg_knowledge_score"] = 0.0
+
         # Save context information if requested and using narrative mode
         if save_context and self.use_narrative:
             self._save_context_information(output_dir)

+ 194 - 0
end-to-end-use-cases/powerpoint-to-voiceover-transcript/src/utils/transcript_display.py

@@ -0,0 +1,194 @@
+"""
+Utility functions for displaying transcripts with knowledge enhancement details.
+"""
+
+import pandas as pd
+from pathlib import Path
+
+
+def display_enhanced_transcripts(
+    processed_df: pd.DataFrame,
+    knowledge_manager=None,
+    num_slides: int = 5,
+    show_knowledge_details: bool = True,
+    show_search_scores: bool = True
+) -> None:
+    """
+    Display transcripts with knowledge enhancement details.
+
+    Args:
+        processed_df: DataFrame with processed transcripts
+        knowledge_manager: FAISS knowledge manager instance
+        num_slides: Number of slides to display
+        show_knowledge_details: Whether to show knowledge chunk details
+        show_search_scores: Whether to show similarity scores
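+
+    Example (illustrative):
+
+        display_enhanced_transcripts(processed_df, knowledge_manager, num_slides=3)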
+    """
+
+    print(f'Displaying first {num_slides} slide transcripts with knowledge enhancement\n')
+    print('=' * 100)
+
+    for idx, row in processed_df.head(num_slides).iterrows():
+        print(f'\nSLIDE {row["slide_number"]} - {row["slide_title"]}')
+        print('=' * 80)
+
+        # Display transcript
+        print('\nTRANSCRIPT:')
+        print(f'"{row["ai_transcript"]}"')
+
+        # Display knowledge enhancement details if available
+        if show_knowledge_details and knowledge_manager:
+            _display_knowledge_details(row, knowledge_manager, show_search_scores)
+
+        # Display basic knowledge stats from DataFrame if available
+        elif 'knowledge_chunks_used' in row and pd.notna(row['knowledge_chunks_used']):
+            _display_basic_knowledge_stats(row)
+
+        print('\n' + '-' * 80)
+
+
+def _display_knowledge_details(
+    row: pd.Series,
+    knowledge_manager,
+    show_search_scores: bool = True
+) -> None:
+    """Display detailed knowledge chunk information."""
+
+    # Try to reconstruct the search that was performed
+    search_query = ""
+    if 'knowledge_search_query' in row and pd.notna(row['knowledge_search_query']):
+        search_query = row['knowledge_search_query']
+    else:
+        # Fallback: use slide title and notes
+        search_query = f"{row.get('slide_title', '')} {row.get('speaker_notes', '')}".strip()
+
+    if not search_query:
+        print('\nKNOWLEDGE ENHANCEMENT: No search query available')
+        return
+
+    try:
+        # Perform the same search to get chunk details
+        chunks = knowledge_manager.search(search_query, top_k=5)
+
+        if not chunks:
+            print('\nKNOWLEDGE ENHANCEMENT: No relevant knowledge found')
+            return
+
+        print(f'\nKNOWLEDGE ENHANCEMENT:')
+        print(f'   Search Query: "{search_query[:100]}{"..." if len(search_query) > 100 else ""}"')
+        print(f'   Chunks Found: {len(chunks)}')
+
+        print('\nKNOWLEDGE CHUNKS USED:')
+
+        for i, chunk in enumerate(chunks, 1):
+            print(f'\n   Chunk {i}:')
+            print(f'      Source: {Path(chunk.file_path).name}')
+
+            if hasattr(chunk, 'section') and chunk.section:
+                print(f'      Section: {chunk.section}')
+
+            if show_search_scores and hasattr(chunk, 'metadata') and chunk.metadata:
+                score = chunk.metadata.get('search_score', 'N/A')
+                if score != 'N/A':
+                    print(f'      Similarity Score: {score:.3f}')
+
+            # Show content preview
+            content_preview = chunk.content[:200] + '...' if len(chunk.content) > 200 else chunk.content
+            print(f'      Content: "{content_preview}"')
+
+    except Exception as e:
+        print(f'\nKNOWLEDGE ENHANCEMENT: Error retrieving details - {e}')
+
+
+def _display_basic_knowledge_stats(row: pd.Series) -> None:
+    """Display basic knowledge statistics from DataFrame."""
+
+    print(f'\nKNOWLEDGE ENHANCEMENT:')
+
+    if 'knowledge_chunks_used' in row and pd.notna(row['knowledge_chunks_used']):
+        print(f'   Chunks Used: {int(row["knowledge_chunks_used"])}')
+
+    if 'knowledge_sources' in row and pd.notna(row['knowledge_sources']) and row['knowledge_sources']:
+        sources = row['knowledge_sources'].split(', ') if isinstance(row['knowledge_sources'], str) else []
+        print(f'   Sources: {", ".join(sources)}')
+
+    if 'knowledge_sections' in row and pd.notna(row['knowledge_sections']) and row['knowledge_sections']:
+        sections = row['knowledge_sections'].split(', ') if isinstance(row['knowledge_sections'], str) else []
+        print(f'   Sections: {", ".join(sections)}')
+
+    if 'avg_knowledge_score' in row and pd.notna(row['avg_knowledge_score']):
+        print(f'   Avg Similarity Score: {row["avg_knowledge_score"]:.3f}')
+
+
+def display_knowledge_base_summary(knowledge_manager) -> None:
+    """Display summary of knowledge base contents."""
+
+    if not knowledge_manager:
+        print("No knowledge manager available")
+        return
+
+    try:
+        stats = knowledge_manager.get_stats()
+
+        print('\nKNOWLEDGE BASE SUMMARY')
+        print('=' * 50)
+        print(f'Total Chunks: {stats.get("total_chunks", 0)}')
+        print(f'Index Type: {stats.get("index_type", "unknown").upper()}')
+        print(f'Embedding Model: {stats.get("embedding_model", "unknown")}')
+        print(f'Content Size: {stats.get("content_size_mb", 0):.1f} MB')
+        print(f'Avg Chunk Size: {stats.get("avg_chunk_size", 0):.0f} characters')
+
+        # Show knowledge sources
+        if hasattr(knowledge_manager, 'chunks') and knowledge_manager.chunks:
+            sources = set()
+            sections = set()
+
+            for chunk in knowledge_manager.chunks:
+                if hasattr(chunk, 'file_path'):
+                    sources.add(Path(chunk.file_path).name)
+                if hasattr(chunk, 'section') and chunk.section:
+                    sections.add(chunk.section)
+
+            print(f'\nKnowledge Sources ({len(sources)}):')
+            for source in sorted(sources):
+                print(f'  • {source}')
+
+            if sections:
+                print(f'\nKnowledge Sections ({len(sections)}):')
+                for section in sorted(list(sections)[:10]):  # Show first 10
+                    print(f'  • {section}')
+                if len(sections) > 10:
+                    print(f'  ... and {len(sections) - 10} more')
+
+        print('=' * 50)
+
+    except Exception as e:
+        print(f"Error displaying knowledge base summary: {e}")
+
+
+# Convenience function for notebook use
+def show_transcripts_with_knowledge(
+    processed_df: pd.DataFrame,
+    knowledge_manager=None,
+    num_slides: int = 5
+) -> None:
+    """
+    Convenience function for displaying transcripts with knowledge details in notebooks.
+
+    Usage in notebook:
+        from src.utils.transcript_display import show_transcripts_with_knowledge
+        show_transcripts_with_knowledge(processed_df, knowledge_manager, num_slides=3)
+    """
+
+    # Display knowledge base summary first
+    if knowledge_manager:
+        display_knowledge_base_summary(knowledge_manager)
+
+    # Display enhanced transcripts
+    display_enhanced_transcripts(
+        processed_df,
+        knowledge_manager,
+        num_slides=num_slides,
+        show_knowledge_details=True,
+        show_search_scores=True
+    )

File diff too large to display
+ 2165 - 136
end-to-end-use-cases/powerpoint-to-voiceover-transcript/uv.lock