# Generating documentation for an entire codebase

*Copyright (c) Meta Platforms, Inc. and affiliates.
This software may be used and distributed according to the terms of the Llama Community License Agreement.*

<a href="https://colab.research.google.com/github/meta-llama/llama-cookbook/blob/main/end-to-end-use-cases/generating-codebase-docs/generating-codebase-docs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This tutorial shows you how to build an automated documentation generator for source code repositories. Using Llama 4 Scout, you'll create a "Repo2Docs" system that analyzes an entire codebase and produces a comprehensive README with architectural diagrams and component summaries.

Traditional documentation tools rely on manual annotation or simple extraction. This approach instead uses Llama 4's large context window and code understanding capabilities to generate contextual documentation that explains not just what the code does, but how components work together.

## What you will learn

- **Build a multi-stage AI pipeline** that performs progressive analysis, from individual files to the complete architecture.
- **Leverage Llama 4 Scout's large context window** to analyze entire source files and repositories without complex chunking strategies.
- **Use the Meta Llama API** to access Llama 4 models.
- **Generate production-ready documentation**, including Mermaid diagrams that visualize your repository's architecture.

| Component | Choice | Why |
|:----------|:-------|:----|
| **Model** | Llama 4 Scout | Large context window (up to 10M tokens) and Mixture-of-Experts (MoE) architecture for efficient, high-quality analysis. |
| **Infrastructure** | Meta Llama API | Provides serverless, production-ready access to Llama 4 models using the `llama_api_client` SDK. |
| **Architecture** | Progressive Pipeline | Deconstructs the complex task of repository analysis into manageable, sequential stages for scalability and efficiency. |

---

**Note on Inference Providers:** This tutorial uses the Llama API for demonstration purposes. However, you can run Llama 4 models with any preferred inference provider. Common examples include [Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-meta.html) and [Together AI](https://together.ai/llama). The core logic of this tutorial can be adapted to any of these providers.

## Problem: Documentation debt

Documentation debt is a persistent challenge in software development. As codebases evolve, manual documentation efforts often fall behind, leading to outdated, inconsistent, or missing information. This slows down developer onboarding and makes maintenance more difficult.

## Solution: An automated documentation pipeline

This tutorial's solution is a multi-stage pipeline that systematically analyzes a repository to produce a comprehensive `README.md` file. The system works by progressively analyzing your repository in multiple stages:

```mermaid
flowchart LR
    A[GitHub Repo] --> B[Step 1: File Analysis]
    B --> C[Step 2: <br> Repository Overview]
    C --> D[Step 3: <br> Architecture Analysis]
    D --> E[Step 4: Final README]
```

By breaking down the complex task of repository analysis into manageable stages, you can process large repositories efficiently. Llama 4 Scout's context window is large enough to analyze entire source files without complex chunking strategies, so the resulting documentation captures both fine-grained details and architectural patterns.

## Prerequisites

Before you begin, ensure you have a Llama API key. If you do not have a Llama API key, please get one from [Meta Llama API](https://llama.developer.meta.com/).

Remember, we use the Llama API for this tutorial, but you can adapt this section to use your preferred inference provider.

## Install dependencies

You will need a few libraries for this project: `tiktoken` for estimating token counts, `tqdm` for progress bars, and the official `llama-api-client`.

In [1]:
# Install dependencies
!pip install --quiet tiktoken llama-api-client tqdm

## Imports & Llama API client setup

Import the necessary modules and initialize the `LlamaAPIClient`. This requires a Llama API key to be available as an environment variable.

In [2]:
import os, sys, re
import tempfile
import textwrap
import urllib.request
import zipfile
from pathlib import Path
from typing import Dict, List, Tuple
from urllib.parse import urlparse
import json
import pprint
from tqdm import tqdm
import tiktoken
from llama_api_client import LlamaAPIClient

# --- Llama client ---
API_KEY = os.getenv("LLAMA_API_KEY")
if not API_KEY:
    sys.exit("❌  Please set the LLAMA_API_KEY environment variable.")

client = LlamaAPIClient(api_key=API_KEY)

### Model Selection

For this tutorial, you'll use **Llama 4 Scout**. Its large context window is well-suited for ingesting and analyzing entire source code files, which is a key requirement for this use case. While Llama 4 Scout supports up to 10M tokens, the Llama API currently supports 128k tokens.

In [3]:
# --- Constants & Configuration ---
LLM_MODEL = "Llama-4-Scout-17B-16E-Instruct-FP8"
CTX_WINDOW = 128000  # Context window for Llama API
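
Because this tutorial can be adapted to other inference providers, one option is to route every model call through a single small function. The sketch below is an assumption, not part of the original notebook: `make_chat_fn` and `ChatFn` are hypothetical names, and the wrapper simply reuses the `client.chat.completions.create(...)` call and the `completion_message.content.text` access that appear later in this tutorial. A fake client stands in for the real one so the example runs offline.

```python
from types import SimpleNamespace
from typing import Callable

# Hypothetical signature: (system_prompt, user_prompt, max_tokens) -> reply text
ChatFn = Callable[[str, str, int], str]


def make_chat_fn(client, model: str, temperature: float = 0.1) -> ChatFn:
    """Wraps a Llama API style client behind a provider-neutral callable."""

    def chat(system_prompt: str, user_prompt: str, max_tokens: int) -> str:
        resp = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ],
            temperature=temperature,
            max_tokens=max_tokens,
        )
        # Response shape as used elsewhere in this tutorial.
        return resp.completion_message.content.text

    return chat


class _FakeCompletions:
    """Stand-in client used only to exercise the wrapper offline."""

    def create(self, **kwargs):
        last_message = kwargs["messages"][-1]["content"]
        text = f"echo: {last_message}"
        return SimpleNamespace(
            completion_message=SimpleNamespace(content=SimpleNamespace(text=text))
        )


fake_client = SimpleNamespace(chat=SimpleNamespace(completions=_FakeCompletions()))
chat = make_chat_fn(fake_client, model="any-model-name")
reply = chat("You are helpful.", "Summarize this file.", 50)
print(reply)
```

With this shape, switching providers would mean writing one more function with the same `ChatFn` signature rather than touching each pipeline stage.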

## Step 1: Download the repository

First, you'll download the target repository. This tutorial analyzes the official [Meta Llama repository](https://github.com/facebookresearch/llama), but you can adapt it to any public GitHub repository.

The code downloads the repository as a ZIP archive (faster than `git clone`, and it avoids `.git` metadata) and extracts it to a temporary directory for isolated processing.

In [None]:
REPO_URL = "https://github.com/facebookresearch/llama"
BRANCH_NAME = "main" # The default branch to download

In [None]:
base_url = REPO_URL.rstrip("/").removesuffix(".git")
repo_zip_url = f"{base_url}/archive/refs/heads/{BRANCH_NAME}.zip"

# Create a temporary directory to work in
tmpdir_obj = tempfile.TemporaryDirectory()
tmpdir = Path(tmpdir_obj.name)

# Download the repository ZIP file
zip_path = tmpdir / "repo.zip"
print(f"📥 Downloading repository from {repo_zip_url}...")
urllib.request.urlretrieve(repo_zip_url, zip_path)

# Extract the archive
print("📦 Extracting files...")
with zipfile.ZipFile(zip_path, 'r') as zf:
    zf.extractall(tmpdir)
extracted_root = next(p for p in tmpdir.iterdir() if p.is_dir())
print(f"✅ Extracted to: {extracted_root}")

📥 Downloading repository from https://github.com/facebookresearch/llama/archive/refs/heads/main.zip...
📦 Extracting files...
✅ Extracted to: /var/folders/sz/kf8w7j1x1v790jxs8k2gl72c0000gn/T/tmptwo_kdt5/llama-main
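
The URL construction above can be factored into a small helper, which also makes it easier to point the pipeline at a tag instead of a branch. This is a minimal sketch: `build_archive_url` is a hypothetical name, the tag used in the demo is a placeholder rather than a real tag of this repository, and the `refs/tags/` form is assumed to follow the same GitHub archive URL pattern as `refs/heads/`.

```python
def build_archive_url(repo_url: str, ref: str, ref_type: str = "heads") -> str:
    """Builds a GitHub ZIP archive URL for a branch ("heads") or tag ("tags")."""
    if ref_type not in {"heads", "tags"}:
        raise ValueError("ref_type must be 'heads' or 'tags'")
    # Same normalization as the notebook cell: drop a trailing slash and ".git".
    base_url = repo_url.rstrip("/").removesuffix(".git")
    return f"{base_url}/archive/refs/{ref_type}/{ref}.zip"


branch_url = build_archive_url("https://github.com/facebookresearch/llama.git", "main")
# "example-tag" is a placeholder, not a real tag of this repository.
tag_url = build_archive_url(
    "https://github.com/facebookresearch/llama/", "example-tag", ref_type="tags"
)
print(branch_url)
print(tag_url)
```

Note that some repositories use a default branch other than `main`, so `BRANCH_NAME` may need adjusting before the download succeeds.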


## Step 2: Analyze individual files

In this step, you'll generate a concise summary for each relevant file in the repository. This is the first step in the progressive analysis pipeline.

**File selection strategy**: To ensure the analysis is both comprehensive and efficient, you'll selectively process files based on their extension and name (`should_include_file`). This avoids summarizing binary files, build artifacts, or other content that is not relevant to documentation.

The list below provides a general-purpose starting point, but you should customize it for your target repository. For a large project, consider what file types contain the most meaningful source code and configuration, and start with those.

In [6]:
# Allowlist of file extensions to summarize
INCLUDE_EXTENSIONS = {
    ".py", # Python
    ".js", ".jsx", ".ts", ".tsx", # JS/Typescript
    ".md", ".txt", # Text
    ".json", ".yaml", ".yml", ".toml", # Config
    ".sh", ".css", ".html",
}
INCLUDE_FILENAMES = {"Dockerfile", "Makefile"} # Common files without extension

def should_include_file(file_path: Path, extracted_root: Path) -> bool:
    """Checks if a file should be included for documentation based on its path and type."""
    
    if not file_path.is_file(): # Must be a file.
        return False

    rel_path = file_path.relative_to(extracted_root)
    if any(part.startswith('.') for part in rel_path.parts): # Exclude hidden files/folders.
        return False

    if ( # Must be in our allow-list of extensions or filenames.
        file_path.suffix.lower() in INCLUDE_EXTENSIONS
        or file_path.name in INCLUDE_FILENAMES
    ):
        return True

    return False
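
When customizing the filter for a larger project, a directory denylist is a common complement to the extension allowlist, since dependency and build folders often contain files with allowed extensions. The sketch below is an assumption rather than part of the original notebook: `EXCLUDE_DIRS` and `is_in_excluded_dir` are hypothetical names, and the directory list is only a starting point.

```python
from pathlib import PurePosixPath

# Hypothetical denylist; adjust for the target repository.
EXCLUDE_DIRS = {"node_modules", "__pycache__", "dist", "build", "vendor"}


def is_in_excluded_dir(rel_path: PurePosixPath) -> bool:
    """Returns True if any parent directory of rel_path is on the denylist."""
    # parts[:-1] skips the file name so that "src/build.py" is not excluded.
    return any(part in EXCLUDE_DIRS for part in rel_path.parts[:-1])


vendored = is_in_excluded_dir(PurePosixPath("node_modules/pkg/index.js"))
source = is_in_excluded_dir(PurePosixPath("src/build.py"))
print(vendored, source)
```

A check like this could be called from `should_include_file` on the path relative to `extracted_root`, right after the hidden-file test.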

**Prompt strategy for file summaries**: The prompt for this phase asks Llama 4 for summaries that focus on a file's purpose and its role within the project, rather than a line-by-line description of its implementation. This is a critical step for building a high-level, conceptual understanding of the codebase.

In [None]:
MAX_COMPLETION_TOKENS_FILE = 400 # Max tokens for file summary
# To keep this tutorial straightforward, we'll skip files larger than 1MB.
# For a production system, you might implement a chunking strategy for large files.
MAX_FILE_SIZE = 1_000_000

def summarize_file_content(file_path: str, file_content: str) -> str:
    """Summarizes the content of a single file."""
    sys_prompt = (
        "You are a senior software engineer creating a concise summary of a "
        "source file for a project's README.md."
    )
    user_prompt = textwrap.dedent(
        f"""\
        Please summarize the following file: `{file_path}`.

        The summary should be a **concise paragraph** (around 40-60 words) that 
        explains the file's primary purpose, its main functions or classes, and how 
        it fits into the broader project. Focus on the *what* and *why*, not a 
        line-by-line explanation of the *how*.

        ```
        {file_content}
        ```
        """
    )
    try:
        resp = client.chat.completions.create(
            model=LLM_MODEL,
            messages=[
                {"role": "system", "content": sys_prompt},
                {"role": "user", "content": user_prompt},
            ],
            temperature=0.1,  # Low temperature for more consistent summaries
            max_tokens=MAX_COMPLETION_TOKENS_FILE,
        )
        return resp.completion_message.content.text
    except Exception as e:
        print(f"    Error summarizing file: {e}")
        return ""  # Return empty string on failure
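
Because `summarize_file_content` returns an empty string on any exception, a transient network or rate-limit error silently drops that file from the documentation. One way to reduce this is to retry the call with exponential backoff. The helper below is a generic sketch under that assumption: `call_with_retries` is a hypothetical name, it retries on every exception type, and the `sleep` parameter is injectable only so the demo runs without waiting.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Calls fn, retrying on any exception with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
    raise AssertionError("unreachable")


attempts = {"count": 0}
delays: List[float] = []


def flaky() -> str:
    """Fails twice, then succeeds, to simulate a transient error."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "summary text"


result = call_with_retries(flaky, sleep=delays.append)
print(result, delays)
```

In the notebook, the API call inside the `try` block could be wrapped as `call_with_retries(lambda: client.chat.completions.create(...))`. In practice you would likely narrow the `except` clause to the error types your provider documents as retryable.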

In [None]:
# --- Summarize relevant files ---
print("\n--- Summarizing individual files ---")
file_summaries: Dict[str, str] = {}
files_to_process = list(extracted_root.rglob("*"))

for file_path in tqdm(files_to_process, desc="🔍 Summarizing files", unit="file"):
    # First, check if the file type is one we want to process.
    if (
        not should_include_file(file_path, extracted_root) # valid file for summarization
        or file_path.stat().st_size > MAX_FILE_SIZE
        or file_path.stat().st_size == 0
    ):
        continue

    rel_name = str(file_path.relative_to(extracted_root))
    try:
        text = file_path.read_text(encoding="utf-8")
    except UnicodeDecodeError:
        continue
    
    if not text.strip():
        continue
    
    # With a large context window, we can summarize the whole file at once.
    summary = summarize_file_content(rel_name, text)
    if summary:
        file_summaries[rel_name] = summary

print(f"✅ Summarized {len(file_summaries)} files.")


--- Summarizing individual files ---


🔍 Summarizing files: 100%|██████████| 22/22 [00:28<00:00,  1.29s/file]

✅ Summarized 15 files.





In [9]:
pprint.pprint(file_summaries)

{'CODE_OF_CONDUCT.md': 'The `CODE_OF_CONDUCT.md` file outlines the expected '
                       'behavior and standards for contributors and '
                       'maintainers of the project, aiming to create a '
                       'harassment-free and welcoming environment. It defines '
                       'acceptable and unacceptable behavior, roles and '
                       'responsibilities, and procedures for reporting and '
                       'addressing incidents, promoting a positive and '
                       'inclusive community.',
 'CONTRIBUTING.md': 'Here is a concise summary of the `CONTRIBUTING.md` file:\n'
                    '\n'
                    'The `CONTRIBUTING.md` file outlines the guidelines and '
                    'processes for contributing to the Llama project. It '
                    'provides instructions for submitting pull requests, '
                    'including bug fixes, improvements, and new features, as '
               

## Step 3: Create repository overview

After summarizing each file, the next step is to synthesize this information into a high-level repository overview. This overview provides a starting point for a user to understand the project's purpose and structure.

You'll prompt Llama 4 to generate three key sections based on the file summaries from the previous step:
1.  **Project Overview**: A short, descriptive paragraph that explains the repository's main purpose.
2.  **Key Components**: A bulleted list of the most important files, providing a quick look at the core logic.
3.  **Getting Started**: A brief instruction on how to install dependencies and run the project.

This prompt leverages the previously generated file summaries as context, enabling the model to create an accurate and cohesive overview without re-analyzing the raw source code.

In [None]:
MAX_COMPLETION_TOKENS_REPO = 600 # Max tokens for repo overview

def build_repo_overview(file_summaries: Dict[str, str]) -> str:
    """Creates the high-level Overview and Key Components sections."""
    bullets = "\n".join(f"- **{n}**: {s}" for n, s in file_summaries.items())
    sys_prompt = (
        "You are an expert technical writer. Draft a high-level overview "
        "for the root of a README.md."
    )
    user_prompt = textwrap.dedent(
        f"""\
        Below is a list of source files with their summaries.

        1. Write an **'Overview'** section (≈3-4 sentences) explaining the purpose of the repository.
        2. Follow it with a **'Key Components'** bullet list (max 6 bullets) referencing the files.
        3. Close with a short 'Getting Started' hint: `pip install -r requirements.txt` etc.

        ---
        FILE SUMMARIES
        {bullets}
        """
    )
    try:
        resp = client.chat.completions.create(
            model=LLM_MODEL,
            messages=[
                {"role": "system", "content": sys_prompt},
                {"role": "user", "content": user_prompt},
            ],
            temperature=0.1,
            max_tokens=MAX_COMPLETION_TOKENS_REPO,
        )
        return resp.completion_message.content.text
    except Exception as e:
        print(f"    Error creating repo overview: {e}")
        return ""

In [11]:
# --- Create High-Level Repo Overview ---
print("\n--- Building high-level repository overview ---")
repo_overview = build_repo_overview(file_summaries)
print("✅ Overview created.")


--- Building high-level repository overview ---
✅ Overview created.


In [12]:
print(repo_overview)

Here is a high-level overview for the root of a README.md:

## Overview

This repository provides a comprehensive framework for utilizing the Llama large language model, including model architecture, training data, and example usage. The project aims to facilitate the development of natural language processing applications, while promoting responsible use and community engagement. By providing a range of tools and resources, this repository enables developers and researchers to explore the capabilities and limitations of the Llama model. The repository is structured to support easy integration, modification, and extension of the model.

## Key Components

* **llama/generation.py**: Core logic for text generation using the Llama model
* **llama/model.py**: Transformer-based model architecture definition
* **llama/tokenizer.py**: Tokenizer class using SentencePiece for text encoding and decoding
* **example_text_completion.py**: Example usage of the Llama model for text completion tasks
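
The overview output above begins with a conversational lead-in ("Here is a high-level overview..."), and one of the file summaries does the same. If left in place, that line ends up in the final README. A small post-processing step can strip it. The following is a heuristic sketch, not part of the original notebook: `strip_preamble` is a hypothetical helper, and the pattern only targets a first line that starts with "Here is" or "Here's" and ends with a colon.

```python
import re

# Heuristic: a first line that starts with "Here is"/"Here's" and ends with a colon.
_PREAMBLE_RE = re.compile(r"\A\s*here(?: is|'s)\b[^\n]*:\s*\n+", re.IGNORECASE)


def strip_preamble(text: str) -> str:
    """Removes a leading 'Here is ...:' line from a model response, if present."""
    return _PREAMBLE_RE.sub("", text, count=1).strip()


raw = "Here is a high-level overview for the root of a README.md:\n\n## Overview\n\nText."
cleaned = strip_preamble(raw)
untouched = strip_preamble("The file defines the tokenizer.")
print(cleaned)
```

An alternative is to tighten the prompt itself (for example, asking the model to output only the requested sections), though a defensive cleanup step is still useful because models do not always follow such instructions.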


## Step 4: Analyze repository architecture

A high-level overview is useful, but a deep architectural understanding requires analyzing how components interact. This phase generates that deeper analysis.

### Two-step approach to architecture analysis

Analyzing an entire codebase for architectural patterns is complex. Instead of passing all the code to the model at once, you'll use a more strategic, two-step approach that mirrors how a human architect would work:

1.  **AI-driven file selection**: First, you use Llama 4 to identify the most architecturally significant files. The model is prompted to select files that represent the core logic, primary entry points, or key data structures, based on the summaries generated earlier. This step efficiently filters the codebase down to its most critical components.
2.  **Deep-dive analysis**: With the key files identified, you perform a much deeper analysis. While only the full source code of these selected files is provided, the model also receives the summaries of *all* files generated in the first step. This ensures it has broad, high-level context on the entire repository when it performs its deep analysis.

This two-step process is highly effective because it focuses the model's analytical power on the most important parts of the code, enabling it to generate high-quality architectural insights that are difficult to achieve with a less focused approach.

In [None]:
def select_important_files(file_summaries: Dict[str, str]) -> List[str]:
    """Uses an LLM to select the most architecturally significant files."""
    bullets = "\n".join(f"- **{n}**: {s}" for n, s in file_summaries.items())
    sys_prompt = (
        "You are a senior software architect. Your task is to identify the "
        "most critical files for understanding a repository's architecture."
    )
    user_prompt = textwrap.dedent(
        f"""\
        Based on the following file summaries, identify the most architecturally
        significant files. These files should represent the core logic,
        primary entry points, or key data structures of the project.

        Your response MUST be a comma-separated list of file paths, ordered from
        most to least architecturally significant. Do not add any other text.
        Please ensure that the file paths exactly match the file summaries 
        below.
        Example: README.md,src/main.py,src/utils.py,src/models.py

        ---
        FILE SUMMARIES
        {bullets}
        """
    )
    
    try:
        resp = client.chat.completions.create(
            model=LLM_MODEL,
            messages=[
                {"role": "system", "content": sys_prompt},
                {"role": "user", "content": user_prompt},
            ],
            temperature=0.1,
        )
        response = resp.completion_message.content.text
        
        # Parse the comma-separated list.
        if response:
            # Clean up the response to handle potential markdown code blocks
            cleaned_response = (response.strip()
                              .removeprefix("```")
                              .removesuffix("```")
                              .strip())
            return [f.strip() for f in cleaned_response.split(',') if f.strip()]
    except Exception as e:
        print(f"    Error selecting important files: {e}")
    return []
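
The model is asked to return paths that exactly match the summary keys, but nothing enforces this: it may wrap paths in backticks, repeat a path, or invent one. A validation pass against `file_summaries` catches these cases before any file is read. This is a minimal sketch with a hypothetical helper name, `validate_selected_files`, and made-up sample data.

```python
from typing import Dict, List


def validate_selected_files(
    selected: List[str], file_summaries: Dict[str, str]
) -> List[str]:
    """Keeps only known paths, preserving the model's order and dropping duplicates."""
    valid: List[str] = []
    for raw in selected:
        # Tolerate stray whitespace and inline-code backticks around a path.
        name = raw.strip().strip("`").strip()
        if name in file_summaries and name not in valid:
            valid.append(name)
    return valid


summaries = {"llama/model.py": "...", "llama/generation.py": "..."}
picked = validate_selected_files(
    ["`llama/generation.py`", " llama/model.py", "llama/missing.py", "llama/model.py"],
    summaries,
)
print(picked)
```

Filtering against the summary keys also guards against paths that point outside the extracted repository, since only files that were actually summarized can pass.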

In [14]:
print("\n--- Selecting important files for deep analysis ---")
important_files = select_important_files(file_summaries)
if important_files:
    print(f"✅ LLM selected {len(important_files)} files for analysis: "
          f"{important_files}")
else:
    print("ℹ️ No files were selected for architectural analysis.")


--- Selecting important files for deep analysis ---
✅ LLM selected 6 files for analysis: ['llama/generation.py', 'llama/model.py', 'llama/__init__.py', 'llama/tokenizer.py', 'example_text_completion.py', 'example_chat_completion.py']


In [15]:
def token_estimate(text: str) -> int:
    """Estimates the token count of a text string using tiktoken."""
    enc = tiktoken.get_encoding("o200k_base")
    return len(enc.encode(text))
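
One caveat: `o200k_base` is an encoding published for OpenAI models, so the counts it produces for Llama 4 are approximations rather than exact values. A simple way to stay on the safe side is to inflate the estimate by a margin. The sketch below assumes a hypothetical helper, `padded_token_estimate`, with an arbitrary default margin of 25%. The demo uses a stand-in counter so it runs without `tiktoken`.

```python
import math
from typing import Callable


def padded_token_estimate(
    text: str,
    count_tokens: Callable[[str], int],
    safety_factor: float = 1.25,  # Hypothetical margin; tune for your data.
) -> int:
    """Inflates a token count to absorb tokenizer mismatch."""
    return math.ceil(count_tokens(text) * safety_factor)


def rough_count(text: str) -> int:
    """Stand-in counter for the demo: roughly 4 characters per token."""
    return max(1, len(text) // 4)


estimate = padded_token_estimate("x" * 400, rough_count)
print(estimate)
```

In the notebook, `token_estimate` could be passed as `count_tokens`, which keeps the budgeting logic unchanged while leaving extra headroom.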

**Managing context for large repositories**

In large repositories, the combined size of important files can still exceed the model's context window. The code below uses a simple budgeting strategy: it collects file contents until a token limit is reached, ensuring the request doesn't fail.

For a production-grade system, a more sophisticated approach is recommended. For example, you could include the full content of the most critical files that fit, and supplement this with summaries of other important files to stay within the context limit.
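
The more sophisticated approach described above can be sketched as follows: walk the files in priority order, include full source where it fits, and fall back to the file's summary when it does not. This is an illustration under stated assumptions, not the notebook's implementation. `pack_context` is a hypothetical name, the token counter is injected, and the demo uses a word-count stand-in instead of a real tokenizer.

```python
from typing import Callable, Dict, List, Tuple


def pack_context(
    ranked_files: List[Tuple[str, str]],  # (path, full source), most important first
    summaries: Dict[str, str],
    budget: int,
    count_tokens: Callable[[str], int],
) -> List[Tuple[str, str, str]]:
    """Returns (path, kind, text) items, where kind is "code" or "summary"."""
    packed: List[Tuple[str, str, str]] = []
    used = 0
    for path, code in ranked_files:
        candidates = [("code", code)]
        if path in summaries:
            candidates.append(("summary", summaries[path]))
        for kind, text in candidates:
            cost = count_tokens(text)
            if used + cost <= budget:
                packed.append((path, kind, text))
                used += cost
                break  # Use the richest form that fits, then move on.
    return packed


def count(text: str) -> int:
    """Stand-in counter for the demo: one token per whitespace-separated word."""
    return len(text.split())


files = [("a.py", "x " * 60), ("b.py", "y " * 60), ("c.py", "z " * 10)]
notes = {"a.py": "core model", "b.py": "text generation loop", "c.py": "helpers"}
result = pack_context(files, notes, budget=80, count_tokens=count)
kinds = [(path, kind) for path, kind, _ in result]
print(kinds)
```

Unlike a loop that stops at the first file that overflows the budget, this variant keeps going, so a small but important file later in the ranking can still be included in full.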

In [None]:
# --- Get code for selected files ---
# The files are processed in order of importance as determined by the LLM, so
# that the most critical files are most likely to be included if we hit the
# context window budget.
snippets: List[Tuple[str, str]] = []
if important_files:
    print(f"\n--- Step 5: Retrieving code for {len(important_files)} "
          f"selected files ---")
    tokens_used = 0
    for file_name in important_files:
        # It's possible the model returns paths with leading/trailing whitespace
        file_name = file_name.strip()

        fp = extracted_root / file_name
        if not fp.is_file():
            print(f"⚠️ Selected path '{file_name}' is not a file, skipping.")
            continue

        try:
            # Limit file size to avoid huge token counts for single files
            code = fp.read_text(encoding="utf-8")[:20_000]
        except UnicodeDecodeError:
            continue

        token_count = token_estimate(code)

        # Reserve half of the context window for summaries and other prompt text
        if tokens_used + token_count > (CTX_WINDOW // 2):
            print(f"⚠️  Context window budget reached. Stopping at "
                  f"{len(snippets)} files.")
            break

        snippets.append((file_name, code))
        tokens_used += token_count

    print(f"✅ Retrieved content of {len(snippets)} files for deep analysis.")


--- Step 5: Retrieving code for 6 selected files ---
✅ Retrieved content of 6 files for deep analysis.


**Deep Analysis Process**: The full source code of the selected files is included in the model's context so that it can generate:
- Mermaid class diagrams
- Component relationships  
- Architectural patterns
- README-ready documentation

In [None]:
# --- Cross-File Architectural Reasoning Function ---
MAX_COMPLETION_TOKENS_ARCH = 900 # Max tokens for architecture overview

def build_architecture(
    file_summaries: Dict[str, str], 
    code_snippets: List[Tuple[str, str]], 
    ctx_budget: int
) -> str:
    """Produces an Architecture & Key Concepts section using the large model."""
    summary_lines = "\n".join(f"- **{n}**: {s}" for n, s in file_summaries.items())
    prompt_sections = [
        "[[FILE_SUMMARIES]]",
        summary_lines,
        "[[/FILE_SUMMARIES]]",
    ]
    tokens_used = token_estimate("\n".join(prompt_sections))

    if code_snippets:
        code_block_lines = []
        for fname, code in code_snippets:
            added = "\n### " + fname + "\n```code\n" + code + "\n```\n"
            t = token_estimate(added)
            if tokens_used + t > (ctx_budget // 2):
                break
            code_block_lines.append(added)
            tokens_used += t
        if code_block_lines:
            prompt_sections.extend(
                ["[[RAW_CODE_SNIPPETS]]"] + code_block_lines + 
                ["[[/RAW_CODE_SNIPPETS]]"]
            )

    # Dedent only the task instructions. Dedenting the joined prompt would be a
    # no-op, because the summaries and code above start at column 0.
    task_instructions = textwrap.dedent("""
        ---
        **Your tasks**
        1. Identify the major abstractions (classes, services, data models)
           across the entire codebase.
        2. Explain how they interact – include dependencies, data flow, and any
           cross-cutting concerns.
        3. Output a concise *Architecture & Key Concepts* section suitable for a
           README, consisting of:
           • short Overview (≤ 3 sentences)
           • Mermaid diagram (`classDiagram` or `flowchart`) of components
           • bullet list of abstractions with brief descriptions.
        """)
    user_prompt = "\n".join(prompt_sections) + task_instructions

    sys_prompt = (
        "You are a principal software architect. Use the provided file "
        "summaries (and raw code if present) to infer high-level design. "
        "Be precise and avoid guesswork."
    )
    
    try:
        resp = client.chat.completions.create(
            model=LLM_MODEL,
            messages=[
                {"role": "system", "content": sys_prompt},
                {"role": "user", "content": user_prompt},
            ],
            temperature=0.2,
            max_tokens=MAX_COMPLETION_TOKENS_ARCH,
        )
        return resp.completion_message.content.text
    except Exception as e:
        print(f"    Error creating architecture analysis: {e}")
        return ""
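
Because the response is capped at `MAX_COMPLETION_TOKENS_ARCH` tokens, the model's output can be cut off in the middle of a Mermaid block, which leaves an unclosed code fence that swallows every section written after it in the README. A light check before writing the file can detect this. The helpers below are a sketch with hypothetical names (`has_unclosed_fence`, `close_unclosed_fence`), and they only count fence lines rather than parse Markdown.

```python
FENCE = "`" * 3  # Built programmatically to keep this example easy to embed.


def has_unclosed_fence(markdown: str) -> bool:
    """Returns True if the text contains an odd number of fence lines."""
    fence_lines = [
        line for line in markdown.splitlines() if line.lstrip().startswith(FENCE)
    ]
    return len(fence_lines) % 2 == 1


def close_unclosed_fence(markdown: str) -> str:
    """Appends a closing fence when the last code block was cut off."""
    if has_unclosed_fence(markdown):
        return markdown.rstrip() + "\n" + FENCE + "\n"
    return markdown


truncated = "### Diagram\n" + FENCE + "mermaid\nclassDiagram\n    class Llama {"
repaired = close_unclosed_fence(truncated)
before, after = has_unclosed_fence(truncated), has_unclosed_fence(repaired)
print(before, after)
```

Closing the fence keeps the surrounding Markdown well-formed, but it does not make a truncated diagram valid. If the check fires, raising the completion limit and regenerating is likely the better fix.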

In [18]:
print("\n--- Performing cross-file architectural reasoning ---")
architecture_section = build_architecture(
    file_summaries, snippets, CTX_WINDOW
)
print("✅ Architectural analysis complete.")


--- Performing cross-file architectural reasoning ---
✅ Architectural analysis complete.


In [19]:
print(architecture_section)

## Architecture & Key Concepts

### Overview

The Llama project is a large language model implementation that provides a simple and efficient way to generate text based on given prompts. The project consists of several key components, including a Transformer-based model, a tokenizer, and a generation module. These components work together to enable text completion and chat completion tasks.

### Mermaid Diagram

```mermaid
classDiagram
    class Llama {
        +build(ckpt_dir, tokenizer_path, max_seq_len, max_batch_size)
        +text_completion(prompts, temperature, top_p, max_gen_len, logprobs, echo)
        +chat_completion(dialogs, temperature, top_p, max_gen_len, logprobs)
    }
    class Transformer {
        +forward(tokens, start_pos)
    }
    class Tokenizer {
        +encode(s, bos, eos)
        +decode(t)
    }
    class ModelArgs {
        +dim
        +n_layers
        +n_heads
        +n_kv_heads
        +vocab_size
        +multiple_of
        +ffn_dim_multiplier
```

## Step 5: Assemble final documentation

The final phase assembles all the AI-generated content into a single, comprehensive `README.md` file. The goal is to create a document that is not only informative but also easy for developers to navigate and use.

### Documentation structure

The generated README follows a layered approach that enables readers to consume information at their preferred level of detail.

1.  **Repository Summary**: A high-level overview gives developers an immediate understanding of the project's purpose.
2.  **Architecture and Key Concepts**: A deeper technical analysis, including a Mermaid diagram, helps developers understand how the system is designed.
3.  **File Summaries**: A detailed breakdown of each component provides granular information for those who need it.
4.  **Attribution**: A concluding note clarifies that the document was generated by AI, which provides transparency about its origin.

> **🎯** The combination of Llama 4's code intelligence and large context window enables the automated generation of thorough, high-quality documentation that can rival manually created content while requiring minimal human intervention.

In [20]:
OUTPUT_DIR = Path.cwd()
readme_path = OUTPUT_DIR / f"Generated_README_{extracted_root.name}.md"
print(f"\n✍️ Writing final README to {readme_path.resolve()}...")
with readme_path.open("w", encoding="utf-8") as fh:
    fh.write(f"# Repository Summary for `{extracted_root.name}`\n\n"
             f"{repo_overview}\n\n")
    fh.write("## Architecture & Key Concepts\n\n")
    fh.write(architecture_section.strip() + "\n\n")
    fh.write("## File Summaries\n\n")
    for n, s in sorted(file_summaries.items()):
        fh.write(f"- **{n}** – {s}\n")
    fh.write(
        "\n---\n*This README was generated automatically using "
        "Meta's **Llama 4** models.*"
    )

print(f"\n\n🎉 Success! Documentation generated at: "
      f"{readme_path.resolve()}")


✍️ Writing final README to /Users/saip/Documents/GitHub/meta-documentation-shared/notebooks/Generated_README_llama-main.md...


🎉 Success! Documentation generated at: /Users/saip/Documents/GitHub/meta-documentation-shared/notebooks/Generated_README_llama-main.md


In [21]:
!cat $readme_path

# Repository Summary for `llama-main`

Here is a high-level overview for the root of a README.md:

## Overview

This repository provides a comprehensive framework for utilizing the Llama large language model, including model architecture, training data, and example usage. The project aims to facilitate the development of natural language processing applications, while promoting responsible use and community engagement. By providing a range of tools and resources, this repository enables developers and researchers to explore the capabilities and limitations of the Llama model. The repository is structured to support easy integration, modification, and extension of the model.

## Key Components

* **llama/generation.py**: Core logic for text generation using the Llama model
* **llama/model.py**: Transformer-based model architecture definition
* **llama/tokenizer.py**: Tokenizer class using SentencePiece for text encoding and decoding
* **example_text_completion.py**: Example usage of the

In [22]:
print(f"\n--- Cleaning up temporary directory {tmpdir} ---")
try:
    tmpdir_obj.cleanup()
    print("✅ Cleanup complete.")
except Exception as e:
    print(f"⚠️  Error during cleanup: {e}")


--- Cleaning up temporary directory /var/folders/sz/kf8w7j1x1v790jxs8k2gl72c0000gn/T/tmptwo_kdt5 ---
✅ Cleanup complete.


## Next steps and upgrade paths

This tutorial provides a solid foundation for automated documentation generation. You can extend it in several ways for a production-grade application.

| Need                           | Recommended approach                                                                                                                                                                                                                            |
| :----------------------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Private repositories**       | For private GitHub repos, use authenticated requests with a personal access token. For GitLab or Bitbucket, adapt the download logic to their respective APIs. |
| **Multiple languages**         | Extend the `INCLUDE_EXTENSIONS` list and adjust prompts to handle language-specific documentation patterns. Consider using language-specific parsers for better code understanding. |
| **Incremental updates**        | Implement caching of file summaries with timestamps. Only reprocess files that have changed since the last run, significantly reducing API costs for large repositories. |
| **Custom documentation formats** | Adapt the final assembly phase to generate different formats such as API documentation, developer guides, or architecture decision records (ADRs). |
| **CI/CD integration**          | Run the documentation generator as part of your continuous integration pipeline to keep documentation automatically synchronized with code changes. |
| **Multi-repository analysis**  | Extend the pipeline to analyze dependencies and generate documentation for entire microservice architectures or monorepos. |
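
As an illustration of the incremental-updates row, the sketch below caches summaries keyed by a hash of each file's content. Note the swap: it uses content hashes rather than the timestamps mentioned in the table, since hashes survive fresh ZIP extractions where modification times change. All names here (`summarize_with_cache`, the cache layout, the JSON file name) are hypothetical, and a fake summarizer stands in for the API call.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def summarize_with_cache(
    rel_name: str,
    text: str,
    cache: Dict[str, Dict[str, str]],
    summarize: Callable[[str, str], str],
) -> str:
    """Returns a cached summary when the file content is unchanged."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    entry = cache.get(rel_name)
    if entry and entry["sha256"] == digest:
        return entry["summary"]
    summary = summarize(rel_name, text)
    if summary:  # Do not cache failures (empty strings).
        cache[rel_name] = {"sha256": digest, "summary": summary}
    return summary


calls: List[str] = []


def fake_summarize(name: str, text: str) -> str:
    """Stand-in for the model call; records each invocation."""
    calls.append(name)
    return f"summary of {name} ({len(text)} chars)"


cache: Dict[str, Dict[str, str]] = {}
summarize_with_cache("a.py", "print(1)", cache, fake_summarize)
summarize_with_cache("a.py", "print(1)", cache, fake_summarize)  # Cache hit.
summarize_with_cache("a.py", "print(2)", cache, fake_summarize)  # Content changed.

# Persist the cache as JSON so a later run can reload it.
cache_path = Path(tempfile.mkdtemp()) / "summary_cache.json"
cache_path.write_text(json.dumps(cache), encoding="utf-8")
reloaded = json.loads(cache_path.read_text(encoding="utf-8"))
print(len(calls), reloaded == cache)
```

In the notebook, `summarize_file_content` could be passed as the `summarize` argument, so only new or modified files trigger an API request.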
