Suraj Subramanian cd31ee99ac * Add logging 1 рік тому
..
output cd31ee99ac * Add logging 1 рік тому
README.md cd31ee99ac * Add logging 1 рік тому
config.yaml cd31ee99ac * Add logging 1 рік тому
llm.py cd31ee99ac * Add logging 1 рік тому
pdf_report.py cd31ee99ac * Add logging 1 рік тому
plots.py cd31ee99ac * Add logging 1 рік тому
requirements.txt a5ddea144a Add auto triage tooll 1 рік тому
triage.py cd31ee99ac * Add logging 1 рік тому
utils.py cd31ee99ac * Add logging 1 рік тому
walkthrough.ipynb cd31ee99ac * Add logging 1 рік тому

README.md

Automatic Issues Triaging with Llama

This tool utilizes an off-the-shelf Llama model to analyze, generate insights, and create a report for better understanding of the state of a repository. It serves as a reference implementation for using Llama to develop custom reporting and data analytics applications.

Features

The tool performs the following tasks:

  • Fetches issue threads from a specified repository
  • Analyzes issue discussions and generates annotations such as category, severity, component affected, etc.
  • Categorizes all issues by theme
  • Synthesizes key challenges faced by users, along with probable causes and remediations
  • Generates a high-level executive summary providing insights on diagnosing and improving the developer experience

For a step-by-step look, check out the walkthrough notebook.

Getting Started

Installation

pip install -r requirements.txt

Setup

  1. API Keys and Model Service: Set your GitHub token for API calls. Some privileged information may not be available if you don't have push-access to the target repository.
  2. Model Configuration: Set the appropriate values in the model section of config.yaml for using Llama via VLLM or Groq.
  3. JSON Schemas: Edit the output JSON schemas in config.yaml to ensure consistency in outputs. VLLM supports JSON-decoding via the guided_json generation argument, while Groq requires passing the schema in the system prompt.

Running the Tool

python triage.py --repo_name='meta-llama/llama-recipes' --start_date='2024-08-14' --end_date='2024-08-27'

Output

The tool generates:

  • CSV files with annotations, challenges, and overview data, which can be persisted in SQL tables for downstream analyses and reporting.
  • Graphical matplotlib plots of repository traffic, maintenance activity, and issue attributes.
  • A PDF report for easier reading and sharing.

Config

The tool's configuration is stored in config.yaml. The following sections can be edited:

  • Github Token: Use a token that has push-access on the target repo.
  • model: Specify the model service (vllm or groq) and set the endpoints and API keys as applicable.
  • prompts: For each of the 3 tasks Llama does in this tool, we specify a prompt and an output JSON schema:
    • parse_issue: Parsing and generating annotations for the issues
    • assign_category: Assigns each issue to a category specified in an enum in the corresponding JSON schema
    • get_overview: Generates a high-level executive summary and analysis of all the parsed and generated data

Troubleshooting

  • If you encounter issues with API calls, ensure that your GitHub token is set correctly and that you have the necessary permissions.
  • If you encounter issues with the model service, check the configuration values in config.yaml.