What is this Python project?
https://github.com/Dicklesworthstone/swiss_army_llama
The Swiss Army Llama is a comprehensive toolkit for working with local Large Language Models (LLMs). It uses FastAPI to provide REST endpoints for a variety of tasks, including text embeddings, completions, semantic similarity measurements, and more. It supports a wide range of document types, including those requiring OCR, as well as audio files for transcription. Embeddings are cached in SQLite for efficiency, and RAM Disks are used for quick loading of multiple LLMs. The toolkit is accessible via a Swagger UI and integrates with various technologies like FAISS for semantic search and the Fast Vector Similarity library for advanced similarity measures.
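Because the service is an ordinary FastAPI application, any HTTP client can talk to it. The snippet below is a minimal sketch of how a request for a string embedding might look; the port, endpoint path, and field names are illustrative assumptions rather than the project's documented schema, and the Swagger UI of a running instance is the authority on the real routes and parameters.

```python
# Minimal sketch of calling the embedding endpoint of a locally running instance.
# The base URL, route, and request fields are assumptions for illustration --
# check the Swagger UI of your own instance for the exact schema.
import requests

BASE_URL = "http://localhost:8089"  # assumed host/port; adjust to your deployment

payload = {
    "text": "What is a Swiss Army Llama?",
    "llm_model_name": "your-local-model-name",  # hypothetical model identifier
}

resp = requests.post(f"{BASE_URL}/get_embedding_vector_for_string/", json=payload, timeout=60)
resp.raise_for_status()
result = resp.json()

# Assuming the response contains the embedding under an "embedding" key.
print(f"Received an embedding with {len(result.get('embedding', []))} dimensions")
```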
What's the difference between this Python project and similar ones?
1. FastAPI Integration: Swiss Army Llama is fully integrated with FastAPI, providing a Swagger page for easy access and interaction with its REST API, a feature not commonly found in similar projects.
2. Comprehensive Caching: The project automatically caches the results of expensive operations (for example, computed embeddings are stored in SQLite), so repeated requests avoid redundant computation. This level of automatic caching is not standard in similar tools.
3. RAM Disk Utilization: It uniquely uses RAM Disk to accelerate the loading of models, providing a substantial speed advantage in accessing and using various LLMs.
4. Broad File Format Support with Textract: The integration with Textract allows the Swiss Army Llama to handle a wide array of file formats, far exceeding the capabilities of similar projects, which often support only a limited set of formats.
5. Integration with Whisper: The project is integrated with OpenAI's Whisper model for advanced audio transcription, a feature that is not typically included in similar LLM toolkits.
6. BNF Grammar Tools: It includes specialized tools for working with Backus-Naur Form (BNF) grammars, offering unique capabilities in generating structured LLM outputs based on specific grammar rules.
7. Support for Token-Level Embeddings: In addition to standard embeddings, Swiss Army Llama supports token-level embeddings, providing more nuanced data representation and analysis. This level of detail in embeddings is not commonly available in similar projects.
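To make point 7 concrete, here is a rough sketch of how a client might request token-level embeddings and then pool them into a single fixed-size vector. The endpoint path and the response's JSON layout are assumptions for illustration, as is the mean-pooling step (a common, simple way to aggregate per-token vectors); the real route and schema are documented in the Swagger UI of a running instance.

```python
# Hedged sketch: fetch per-token embeddings and mean-pool them client-side.
# Route, field names, and response shape below are illustrative assumptions.
import numpy as np
import requests

BASE_URL = "http://localhost:8089"  # assumed host/port

payload = {
    "text": "Token-level embeddings give one vector per token.",
    "llm_model_name": "your-local-model-name",  # hypothetical model identifier
}

resp = requests.post(f"{BASE_URL}/get_token_level_embeddings/", json=payload, timeout=120)
resp.raise_for_status()
data = resp.json()

# Assume the response holds a list of per-token vectors under "token_embeddings".
token_vectors = np.array(data["token_embeddings"])  # shape: (num_tokens, dim)
sentence_vector = token_vectors.mean(axis=0)        # simple mean-pooling
print(token_vectors.shape, "->", sentence_vector.shape)
```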
Features:
Versatile File Processing: Supports an extensive array of file types including PDFs, Word documents, images, and audio files, with advanced text preprocessing and OCR capabilities.
Comprehensive Embedding Features: Offers both fixed-size and token-level embeddings, including combined feature vectors that make it possible to compare strings of unequal length.
Advanced Semantic Search: Combines FAISS vector searching with sophisticated similarity measures like spearman_rho, kendall_tau, and jensen_shannon_similarity, enabling nuanced text comparisons.
Efficient Caching and RAM Disk Usage: Implements efficient caching in SQLite and optional RAM Disk usage for faster model loading and execution.
Comprehensive Logging and Real-Time Monitoring: Features a real-time log file viewer in the browser, and uses Redis for efficient request handling and logging.
Interactive and User-Friendly Interface: Integrates with Swagger UI for easy access and management of large result sets, making it user-friendly and easily integrable into applications.
Customizable and Scalable: Built on FastAPI, it is highly scalable and customizable with configurable settings for response formats and parallel inference.
Multiple Model and Measure Support: Accommodates a variety of models and measures, providing flexibility and customization according to user needs.
Specialized Grammar-Enforced Text Completions: Offers the ability to generate multiple text completions with specified grammar, enhancing structured LLM output.
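To illustrate the grammar-enforced completions mentioned above, the sketch below shows what a request for several constrained completions could look like. The grammar is written in the GBNF style used by llama.cpp-based tools; the endpoint path, field names, and response shape are assumptions made for illustration, so treat the Swagger UI of a running instance as the authoritative reference.

```python
# Hedged sketch: request multiple completions constrained by a tiny grammar that
# forces the model to answer with exactly "yes" or "no". Route and field names
# are illustrative assumptions, not the project's documented API.
import requests

BASE_URL = "http://localhost:8089"  # assumed host/port

YES_NO_GRAMMAR = r'''
root ::= "yes" | "no"
'''

payload = {
    "input_prompt": "Is a llama a mammal? Answer yes or no.",
    "llm_model_name": "your-local-model-name",   # hypothetical model identifier
    "grammar_file_string": YES_NO_GRAMMAR,       # assumed field name
    "number_of_completions_to_generate": 3,      # assumed field name
}

resp = requests.post(f"{BASE_URL}/get_text_completions_from_input_prompt/", json=payload, timeout=300)
resp.raise_for_status()

# Assuming the response is a JSON list of generated completions.
for completion in resp.json():
    print(completion)
```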