Step 2: Set Up Your Python Environment

It's highly recommended to use a virtual environment to manage dependencies:

```bash
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
```

Note: Ensure you have a requirements.txt file in your technical_blogger directory listing all necessary libraries, such as qdrant-client, sentence-transformers, requests, python-dotenv, and IPython. If you don't have one, create it manually based on your code's imports.
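
For reference, a minimal requirements.txt matching the libraries above might look like this; version pins are deliberately omitted, so add them as your project requires:

```text
qdrant-client
sentence-transformers
requests
python-dotenv
ipython
```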

Step 3: Configure Your API Key

For security, your Llama API key must be stored as an environment variable and not directly in your code.

Create a .env file: In the root of the technical_blogger directory, create a new file named .env.

Add your Llama API key: Open the .env file and add your Llama API key in the following format:

```
LLAMA_API_KEY="YOUR_LLAMA_API_KEY_HERE"
```

Replace "YOUR_LLAMA_API_KEY_HERE" with your actual API key.

Add .env to .gitignore: To prevent accidentally committing your API key, ensure .env is listed in your .gitignore file. If you don't have one, create it and add the line /.env.
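
Your code can then load the key at runtime. A minimal sketch, assuming the python-dotenv package from requirements.txt and the LLAMA_API_KEY variable name used above:

```python
import os

from dotenv import load_dotenv

# Read key/value pairs from .env into the process environment.
load_dotenv()

llama_api_key = os.getenv("LLAMA_API_KEY")
if not llama_api_key:
    raise RuntimeError("LLAMA_API_KEY is not set; add it to your .env file.")
```
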
Step 4: Prepare Your Knowledge Base (Data Ingestion)

This recipe uses an in-memory Qdrant database, meaning the knowledge base is rebuilt each time the script runs. You will need to provide your own technical documentation for ingestion.

Locate the generate_blog function: Open Technical_Blog_Generator.ipynb (or your main Python script, if you converted the notebook) and find the generate_blog function.

Update the ingest_data_into_qdrant call: Inside generate_blog, there's a section for data ingestion:

```python
# IMPORTANT: For in-memory Qdrant, you MUST ingest your data every time
# the script runs or the client is initialized, as it's not persistent.
# Replace this with your actual data loading and chunking.
# Example placeholder data:
example_data_chunks = [
    # ... your example data ...
]
ingest_data_into_qdrant(client, MAIN_COLLECTION_NAME, embedding_model, example_data_chunks)
```

Replace example_data_chunks with your actual code that loads your technical documentation (e.g., from mdfiles_latest.txt, 3rd_party_integrations.txt, etc.), chunks it appropriately, and passes the chunks to ingest_data_into_qdrant. This step defines the content that Llama will retrieve and use.

Example (conceptual; adapt it to your own file-loading logic):

```python
# Assuming your raw text files live in a 'cookbook_metadata' folder.
from pathlib import Path

def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into overlapping character windows suitable for embedding."""
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

all_your_technical_docs_chunks = []
# Load mdfiles_latest.txt, 3rd_party_integrations.txt, and Getting_started_files.txt,
# then split each into chunks suitable for embedding.
for filename in ["mdfiles_latest.txt", "3rd_party_integrations.txt", "Getting_started_files.txt"]:
    content = Path("cookbook_metadata", filename).read_text(encoding="utf-8")
    all_your_technical_docs_chunks.extend(chunk_text(content))

ingest_data_into_qdrant(client, MAIN_COLLECTION_NAME, embedding_model, all_your_technical_docs_chunks)
```
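
The ingestion helper itself is defined in the notebook. If you are adapting this recipe outside the notebook, a minimal sketch of what such a helper can look like with qdrant-client's in-memory mode and sentence-transformers is shown below; the function name and call signature mirror the snippet above, but the body and the all-MiniLM-L6-v2 model choice are illustrative assumptions, not the notebook's exact code:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

def ingest_data_into_qdrant(client, collection_name, embedding_model, chunks):
    """Embed text chunks and upsert them into a Qdrant collection (sketch)."""
    vectors = embedding_model.encode(chunks)
    client.recreate_collection(
        collection_name=collection_name,
        vectors_config=VectorParams(size=vectors.shape[1], distance=Distance.COSINE),
    )
    client.upsert(
        collection_name=collection_name,
        points=[
            PointStruct(id=i, vector=vector.tolist(), payload={"text": chunk})
            for i, (vector, chunk) in enumerate(zip(vectors, chunks))
        ],
    )

# ":memory:" keeps the collection in RAM, which is why ingestion must rerun on every start.
client = QdrantClient(":memory:")
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
```
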
Step 5: Run the Notebook

With your environment configured and data ingestion prepared, you can now open the Jupyter notebook and run the blog generator.

Start Jupyter:

```bash
jupyter notebook
```

Open the Notebook: In your browser, navigate to the technical_blogger folder and open Technical_Blog_Generator.ipynb.

Run Cells: Execute each cell in the notebook sequentially. This will:

- Initialize the Llama API client.
- Set up the in-memory Qdrant database and ingest your provided knowledge base.
- Load helper functions for querying.
- Allow you to specify a blog topic (see the usage sketch below).
- Trigger the RAG process to generate and display the blog post.
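
In practice, the final cells boil down to a call like the following; generate_blog's exact signature and return type are defined in the notebook, so treat this as an illustrative sketch that assumes a topic string in and a Markdown string out:

```python
from IPython.display import Markdown, display

# Hypothetical usage: pass a topic, render the generated post inline.
blog_post = generate_blog("Getting started with Llama fine-tuning")
display(Markdown(blog_post))
```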
|
|
|
-
|
|
|
-Customization
|
|
|
-Knowledge Base: Expand your knowledge base by adding more technical documentation files. Remember to update the data ingestion logic in generate_blog (Step 4) to include these new sources.
|
|
|
-
|
|
|
-LLM Model: Experiment with different Llama models by changing the LLAMA_MODEL variable in the configuration.
|
|
|
-
|
|
|
-Prompt Engineering: Modify the system_prompt within the generate_blog function to control the tone, structure, depth, and specific requirements for your generated blog posts.
|
|
|
-
|
|
|
-RAG Parameters: Adjust top_k in the query_qdrant function to retrieve more or fewer relevant chunks. You can also experiment with different embedding models or reranking models.
|
|
|
|
|
|
-Output Format: Customize the output formatting if you need something other than Markdown.
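
For instance, a retrieval helper in the spirit of query_qdrant might expose top_k as shown below; this is a sketch that assumes the in-memory client and embedding model from Step 4, not the notebook's exact implementation:

```python
def query_qdrant(client, collection_name, embedding_model, query, top_k=5):
    """Embed the query and return the payloads of the top_k most similar chunks."""
    query_vector = embedding_model.encode(query).tolist()
    hits = client.search(
        collection_name=collection_name,
        query_vector=query_vector,
        limit=top_k,
    )
    return [hit.payload["text"] for hit in hits]

# A larger top_k gives the model more context, at the cost of more noise.
context_chunks = query_qdrant(client, MAIN_COLLECTION_NAME, embedding_model,
                              "How do I fine-tune Llama?", top_k=8)
```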
|