Step 2: Set Up Your Python Environment

It's highly recommended to use a virtual environment to manage dependencies:

```bash
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
```

Note: Ensure you have a requirements.txt file in your technical_blogger directory listing all necessary libraries, such as qdrant-client, sentence-transformers, requests, python-dotenv, and IPython. If you don't have one, create it manually based on your code's imports.
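
For reference, a minimal requirements.txt matching the libraries above might look like this; version pins are deliberately omitted, so add them as your project requires:

```text
qdrant-client
sentence-transformers
requests
python-dotenv
ipython
```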

Step 3: Configure Your API Key

For security, your Llama API key must be stored as an environment variable and not directly in your code.

Create a .env file: In the root of the technical_blogger directory, create a new file named .env.

Add your Llama API key: Open the .env file and add your Llama API key in the following format:

```
LLAMA_API_KEY="YOUR_LLAMA_API_KEY_HERE"
```

Replace "YOUR_LLAMA_API_KEY_HERE" with your actual API key.

Add .env to .gitignore: To prevent accidentally committing your API key, ensure .env is listed in your .gitignore file. If you don't have one, create it and add the line /.env.
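
Your code can then load the key at runtime. A minimal sketch, assuming the python-dotenv package from requirements.txt and the LLAMA_API_KEY variable name used above:

```python
import os

from dotenv import load_dotenv

# Read key/value pairs from .env into the process environment.
load_dotenv()

llama_api_key = os.getenv("LLAMA_API_KEY")
if not llama_api_key:
    raise RuntimeError("LLAMA_API_KEY is not set; add it to your .env file.")
```
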
Step 4: Prepare Your Knowledge Base (Data Ingestion)

This recipe uses an in-memory Qdrant database, meaning the knowledge base is rebuilt each time the script runs. You will need to provide your own technical documentation for ingestion.

Locate the generate_blog function: Open Technical_Blog_Generator.ipynb (or your main Python script, if you converted the notebook) and find the generate_blog function.

Update the ingest_data_into_qdrant call: Inside generate_blog, there's a section for data ingestion:

```python
# IMPORTANT: For in-memory Qdrant, you MUST ingest your data every time
# the script runs or the client is initialized, as it's not persistent.
# Replace this with your actual data loading and chunking.
# Example placeholder data:
example_data_chunks = [
    # ... your example data ...
]
ingest_data_into_qdrant(client, MAIN_COLLECTION_NAME, embedding_model, example_data_chunks)
```

Replace example_data_chunks with your actual code that loads your technical documentation (e.g., from mdfiles_latest.txt, 3rd_party_integrations.txt, etc.), chunks it appropriately, and passes the chunks to ingest_data_into_qdrant. This step defines the content that Llama will retrieve and use.

Example (conceptual; adapt it to your own file-loading logic):

```python
# Assuming your raw text files live in a 'cookbook_metadata' folder.
from pathlib import Path

def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into overlapping character windows suitable for embedding."""
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

all_your_technical_docs_chunks = []
# Load mdfiles_latest.txt, 3rd_party_integrations.txt, and Getting_started_files.txt,
# then split each into chunks suitable for embedding.
for filename in ["mdfiles_latest.txt", "3rd_party_integrations.txt", "Getting_started_files.txt"]:
    content = Path("cookbook_metadata", filename).read_text(encoding="utf-8")
    all_your_technical_docs_chunks.extend(chunk_text(content))

ingest_data_into_qdrant(client, MAIN_COLLECTION_NAME, embedding_model, all_your_technical_docs_chunks)
```
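
The ingestion helper itself is defined in the notebook. If you are adapting this recipe outside the notebook, a minimal sketch of what such a helper can look like with qdrant-client's in-memory mode and sentence-transformers is shown below; the function name and call signature mirror the snippet above, but the body and the all-MiniLM-L6-v2 model choice are illustrative assumptions, not the notebook's exact code:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

def ingest_data_into_qdrant(client, collection_name, embedding_model, chunks):
    """Embed text chunks and upsert them into a Qdrant collection (sketch)."""
    vectors = embedding_model.encode(chunks)
    client.recreate_collection(
        collection_name=collection_name,
        vectors_config=VectorParams(size=vectors.shape[1], distance=Distance.COSINE),
    )
    client.upsert(
        collection_name=collection_name,
        points=[
            PointStruct(id=i, vector=vector.tolist(), payload={"text": chunk})
            for i, (vector, chunk) in enumerate(zip(vectors, chunks))
        ],
    )

# ":memory:" keeps the collection in RAM, which is why ingestion must rerun on every start.
client = QdrantClient(":memory:")
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
```
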
Step 5: Run the Notebook

With your environment configured and data ingestion prepared, you can now open the Jupyter notebook and run the blog generator.

Start Jupyter:

```bash
jupyter notebook
```

Open the Notebook: In your browser, navigate to the technical_blogger folder and open Technical_Blog_Generator.ipynb.

Run Cells: Execute each cell in the notebook sequentially. This will:

- Initialize the Llama API client.
- Set up the in-memory Qdrant database and ingest your provided knowledge base.
- Load helper functions for querying.
- Allow you to specify a blog topic (see the usage sketch below).
- Trigger the RAG process to generate and display the blog post.
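
In practice, the final cells boil down to a call like the following; generate_blog's exact signature and return type are defined in the notebook, so treat this as an illustrative sketch that assumes a topic string in and a Markdown string out:

```python
from IPython.display import Markdown, display

# Hypothetical usage: pass a topic, render the generated post inline.
blog_post = generate_blog("Getting started with Llama fine-tuning")
display(Markdown(blog_post))
```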
|
|
|
-
|
|
|
-Customization
|
|
|
-Knowledge Base: Expand your knowledge base by adding more technical documentation files. Remember to update the data ingestion logic in generate_blog (Step 4) to include these new sources.
|
|
|
-
|
|
|
-LLM Model: Experiment with different Llama models by changing the LLAMA_MODEL variable in the configuration.
|
|
|
-
|
|
|
-Prompt Engineering: Modify the system_prompt within the generate_blog function to control the tone, structure, depth, and specific requirements for your generated blog posts.
|
|
|
-
|
|
|
-RAG Parameters: Adjust top_k in the query_qdrant function to retrieve more or fewer relevant chunks. You can also experiment with different embedding models or reranking models.
|
|
|
|
|
|
-Output Format: Customize the output formatting if you need something other than Markdown.
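
For instance, a retrieval helper in the spirit of query_qdrant might expose top_k as shown below; this is a sketch that assumes the in-memory client and embedding model from Step 4, not the notebook's exact implementation:

```python
def query_qdrant(client, collection_name, embedding_model, query, top_k=5):
    """Embed the query and return the payloads of the top_k most similar chunks."""
    query_vector = embedding_model.encode(query).tolist()
    hits = client.search(
        collection_name=collection_name,
        query_vector=query_vector,
        limit=top_k,
    )
    return [hit.payload["text"] for hit in hits]

# A larger top_k gives the model more context, at the cost of more noise.
context_chunks = query_qdrant(client, MAIN_COLLECTION_NAME, embedding_model,
                              "How do I fine-tune Llama?", top_k=8)
```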
|