Sanyam Bhutani 1 month ago
parent
commit
f06722ee8e

+ 6 - 4
end-to-end-use-cases/NotebookLlama/README.md

@@ -1,4 +1,6 @@
-## NotebookLlama: An Open Source version of NotebookLM
+## NotebookLlama: PDF to Podcast using Llama models
+
+> Note: We have updated this to support Llama API, sign up [here](http://llama.com)
 
 ![NotebookLlama](./resources/Outline.jpg)
 
@@ -15,13 +17,13 @@ It assumes zero knowledge of LLMs, prompting and audio models, everything is cov
 Here is step by step thought (pun intended) for the task:
 
 - Step 1: Pre-process PDF: Use `Llama-3.2-1B-Instruct` to pre-process the PDF and save it in a `.txt` file.
-- Step 2: Transcript Writer: Use `Llama-3.1-70B-Instruct` model to write a podcast transcript from the text
-- Step 3: Dramatic Re-Writer: Use `Llama-3.1-8B-Instruct` model to make the transcript more dramatic
+- Step 2: Transcript Writer: Use `Llama-4-Maverick` model to write a podcast transcript from the text
+- Step 3: Dramatic Re-Writer: Use `Llama-3-8B-Instruct` model to make the transcript more dramatic
 - Step 4: Text-To-Speech Workflow: Use `parler-tts/parler-tts-mini-v1` and `bark/suno` to generate a conversational podcast
 
 Note 1: In Step 1, we prompt the 1B model to not modify the text or summarize it, strictly clean up extra characters or garbage characters that might get picked due to encoding from PDF. Please see the prompt in Notebook 1 for more details.
 
-Note 2: For Step 2, you can also use `Llama-3.1-8B-Instruct` model, we recommend experimenting and trying if you see any differences. The 70B model was used here because it gave slightly more creative podcast transcripts for the tested examples.
+Note 2: For Step 2, you can also use `Llama-3-8B-Instruct` model, we recommend experimenting and trying if you see any differences. The 70B model was used here because it gave slightly more creative podcast transcripts for the tested examples.
 
 Note 3: For Step 4, please try to extend the approach with other models. These models were chosen based on a sample prompt and worked best, newer models might sound better. Please see [Notes](./TTS_Notes.md) for some of the sample tests.
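
The text stages listed in this hunk (Steps 1 to 3) can be sketched as a simple chain in which each stage's output becomes the next stage's input. This is a minimal illustration only, not the notebooks' actual code: the stage keys, `run_pipeline`, and `fake_generate` are hypothetical names, and the stub stands in for a real model call (for example via the Llama API or a local model). Only the model names come from the updated README. Step 4 (text-to-speech) is left out because it produces audio rather than text.

```python
from typing import Callable, Dict, List, Tuple

# (stage key, model name). Model names are taken from the updated README;
# the stage keys are hypothetical labels used only in this sketch.
STAGES: List[Tuple[str, str]] = [
    ("preprocess_pdf", "Llama-3.2-1B-Instruct"),
    ("write_transcript", "Llama-4-Maverick"),
    ("rewrite_dramatic", "Llama-3-8B-Instruct"),
]


def run_pipeline(
    text: str,
    generate: Callable[[str, str, str], str],
) -> Dict[str, str]:
    """Run the text stages in order and return every intermediate output.

    `generate(model, stage, text)` is an assumed interface: any callable that
    sends `text` to `model` with a stage-specific prompt and returns a string.
    """
    outputs: Dict[str, str] = {}
    current = text
    for stage, model in STAGES:
        current = generate(model, stage, current)  # output feeds the next stage
        outputs[stage] = current
    return outputs


def fake_generate(model: str, stage: str, text: str) -> str:
    """Stub that tags the text instead of calling a real model."""
    return f"[{stage}:{model}] {text}"


if __name__ == "__main__":
    for stage, result in run_pipeline("raw PDF text", fake_generate).items():
        print(stage, "->", result)
```

Keeping the model names in one list makes it easy to try the swap suggested in Note 2 (a smaller model for Step 2) by editing a single entry.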
 

File diff suppressed because it is too large
+ 196 - 2114
end-to-end-use-cases/NotebookLlama/Step-1 PDF-Pre-Processing-Logic.ipynb


File diff suppressed because it is too large
+ 71 - 76
end-to-end-use-cases/NotebookLlama/Step-2-Transcript-Writer.ipynb


File diff suppressed because it is too large
+ 66 - 56
end-to-end-use-cases/NotebookLlama/Step-3-Re-Writer.ipynb


File diff suppressed because it is too large
+ 143 - 89
end-to-end-use-cases/NotebookLlama/Step-4-TTS-Workflow.ipynb


binary
end-to-end-use-cases/NotebookLlama/resources/2402.13116v4.pdf


binary
end-to-end-use-cases/NotebookLlama/resources/2407.21783v3.pdf


binary
end-to-end-use-cases/NotebookLlama/resources/Outline.jpg


File diff suppressed because it is too large
+ 242 - 74
end-to-end-use-cases/NotebookLlama/resources/clean_extracted_text.txt


binary
end-to-end-use-cases/NotebookLlama/resources/data.pkl


File diff suppressed because it is too large
+ 1162 - 0
end-to-end-use-cases/NotebookLlama/resources/extracted_text.txt


binary
end-to-end-use-cases/NotebookLlama/resources/podcast_ready_data.pkl