
added some notes

Sanyam Bhutani 5 months ago
parent commit 5248cb14ec

+ 2 - 0
recipes/quickstart/NotebookLlama/README.md

@@ -23,6 +23,8 @@ Note 1: In Step 1, we prompt the 1B model to not modify the text or summarize it
 
 Note 2: For Step 2, you can also use `Llama-3.1-8B-Instruct` model, we recommend experimenting and trying if you see any differences. The 70B model was used here because it gave slightly more creative podcast transcripts for the tested examples.
 
+Note 3: For Step 4, please try to extend the approach with other models. These models were chosen based on a sample prompt and worked best; newer models might sound better. Please see [Notes](./TTS_Notes.md) for some of the sample tests.
+
 ### Detailed steps on running the notebook:
 
 Requirements: GPU server or an API provider for using 70B, 8B and 1B Llama models.

+ 10 - 0
recipes/quickstart/NotebookLlama/Step-1 PDF-Pre-Processing-Logic.ipynb

@@ -2697,6 +2697,16 @@
    ]
   },
   {
+   "cell_type": "markdown",
+   "id": "3d996ac5",
+   "metadata": {},
+   "source": [
+    "### Next Notebook: Transcript Writer\n",
+    "\n",
+    "Now that we have the pre-processed text ready, we can move to converting into a transcript in the next notebook"
+   ]
+  },
+  {
    "cell_type": "code",
    "execution_count": null,
    "id": "1b16ae0e-04cf-4eb9-a369-dee1728b89ce",

+ 10 - 0
recipes/quickstart/NotebookLlama/Step-2-Transcript-Writer.ipynb

@@ -303,6 +303,16 @@
    ]
   },
   {
+   "cell_type": "markdown",
+   "id": "dbae9411",
+   "metadata": {},
+   "source": [
+    "### Next Notebook: Transcript Re-writer\n",
+    "\n",
+    "We now have a working transcript but we can try making it more dramatic and natural. In the next notebook, we will use `Llama-3.1-8B-Instruct` model to do so."
+   ]
+  },
+  {
    "cell_type": "code",
    "execution_count": null,
    "id": "d9bab2f2-f539-435a-ae6a-3c9028489628",

+ 10 - 0
recipes/quickstart/NotebookLlama/Step-3-Re-Writer.ipynb

@@ -254,6 +254,16 @@
    ]
   },
   {
+   "cell_type": "markdown",
+   "id": "2dccf336",
+   "metadata": {},
+   "source": [
+    "### Next Notebook: TTS Workflow\n",
+    "\n",
+    "Now that we have our transcript ready, we are ready to generate the audio in the next notebook."
+   ]
+  },
+  {
    "cell_type": "code",
    "execution_count": null,
    "id": "21c7e456-497b-4080-8b52-6f399f9f8d58",

+ 18 - 7
recipes/quickstart/NotebookLlama/Step-4-TTS-Workflow.ipynb

@@ -11,7 +11,9 @@
     "\n",
     "In this notebook, we will learn how to generate Audio using both `suno/bark` and `parler-tts/parler-tts-mini-v1` models first. \n",
     "\n",
-    "After that, we will use the output from Notebook 3 to generate our complete podcast"
+    "After that, we will use the output from Notebook 3 to generate our complete podcast\n",
+    "\n",
+    "Note: Please feel free to extend this notebook with newer models. The above two were chosen after some tests using a sample prompt."
    ]
   },
   {
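The markdown cell above mentions both `suno/bark` and `parler-tts/parler-tts-mini-v1`. For readers following the diff, here is a minimal sketch of driving Bark through Hugging Face `transformers` (not part of this commit; the voice preset and output filename are illustrative choices):

```python
# Minimal Bark sketch: synthesize one line of speech and save it as a WAV file.
import torch
from scipy.io import wavfile
from transformers import AutoProcessor, BarkModel

device = "cuda" if torch.cuda.is_available() else "cpu"

processor = AutoProcessor.from_pretrained("suno/bark")
model = BarkModel.from_pretrained("suno/bark").to(device)

# Bark controls the voice via a preset rather than a free-text description.
inputs = processor("Hello, and welcome to our podcast!",
                   voice_preset="v2/en_speaker_6").to(device)

with torch.no_grad():
    audio = model.generate(**inputs)

wavfile.write("bark_sample.wav",
              rate=model.generation_config.sample_rate,
              data=audio.cpu().numpy().squeeze())
```

The notebook tries both models on a sample prompt like this before generating the full podcast from the Notebook 3 output.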
@@ -117,11 +119,7 @@
    "id": "50b62df5-5ea3-4913-832a-da59f7cf8de2",
    "metadata": {},
    "source": [
-    "Generally in life, you set your device to \"cuda\" and are happy. \n",
-    "\n",
-    "However, sometimes you want to compensate for things and set it to `cuda:7` to tell the system but even more-so the world that you have 8 GPUS.\n",
-    "\n",
-    "Jokes aside please set `device = \"cuda\"` below if you're using a single GPU node."
+    "Please set `device = \"cuda\"` below if you're using a single GPU node."
    ]
   },
   {
@@ -161,7 +159,7 @@
    ],
    "source": [
     "# Set up device\n",
-    "device = \"cuda:7\" if torch.cuda.is_available() else \"cpu\"\n",
+    "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
     "\n",
     "# Load model and tokenizer\n",
     "model = ParlerTTSForConditionalGeneration.from_pretrained(\"parler-tts/parler-tts-mini-v1\").to(device)\n",
@@ -640,6 +638,19 @@
    ]
   },
   {
+   "cell_type": "markdown",
+   "id": "c7ce5836",
+   "metadata": {},
+   "source": [
+    "### Suggested Next Steps:\n",
+    "\n",
+    "- Experiment with the prompts: Please feel free to experiment with the SYSTEM_PROMPT in the notebooks\n",
+    "- Extend workflow beyond two speakers\n",
+    "- Test other TTS Models\n",
+    "- Experiment with Speech Enhancer models as a step 5."
+   ]
+  },
+  {
    "cell_type": "code",
    "execution_count": null,
    "id": "26cc56c5-b9c9-47c2-b860-0ea9f05c79af",