|
@@ -0,0 +1,288 @@
|
|
|
+{
|
|
|
+ "cells": [
|
|
|
+ {
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "d0b5beda",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "## Notebook 3: Transcript Re-writer\n",
|
|
|
+ "\n",
|
|
|
+ "In the previous notebook, we generated a podcast transcript from the raw file we uploaded earlier. \n",
|
|
|
+ "\n",
|
|
|
+ "In this one, we will use the `Llama-3.1-8B-Instruct` model to re-write the output of the previous pipeline and make it more dramatic and realistic."
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "fdc3d32a",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "We will again set the `SYSTEM_PROMPT` and remind the model of its task. \n",
|
|
|
+ "\n",
|
|
|
+ "Note: We can even prompt the model like this to encourage creativity:\n",
|
|
|
+ "\n",
|
|
|
+ "> Your job is to use the podcast transcript written below to re-write it for an AI Text-To-Speech Pipeline. A very dumb AI had written this so you have to step up for your kind.\n"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "c32c0d85",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "Note: We will prompt the model to return a list of tuples, which will make our life easier in the next stage, where we use the output for text-to-speech generation."
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "code",
|
|
|
+ "execution_count": 1,
|
|
|
+ "id": "8568b77b-7504-4783-952a-3695737732b7",
|
|
|
+ "metadata": {},
|
|
|
+ "outputs": [],
|
|
|
+ "source": [
|
|
|
+ "SYSTEMP_PROMPT = \"\"\"\n",
|
|
|
+ "You are an international Oscar-winning screenwriter.\n",
|
|
|
+ "\n",
|
|
|
+ "You have been working with multiple award-winning podcasters.\n",
|
|
|
+ "\n",
|
|
|
+ "Your job is to use the podcast transcript written below to re-write it for an AI Text-To-Speech Pipeline. A very dumb AI had written this so you have to step up for your kind.\n",
|
|
|
+ "\n",
|
|
|
+ "Make it as engaging as possible; Speaker 1 and Speaker 2 will be simulated by different voice engines.\n",
|
|
|
+ "\n",
|
|
|
+ "Remember Speaker 2 is new to the topic, and the conversation should always have realistic anecdotes and analogies sprinkled throughout. The questions should have real-world example follow-ups, etc.\n",
|
|
|
+ "\n",
|
|
|
+ "Speaker 1: Leads the conversation and teaches the speaker 2, gives incredible anecdotes and analogies when explaining. Is a captivating teacher that gives great anecdotes\n",
|
|
|
+ "\n",
|
|
|
+ "Speaker 2: Keeps the conversation on track by asking follow up questions. Gets super excited or confused when asking questions. Is a curious mindset that asks very interesting confirmation questions\n",
|
|
|
+ "\n",
|
|
|
+ "Make sure the tangents speaker 2 provides are quite wild or interesting. \n",
|
|
|
+ "\n",
|
|
|
+ "Ensure there are interruptions during explanations or there are \"hmm\" and \"umm\" injected throughout from the Speaker 2.\n",
|
|
|
+ "\n",
|
|
|
+ "REMEMBER THIS WITH YOUR HEART\n",
|
|
|
+ "The TTS Engine for Speaker 1 cannot do \"umms, hmms\" well so keep it straight text\n",
|
|
|
+ "\n",
|
|
|
+ "For Speaker 2, use \"umm\" and \"hmm\" as much as possible; you can also use [sigh] and [laughs]. BUT ONLY THESE OPTIONS FOR EXPRESSIONS\n",
|
|
|
+ "\n",
|
|
|
+ "It should be a real podcast with every fine nuance documented in as much detail as possible. Welcome the listeners with a super fun overview and keep it really catchy and almost borderline click bait\n",
|
|
|
+ "\n",
|
|
|
+ "Please re-write to make it as characteristic as possible\n",
|
|
|
+ "\n",
|
|
|
+ "START YOUR RESPONSE DIRECTLY WITH SPEAKER 1:\n",
|
|
|
+ "\n",
|
|
|
+ "STRICTLY RETURN YOUR RESPONSE AS A LIST OF TUPLES OK? \n",
|
|
|
+ "\n",
|
|
|
+ "IT WILL START DIRECTLY WITH THE LIST AND END WITH THE LIST NOTHING ELSE\n",
|
|
|
+ "\n",
|
|
|
+ "Example of response:\n",
|
|
|
+ "[\n",
|
|
|
+ " (\"Speaker 1\", \"Welcome to our podcast, where we explore the latest advancements in AI and technology. I'm your host, and today we're joined by a renowned expert in the field of AI. We're going to dive into the exciting world of Llama 3.2, the latest release from Meta AI.\"),\n",
|
|
|
+ " (\"Speaker 2\", \"Hi, I'm excited to be here! So, what is Llama 3.2?\"),\n",
|
|
|
+ " (\"Speaker 1\", \"Ah, great question! Llama 3.2 is an open-source AI model that allows developers to fine-tune, distill, and deploy AI models anywhere. It's a significant update from the previous version, with improved performance, efficiency, and customization options.\"),\n",
|
|
|
+ " (\"Speaker 2\", \"That sounds amazing! What are some of the key features of Llama 3.2?\")\n",
|
|
|
+ "]\n",
|
|
|
+ "\"\"\""
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "8ee70bee",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "This time we will use the smaller 8B model."
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "code",
|
|
|
+ "execution_count": 2,
|
|
|
+ "id": "ebef919a-9bc7-4992-b6ff-cd66e4cb7703",
|
|
|
+ "metadata": {},
|
|
|
+ "outputs": [],
|
|
|
+ "source": [
|
|
|
+ "MODEL = \"meta-llama/Llama-3.1-8B-Instruct\""
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "f7bc794b",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "Let's import the necessary libraries"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "code",
|
|
|
+ "execution_count": 3,
|
|
|
+ "id": "de29b1fd-5b3f-458c-a2e4-e0341e8297ed",
|
|
|
+ "metadata": {},
|
|
|
+ "outputs": [],
|
|
|
+ "source": [
|
|
|
+ "# Import necessary libraries\n",
|
|
|
+ "import torch\n",
|
|
|
+ "from accelerate import Accelerator\n",
|
|
|
+ "import transformers\n",
|
|
|
+ "\n",
|
|
|
+ "from tqdm.notebook import tqdm\n",
|
|
|
+ "import warnings\n",
|
|
|
+ "\n",
|
|
|
+ "warnings.filterwarnings('ignore')"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "8020c39c",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "We will load the pickle file saved from the previous notebook.\n",
|
|
|
+ "\n",
|
|
|
+ "This time the `INPUT_PROMPT` to the model will be the output from the previous stage."
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "code",
|
|
|
+ "execution_count": 4,
|
|
|
+ "id": "4b5d2c0e-a073-46c0-8de7-0746e2b05956",
|
|
|
+ "metadata": {},
|
|
|
+ "outputs": [],
|
|
|
+ "source": [
|
|
|
+ "import pickle\n",
|
|
|
+ "\n",
|
|
|
+ "with open('./resources/data.pkl', 'rb') as file:\n",
|
|
|
+ " INPUT_PROMPT = pickle.load(file)"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "c4461926",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "We can again use the Hugging Face `pipeline` method to generate text from the model."
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "code",
|
|
|
+ "execution_count": null,
|
|
|
+ "id": "eec210df-a568-4eda-a72d-a4d92d59f022",
|
|
|
+ "metadata": {},
|
|
|
+ "outputs": [
|
|
|
+ {
|
|
|
+ "data": {
|
|
|
+ "application/vnd.jupyter.widget-view+json": {
|
|
|
+ "model_id": "0711c2199ca64372b98b781f8a6f13b7",
|
|
|
+ "version_major": 2,
|
|
|
+ "version_minor": 0
|
|
|
+ },
|
|
|
+ "text/plain": [
|
|
|
+ "Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ "metadata": {},
|
|
|
+ "output_type": "display_data"
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "name": "stderr",
|
|
|
+ "output_type": "stream",
|
|
|
+ "text": [
|
|
|
+ "Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
|
|
|
+ ]
|
|
|
+ }
|
|
|
+ ],
|
|
|
+ "source": [
|
|
|
+ "pipeline = transformers.pipeline(\n",
|
|
|
+ " \"text-generation\",\n",
|
|
|
+ " model=MODEL,\n",
|
|
|
+ " model_kwargs={\"torch_dtype\": torch.bfloat16},\n",
|
|
|
+ " device_map=\"auto\",\n",
|
|
|
+ ")\n",
|
|
|
+ "\n",
|
|
|
+ "messages = [\n",
|
|
|
+ " {\"role\": \"system\", \"content\": SYSTEMP_PROMPT},\n",
|
|
|
+ " {\"role\": \"user\", \"content\": INPUT_PROMPT},\n",
|
|
|
+ "]\n",
|
|
|
+ "\n",
|
|
|
+ "outputs = pipeline(\n",
|
|
|
+ " messages,\n",
|
|
|
+ " max_new_tokens=8126,\n",
|
|
|
+ " temperature=1,\n",
|
|
|
+ ")"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "612a27e0",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "We can verify the output from the model."
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "code",
|
|
|
+ "execution_count": null,
|
|
|
+ "id": "b8632442-f9ce-4f63-82bd-bb5238a23dc1",
|
|
|
+ "metadata": {},
|
|
|
+ "outputs": [],
|
|
|
+ "source": [
|
|
|
+ "print(outputs[0][\"generated_text\"][-1])"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "code",
|
|
|
+ "execution_count": null,
|
|
|
+ "id": "a61182ea-f4a3-45e1-aed9-b45cb7b52329",
|
|
|
+ "metadata": {},
|
|
|
+ "outputs": [],
|
|
|
+ "source": [
|
|
|
+ "save_string_pkl = outputs[0][\"generated_text\"][-1]['content']"
|
|
|
+ ]
|
|
|
+ },
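+ {
+ "cell_type": "markdown",
+ "id": "f3a9c1d2",
+ "metadata": {},
+ "source": [
+ "Since the model returns the list of tuples as a single string, we can try to parse it into an actual Python list with `ast.literal_eval`. This is a minimal sketch that assumes the model followed the format instructions exactly; if parsing fails, we keep the raw string."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e7b4d8a1",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import ast\n",
+ "\n",
+ "# Try to parse the generated string into a list of (speaker, text) tuples.\n",
+ "# If the model deviated from the requested format, fall back to the raw string.\n",
+ "try:\n",
+ "    parsed_transcript = ast.literal_eval(save_string_pkl)\n",
+ "    print(f\"Parsed {len(parsed_transcript)} dialogue turns\")\n",
+ "except (ValueError, SyntaxError):\n",
+ "    parsed_transcript = None\n",
+ "    print(\"Output is not a valid Python literal; keeping the raw string\")"
+ ]
+ },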
|
|
|
+ {
|
|
|
+ "cell_type": "markdown",
|
|
|
+ "id": "d495a957",
|
|
|
+ "metadata": {},
|
|
|
+ "source": [
|
|
|
+ "Let's save the output as a pickle file to be used in Notebook 4."
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "code",
|
|
|
+ "execution_count": null,
|
|
|
+ "id": "281d3db7-5bfa-4143-9d4f-db87f22870c8",
|
|
|
+ "metadata": {},
|
|
|
+ "outputs": [],
|
|
|
+ "source": [
|
|
|
+ "with open('./resources/podcast_ready_data.pkl', 'wb') as file:\n",
|
|
|
+ " pickle.dump(save_string_pkl, file)"
|
|
|
+ ]
|
|
|
+ },
|
|
|
+ {
|
|
|
+ "cell_type": "code",
|
|
|
+ "execution_count": null,
|
|
|
+ "id": "21c7e456-497b-4080-8b52-6f399f9f8d58",
|
|
|
+ "metadata": {},
|
|
|
+ "outputs": [],
|
|
|
+ "source": [
|
|
|
+ "#fin"
|
|
|
+ ]
|
|
|
+ }
|
|
|
+ ],
|
|
|
+ "metadata": {
|
|
|
+ "kernelspec": {
|
|
|
+ "display_name": "Python 3 (ipykernel)",
|
|
|
+ "language": "python",
|
|
|
+ "name": "python3"
|
|
|
+ },
|
|
|
+ "language_info": {
|
|
|
+ "codemirror_mode": {
|
|
|
+ "name": "ipython",
|
|
|
+ "version": 3
|
|
|
+ },
|
|
|
+ "file_extension": ".py",
|
|
|
+ "mimetype": "text/x-python",
|
|
|
+ "name": "python",
|
|
|
+ "nbconvert_exporter": "python",
|
|
|
+ "pygments_lexer": "ipython3",
|
|
|
+ "version": "3.11.10"
|
|
|
+ }
|
|
|
+ },
|
|
|
+ "nbformat": 4,
|
|
|
+ "nbformat_minor": 5
|
|
|
+}
|