### Ideas: NotebookLLama

Steps:  
Path:

1. Decide the Topic
- Upload a PDF
OR - Put in a topic -> Scraped 
- Report written 

2. 2 Agents debate/interact? Podcast style -> Write a Transcript

3. TTS Engine (E25 or ) Make the podcast

### Instructions: 

Running 1B-Model: ```python 1B-chat-start.py --temperature 0.7 --top_p 0.9 --system_message "you are acting as an old angry uncle and will debate why LLMs are bad" --user_message "I love LLMs"```

Running Debator: ```python 1B-debating-script.py --initial_topic "The future of space exploration" --system_prompt1 "You are an enthusiastic advocate for space exploration" --system_prompt2 "You are a skeptic who believes we should focus on Earth's problems first" --n_turns 4 --temperature 0.8 --top_p 0.9 --model_name "meta-llama/Llama-3.2-1B-Instruct"```

### Scratch-pad/Running Notes:

Bark is cool but just v6 works great, I tried v9 but its quite robotic and that is sad. 

So Parler is next-its quite cool for prompting 

xTTS-v2 by coquai is cool, however-need to check the license-I think an example is allowed

Torotoise is blocking because it needs HF version that doesnt work with llama-3.2 models so I will probably need to make a seperate env-need to eval if its worth it

Side note: The TTS library is a really cool effort!

Bark-Tests: Best results for speaker/v6 are at ```speech_output = model.generate(**inputs, temperature = 0.9, semantic_temperature = 0.8)
Audio(speech_output[0].cpu().numpy(), rate=sampling_rate)```

Tested sound effects:

- Laugh is probably most effective
- Sigh is hit or miss
- Gasps doesn't work
- A singly hypen is effective
- Captilisation makes it louder

Ignore/Delete this in final stages, right now this is a "vibe-check" for TTS model(s):

- https://github.com/SWivid/F5-TTS: Latest and most popular-"feels robotic"
- Reddit says E2 model from earlier is better

Starting with: Bark but if it falls apart, here is the order

- 0: https://huggingface.co/suno/bark
- 1: https://huggingface.co/WhisperSpeech/WhisperSpeech
- 2: https://huggingface.co/spaces/parler-tts/parler_tts


Vibe check: 
- This is most popular (ever) on HF and features different accents-the samples feel a little robotic and no accent difference: https://huggingface.co/myshell-ai/MeloTTS-English
- Seems to have great documentation but still a bit robotic for my liking: https://coqui.ai/blog/tts/open_xtts
- Super easy with laughter etc but very slightly robotic: https://huggingface.co/suno/bark
- This is THE MOST NATURAL SOUNDING: https://huggingface.co/WhisperSpeech/WhisperSpeech
- This has a lot of promise, even though its robotic, we can use natural voice to add filters or effects: https://huggingface.co/spaces/parler-tts/parler_tts

Higher Barrier to testing (In other words-I was too lazy to test):
- https://huggingface.co/fishaudio/fish-speech-1.4
- https://huggingface.co/facebook/mms-tts-eng
- https://huggingface.co/metavoiceio/metavoice-1B-v0.1
- https://huggingface.co/nvidia/tts_hifigan
- https://huggingface.co/speechbrain/tts-tacotron2-ljspeech


Try later:
- Whisper Colab: 
- https://huggingface.co/parler-tts/parler-tts-large-v1
- https://huggingface.co/myshell-ai/MeloTTS-English
- Bark: https://huggingface.co/suno/bark (This has been insanely popular)
- https://huggingface.co/facebook/mms-tts-eng
- https://huggingface.co/fishaudio/fish-speech-1.4
- https://huggingface.co/mlx-community/mlx_bark
- https://huggingface.co/metavoiceio/metavoice-1B-v0.1
- https://huggingface.co/suno/bark-small

### Resources I used to learn about Suno:

- https://betterprogramming.pub/text-to-audio-generation-with-bark-clearly-explained-4ee300a3713a
- https://colab.research.google.com/drive/1dWWkZzvu7L9Bunq9zvD-W02RFUXoW-Pd?usp=sharing
- https://colab.research.google.com/drive/1eJfA2XUa-mXwdMy7DoYKVYHI1iTd9Vkt?usp=sharing#scrollTo=NyYQ--3YksJY
- https://replicate.com/suno-ai/bark?prediction=zh8j6yddxxrge0cjp9asgzd534