This project is a guide to building an AI-powered browser agent that navigates and interacts with websites autonomously. It combines Llama 3.2 Vision, Playwright, and Together AI so that the agent can act on both the visual and the textual content of a page.
Persistent Session Management: Maintains browser sessions for continuous interaction.
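One way to keep a browser session alive between runs is Playwright's persistent context, which stores cookies and local storage in a directory on disk. The sketch below is illustrative rather than the project's actual code: the helper names and the `agent-profile` directory name are assumptions, and Playwright must be installed (`pip install playwright` followed by `playwright install chromium`) before `open_persistent_page` can run.

```python
from pathlib import Path


def ensure_profile_dir(base: Path) -> Path:
    """Create (if needed) and return the directory that holds the browser profile."""
    profile = base / "agent-profile"  # hypothetical directory name
    profile.mkdir(parents=True, exist_ok=True)
    return profile


def open_persistent_page(profile_dir: Path, url: str) -> str:
    """Open url in a Chromium context backed by profile_dir and return the page title."""
    # Imported lazily so ensure_profile_dir works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Cookies and storage are written to profile_dir and reused on the next launch.
        context = p.chromium.launch_persistent_context(str(profile_dir), headless=True)
        page = context.pages[0] if context.pages else context.new_page()
        page.goto(url)
        title = page.title()
        context.close()
        return title
```

Calling `open_persistent_page` twice with the same profile directory reuses whatever login state the first run left behind.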
Example tasks the agent can perform:
- Search for a product on Amazon.
- Find the cheapest flight to Tokyo.
- Purchase tickets for the next Warriors game.
The guide covers:
- Environment setup instructions
- Browser automation guides using Playwright
- Structured prompting techniques for guiding the LLM in task execution
- Content comprehension using Llama 3.2 Vision
- Creating a persistent and intelligent browser agent for real-world applications
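To make the structured prompting idea concrete, here is a minimal sketch of one possible approach: the prompt asks the model for a single JSON object, and the reply is validated before anything is executed in the browser. The action set, the `Action` type, and both function names are assumptions made for illustration and are not taken from this project.

```python
import json
from dataclasses import dataclass

# Hypothetical set of actions the agent is allowed to perform.
ALLOWED_ACTIONS = {"goto", "click", "type", "done"}


@dataclass
class Action:
    name: str
    target: str = ""
    text: str = ""


def build_prompt(task: str, page_text: str) -> str:
    """Build a prompt that asks the model for exactly one JSON action."""
    actions = ", ".join(sorted(ALLOWED_ACTIONS))
    return (
        "You control a web browser.\n"
        f"Task: {task}\n"
        f"Visible page text:\n{page_text}\n"
        "Reply with one JSON object with the keys "
        f'"action" (one of: {actions}), "target", and "text".'
    )


def parse_action(reply: str) -> Action:
    """Extract and validate the JSON action from a model reply."""
    # Models often wrap JSON in prose, so take the outermost {...} span.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object in model reply")
    data = json.loads(reply[start:end + 1])
    name = data.get("action")
    if name not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {name!r}")
    return Action(name, str(data.get("target", "")), str(data.get("text", "")))
```

Rejecting anything outside the allowed set keeps a malformed or unexpected reply from being passed on to the browser.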
For a detailed explanation and demo video, visit: Blog Post and Demo Video
Before getting started, set up a Together AI account and get an API key.
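A common way to supply the key is through an environment variable rather than hard-coding it. The sketch below assumes the variable is named `TOGETHER_API_KEY`; the function name is hypothetical, and the project may load the key differently.

```python
import os


def load_api_key(env=os.environ, var_name="TOGETHER_API_KEY"):
    """Return the API key from the environment, or fail with a clear message."""
    # var_name is an assumed variable name; adjust it to match your setup.
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before starting the agent.")
    return key
```

Failing early with an explicit message is usually easier to debug than an authentication error raised in the middle of a browsing task.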
Feel free to reach out with any questions or feedback!