In this folder, we show various notebook examples for running Llama model inference on Azure's serverless API offerings. We will cover:

* HTTP requests API usage for Llama 3 instruct models in CLI
* HTTP requests API usage for Llama 3 instruct models in Python
* Plugging the APIs into LangChain
* Wiring the model with Gradio to build a simple chatbot with memory
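
As a flavor of the Python examples, here is a minimal sketch of building a chat-completions request against a serverless endpoint with only the standard library. The endpoint URL, the `/v1/chat/completions` path, and the `Authorization: Bearer` header follow the common OpenAI-compatible convention; treat them as assumptions and substitute the values shown for your actual deployment:

```python
import json
import urllib.request

# Assumed placeholder values -- replace with your deployment's endpoint and key.
ENDPOINT = "https://<your-deployment>.<region>.models.ai.azure.com/v1/chat/completions"
API_KEY = "<your-api-key>"

def build_chat_request(messages, max_tokens=256, temperature=0.7):
    """Build (but do not send) an HTTP request for a chat completion."""
    payload = {
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )

req = build_chat_request([{"role": "user", "content": "What is the capital of France?"}])
# Sending requires a live endpoint: response = urllib.request.urlopen(req)
```

The notebooks walk through the same call with real endpoints, first via `curl` on the command line and then from Python.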