
This folder contains notebook examples for running Llama model inference on Azure's serverless API offerings. We will cover:

  • Calling the HTTP API for Llama 3 instruct models from the CLI
  • Calling the HTTP API for Llama 3 instruct models from Python
  • Plugging the API into LangChain
  • Wiring the model into Gradio to build a simple chatbot with memory
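As a rough illustration of the HTTP-based usage above, the sketch below builds a chat-completion request with only the standard library. The endpoint URL, API key, and request-body fields are placeholders and assumptions for illustration, not values taken from this repository; consult the notebook and your Azure deployment for the exact endpoint and schema.

```python
# Minimal sketch of an HTTP chat-completion request to a serverless Llama
# endpoint. The URL and key are PLACEHOLDERS (assumptions), not real values.
import json
import urllib.request


def build_chat_request(endpoint: str, api_key: str, messages: list) -> urllib.request.Request:
    """Assemble a POST request carrying a chat-completion payload."""
    body = json.dumps({
        "messages": messages,      # assumed chat-style message list
        "max_tokens": 128,         # assumed generation parameters
        "temperature": 0.7,
    }).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,  # assumed bearer-key auth scheme
    }
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")


req = build_chat_request(
    "https://<your-deployment>.example.invalid/v1/chat/completions",  # placeholder
    "<your-api-key>",                                                 # placeholder
    [{"role": "user", "content": "Hello, Llama!"}],
)
print(req.method)
# To actually send it: resp = urllib.request.urlopen(req); print(json.load(resp))
```

The same request can be issued from the CLI with `curl` by passing the identical JSON body and headers, which is the pattern the first two bullets walk through.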