This website works better with JavaScript
Home
Explore
Help
Sign In
radu
/
LLamaRecipes
mirror of
https://github.com/facebookresearch/llama-recipes.git
Watch
1
Star
0
Fork
0
Files
Issues
0
Wiki
Tree:
7488295577
Branches
Tags
+refs/pull/572/head
Adding_open_colab
Fix-broken-format-in-preview-for-RAG-chatbo-example
Getting_to_know_Llama
HamidShojanazeri-patch-1
IgorKasianenko-patch-1
IgorKasianenko-patch-2
IgorKasianenko-patch-3
LG3-nits-after-launch
Multi-Modal-RAG-Demo
Remove-Tokenizer-onPrem-vllm-InferenceThroughput
Tool_Calling_Demos
add-old-dirs
add-promptguard-to-safety-checkers
adding_examples_with_aws
albertodepaola-patch-1
albertodepaola-patch-2
amitsangani-patch-1
archive-main
aws-do-fsdp
azure-api-example
benchmark-inference-throughput-cloud-api
book-character-mindmap
carljparker/ignore-429-in-link-check
chat-completion-fix
chat_pipeline
chat_prompt_fix
chatbot
chatbot-recipe
chatbot-with-conversation-history
chatbot-with-llama
chauhang-patch-1
check-public
chester-rag-chatbot-example
cmodi-meta-patch-1
codellama-70b
coding-assistant
connect_notebook_update
connortreacy-patch-1
cuda_profiler
dan/add-groq-cookbook-recipes
dan/add-groq-recipes
dan/fix-llama3-cookbook-example
data-tool
dead_link_fix
demos-4-llama3
demos4llama3v2
demos4llama3v3
demos4llama3v4
demos4llama3v5
dir-change
distillation-tutorial
dlai_agents
dlai_agents_colab_links
document-metadata-extraction
eval_harness
feat/tool_usage_prompt
feature/custom_dataset
feature/fsdp2
feature/length_based_batch_sampling
feature/oasst1_dataset
feature/package_distribution
feature/peft_quickstart_nb
feature/technical-blog-generator
fix-file
fix-lg-notebook-broken-dep-on-llama
fix-package-naming
fix-raft-dataset
fix-step-1
fix/amp_training
fix/contribution.md
fix/custom_dataset_chat_template
fix/image-link-whatsapp-llama4
fix/invalidate_label_for_chat
fix/load_model_with_torch_dtype_auto
fix/make_tests_run_on_cpu_instance
fix/max_length
fix/missing_copyright_headers
fix/missing_license_header
fix/no_cuda_in_test_finetuning
fix/plotting-json
fix/prefix_and_adapter_peft
fix/remove_deprecated_pytest_cmdline_preparse
fix/remove_pkg_resources
fix/test_on_cpu_only
fix/unit_test_3.2
fix/unit_tests
fix/update_octoai_model_names
fix/use_package_in_quickstart_notebook
fix/vocab_size_mismatch_inference
fix_alpaca
fix_chat_example
fix_duplicate_model_load
fix_eval
fix_local_links
fix_readme
fixes
fiximport
fixing-readme-links
fixing_chat_example
flop_counter
flop_counter_gc
fsdp-eks
fsdp-qlora
fsdp_lmm
fsdp_optimizer_overlap
ft-fw
ft-fw-2
generating-codebase-docs
gmagent
hotfix_pytest_cpu_gha_runner
ibm-wxai
image-finetuning
init27-patch-1
init27-patch-2
l3-update
leaderboardv2
lg-example-fix
llama-triage-tool
llama4_eval
llama4_ft
llama_4_api_recipes
llama_4_api_release
llamaguard-notebook-colab-link-fix
main
move_resposible_ai
optimizer_overlap
padding_updates
peft-update
pia-refactor
pptx_to_transcript
prompt-migration
prompt-ops-refactor
purpla_llama_typo_fix
qlora-fsdp
readme_update
refactor-main
refactor/move_folders
release/0.0.1
release/0.0.4.post1
release/v0.0.4
remove-duplicate-blog-generator
remove-warnings-from-reqs
research-paper-analyser
researcher
revert-profiling
rework-lg-notebook
sagemaker-demo
shorfix
small-patch
ssdp
struc_parser
subramen-patch-1
subramen-patch-2
subramen-patch-3
subramen-patch-4
subramen-patch-5
subramen-patch-6
subramen-patch-7
summarization
suraj-refactor-patch
text2sql
torchtune
transformers-version-updated
update-android-app
update-aws-recipes-folder
update-prompt-guard-tutorial
update-video
varunfb-patch-1
varunfb-patch-2
vertexNotebooks
zero-to-llama
v0.0.5
v0.0.4
v0.0.3
v0.0.2
v0.0.4.post1
LLamaRecipes
/
recipes
/
inference
/
model_servers
/
README.md
README.md
266 B
History
Raw
Running Llama 3 On-Prem with vLLM and TGI
This tutorial shows how to use Llama 3 with
vLLM
and Hugging Face
TGI
to build Llama 3 on-prem apps.