Matthias Reso
|
c167945448
remove 405B ft doc
|
1 year ago |
Matthias Reso
|
b0b4e16aec
Update docs/multi_gpu.md
|
1 year ago |
Matthias Reso
|
e2f77dbc21
fix quant config
|
1 year ago |
Matthias Reso
|
6ef9a78458
Fix issues with quantization_config == None
|
1 year ago |
Matthias Reso
|
b319a9fb8c
Fix lint issue
|
1 year ago |
Matthias Reso
|
a3fd369127
Ref from inference recipes to vllm for 405B
|
1 year ago |
Matthias Reso
|
a8f2267324
Added multi node doc to multigpu_finetuning.md
|
1 year ago |
Matthias Reso
|
afb3b75892
Add 405B + QLoRA + FSDP to multi_gpu.md doc
|
1 year ago |
Matthias Reso
|
939c88fb04
Add 405B + QLoRA ref to LLM finetuning
|
1 year ago |
Matthias Reso
|
d2fd9c163a
Added doc for multi-node vllm inference
|
1 year ago |
Matthias Reso
|
c9ae014459
Enable pipeline parallelism through use of AsyncLLMEngine in vllm inference + enable use of lora adapter
|
1 year ago |
Matthias Reso
|
0920b1a415
Fix quantization for inference
|
1 year ago |
Matthias Reso
|
b36830fdf6
Fix reading in stdin for chat_completion, remove padding as we're feeding single samples
|
1 year ago |
Matthias Reso
|
f0aa8e31ca
Update url
|
1 year ago |
Matthias Reso
|
9db61e5235
Refactored inference to allow multiple requests through gradio
|
1 year ago |
Thomas Robinson
|
fd9f52f710
Modify prompt_format_utils with changes necessary for Llama Guard 3 (#1)
|
1 year ago |
Cyrus Nikolaidis
|
0c57646481
Prompt Guard Tutorial
|
1 year ago |
Hamid Shojanazeri
|
808a3f7a0c
Adding support for FSDP+Qlora. (#572)
|
1 year ago |
Jeff Tang
|
ba447971f0
Port of DLAI LlamaIndex Agent short course lessons 2-4 to use Llama 3 (#594)
|
1 year ago |
Jeff Tang
|
935ad46a0d
wordlist update for DLAI LlamaIndex Agent short course
|
1 year ago |
Jeff Tang
|
af8838463e
added lesson summary in each notebook and README
|
1 year ago |
Jeff Tang
|
aaeba04bd6
README update
|
1 year ago |
Jeff Tang
|
353ceaae74
fix of cell order issue for L3
|
1 year ago |
dongwang218
|
ed3136f117
Update hf weight conversion script to llama 3 (#551)
|
1 year ago |
Kai Wu
|
f6617fb86a
changed --pure_bf16 to --fsdp_config.pure_bf16 and corrected "examples/" path (#587)
|
1 year ago |
Jeff Tang
|
2e4ea5b728
cell cleanup
|
1 year ago |
Jeff Tang
|
0fef52e846
README links fixed
|
1 year ago |
Jeff Tang
|
ebbf362576
L4 - replace groq with fireworks to fix rate limit
|
1 year ago |
Jeff Tang
|
945175a2ea
l3 cleanup
|
1 year ago |
Jeff Tang
|
b585e1f211
L2 llm fix - use fireworks llama 3 to overcome the groq rate limit
|
1 year ago |