Kai Wu
|
2ea7f57991
convertion missing preprocessor_config.json.
|
1 year ago |
Matthias Reso
|
e2f77dbc21
fix quant config
|
1 year ago |
Matthias Reso
|
6ef9a78458
Fix issues with quantization_config == None
|
1 year ago |
Matthias Reso
|
0920b1a415
Fix quantization for inference
|
1 year ago |
Hamid Shojanazeri
|
d51d2cce9c
adding sdpa for flash attn
|
1 year ago |
Hamid Shojanazeri
|
db8af96ff0
update the model load with native flash attn
|
1 year ago |
Matthias Reso
|
4c9cc7d223
Move modules into separate src folder
|
2 years ago |