| __init__.py | 207d2f80e9 | Make code-llama and hf-tgi inference runnable as module | 2 years ago |
| chat_utils.py | 6d9d48d619 | Use apply_chat_template instead of custom functions | 1 year ago |
| checkpoint_converter_fsdp_hf.py | 0e54f5634a | use AutoTokenizer instead of LlamaTokenizer | 1 year ago |
| llm.py | eeb45e5f2c | Updated model names for OctoAI | 1 year ago |
| model_utils.py | d51d2cce9c | adding sdpa for flash attn | 1 year ago |
| prompt_format_utils.py | bcdb5b31fe | Fixing quantization config. Removing prints | 1 year ago |
| safety_utils.py | f63ba19827 | Fixing tokenizer used for llama 3. Changing quantization configs on safety_utils. | 1 year ago |