Kai Wu 8 months ago
parent
commit
2f0a006b68

+ 6 - 0
.github/scripts/spellcheck_conf/wordlist.txt

@@ -1451,3 +1451,9 @@ openhathi
 sarvam
 subtask
 acc
+BigBench
+IFEval
+MuSR
+Multistep
+multistep
+algorithmically

+ 1 - 1
tools/benchmarks/README.md

@@ -1,4 +1,4 @@
 # Benchmarks
 
 * inference - a folder contains benchmark scripts that apply a throughput analysis for Llama models inference on various backends including on-prem, cloud and on-device.
-* llm_eval_harness - a folder contains a tool to evaluate fine-tuned Llama models including quantized models focusing on quality.  
+* llm_eval_harness - a folder that introduces `lm-evaluation-harness`, a tool for evaluating Llama models, including quantized models, with a focus on quality. We also include a recipe that reproduces Meta Llama 3.1 evaluation metrics using `lm-evaluation-harness`, and instructions for reproducing Hugging Face Open LLM Leaderboard v2 metrics.

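The README change above points readers at `lm-evaluation-harness`. As a rough sketch of what such an evaluation run looks like, a minimal CLI invocation might be the following; the model id, task choice, and batch size are illustrative assumptions, not taken from this commit:

```shell
# Sketch: evaluate a Llama model on IFEval with lm-evaluation-harness.
# Assumes `pip install lm-eval` and access to the model weights; the
# model id and task name below are examples, not from this diff.
lm_eval --model hf \
    --model_args pretrained=meta-llama/Llama-3.1-8B-Instruct \
    --tasks ifeval \
    --batch_size 8 \
    --output_path ./eval_results
```

The `--tasks` flag accepts a comma-separated list, so benchmarks such as those added to the wordlist (IFEval, MuSR, BigBench-style suites) can be combined in one run if the corresponding task configs are available.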
File diff suppressed because it is too large
+ 3 - 3
tools/benchmarks/llm_eval_harness/README.md