1 year ago · 2f0a006b68
--- a/.github/scripts/spellcheck_conf/wordlist.txt
+++ b/.github/scripts/spellcheck_conf/wordlist.txt
@@ -1451,3 +1451,9 @@ openhathi
 sarvam
 subtask
 acc
+BigBench
+IFEval
+MuSR
+Multistep
+multistep
+algorithmically
--- a/tools/benchmarks/README.md
+++ b/tools/benchmarks/README.md
@@ -1,4 +1,4 @@
 # Benchmarks
 * inference - a folder contains benchmark scripts that apply a throughput analysis for Llama models inference on various backends including on-prem, cloud and on-device.
-* llm_eval_harness - a folder contains a tool to evaluate fine-tuned Llama models including quantized models focusing on quality.  
+* llm_eval_harness - a folder that introduces `lm-evaluation-harness`, a tool to evaluate Llama models including quantized models focusing on quality. We also included a recipe that reproduces Meta 3.1 evaluation metrics Using `lm-evaluation-harness` and instructions that reproduce HuggingFace Open LLM Leaderboard v2 metrics.
--- a/tools/benchmarks/llm_eval_harness/README.md
+++ b/tools/benchmarks/llm_eval_harness/README.md