
fix readme

Kai Wu, 6 months ago
Current commit 844348993f
2 files changed, 13 insertions and 13 deletions
  1. tools/benchmarks/llm_eval_harness/README.md (+12 −12)
  2. tools/benchmarks/llm_eval_harness/meta_eval/README.md (+1 −1)

Some file diffs are too large to display
+ 12 - 12
tools/benchmarks/llm_eval_harness/README.md


+ 1 - 1
tools/benchmarks/llm_eval_harness/meta_eval/README.md

@@ -6,7 +6,7 @@ As Llama models gain popularity, evaluating these models has become increasingly
 ## Disclaimer
-1. **This recipe is not the official implementation** of Llama evaluation. It is based on public third-party libraries, as this implementation is not mirroring Llama evaluation, this may lead to minor differences in the produced numbers.
+1. **This recipe is not the official implementation** of Llama evaluation. Since our internal eval repo isn't public, we provide this recipe as an aid for anyone who wants to use the datasets we released. It is based on public third-party libraries; because this implementation does not mirror Llama evaluation exactly, the produced numbers may differ slightly.
 2. **Model Compatibility**: This tutorial is specifically for Llama 3 based models, as our prompts include Llama 3 special tokens, e.g. `<|start_header_id|>user<|end_header_id|>`. It will not work with models that are not based on Llama 3.
 
 
 ## Insights from Our Evaluation Process
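As a side note on the model-compatibility point in the diff above, here is a minimal sketch (not part of this commit; the helper name is hypothetical) of the Llama 3 single-turn chat prompt format whose special tokens the recipe's prompts rely on:

```python
# Hypothetical helper illustrating the Llama 3 chat prompt format.
# The special tokens (<|begin_of_text|>, <|start_header_id|>, <|eot_id|>)
# are specific to Llama 3 tokenizers, which is why the recipe will not
# work with non-Llama-3 models.

def build_llama3_prompt(user_message: str) -> str:
    """Assemble a single-turn Llama 3 chat prompt by hand."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("What is 2 + 2?")
print("<|start_header_id|>user<|end_header_id|>" in prompt)  # True
```

A model not trained on these tokens would treat them as ordinary text, which is what leads to the incompatibility the disclaimer warns about.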