
update readme

Kai Wu, 7 months ago
Parent
Current commit
576e574e31
17 files changed, 26 insertions(+), 26 deletions(-)
  1. tools/benchmarks/llm_eval_harness/README.md (+2 −2)
  2. tools/benchmarks/llm_eval_harness/meta_eval_reproduce/README.md (+24 −24)
  3. tools/benchmarks/llm_eval_harness/meta_eval/eval_config.yaml (+0 −0)
  4. tools/benchmarks/llm_eval_harness/meta_eval/meta_template/bbh/bbh_3shot_cot.yaml (+0 −0)
  5. tools/benchmarks/llm_eval_harness/meta_eval/meta_template/bbh/utils.py (+0 −0)
  6. tools/benchmarks/llm_eval_harness/meta_eval/meta_template/gpqa_cot/gpqa_0shot_cot.yaml (+0 −0)
  7. tools/benchmarks/llm_eval_harness/meta_eval/meta_template/gpqa_cot/utils.py (+0 −0)
  8. tools/benchmarks/llm_eval_harness/meta_eval/meta_template/ifeval/ifeval.yaml (+0 −0)
  9. tools/benchmarks/llm_eval_harness/meta_eval/meta_template/ifeval/utils.py (+0 −0)
  10. tools/benchmarks/llm_eval_harness/meta_eval/meta_template/math_hard/math_hard_0shot_cot.yaml (+0 −0)
  11. tools/benchmarks/llm_eval_harness/meta_eval/meta_template/math_hard/utils.py (+0 −0)
  12. tools/benchmarks/llm_eval_harness/meta_eval/meta_template/meta_instruct.yaml (+0 −0)
  13. tools/benchmarks/llm_eval_harness/meta_eval/meta_template/meta_pretrain.yaml (+0 −0)
  14. tools/benchmarks/llm_eval_harness/meta_eval/meta_template/mmlu_pro/mmlu_pro_5shot_cot_instruct.yaml (+0 −0)
  15. tools/benchmarks/llm_eval_harness/meta_eval/meta_template/mmlu_pro/mmlu_pro_5shot_cot_pretrain.yaml (+0 −0)
  16. tools/benchmarks/llm_eval_harness/meta_eval/meta_template/mmlu_pro/utils.py (+0 −0)
  17. tools/benchmarks/llm_eval_harness/meta_eval/prepare_meta_eval.py (+0 −0)

The file diff is too large to display
+ 2 - 2
tools/benchmarks/llm_eval_harness/README.md


The file diff is too large to display
+ 24 - 24
tools/benchmarks/llm_eval_harness/meta_eval_reproduce/README.md


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/eval_config.yaml → tools/benchmarks/llm_eval_harness/meta_eval/eval_config.yaml


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/meta_template/bbh/bbh_3shot_cot.yaml → tools/benchmarks/llm_eval_harness/meta_eval/meta_template/bbh/bbh_3shot_cot.yaml


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/meta_template/bbh/utils.py → tools/benchmarks/llm_eval_harness/meta_eval/meta_template/bbh/utils.py


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/meta_template/gpqa_cot/gpqa_0shot_cot.yaml → tools/benchmarks/llm_eval_harness/meta_eval/meta_template/gpqa_cot/gpqa_0shot_cot.yaml


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/meta_template/gpqa_cot/utils.py → tools/benchmarks/llm_eval_harness/meta_eval/meta_template/gpqa_cot/utils.py


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/meta_template/ifeval/ifeval.yaml → tools/benchmarks/llm_eval_harness/meta_eval/meta_template/ifeval/ifeval.yaml


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/meta_template/ifeval/utils.py → tools/benchmarks/llm_eval_harness/meta_eval/meta_template/ifeval/utils.py


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/meta_template/math_hard/math_hard_0shot_cot.yaml → tools/benchmarks/llm_eval_harness/meta_eval/meta_template/math_hard/math_hard_0shot_cot.yaml


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/meta_template/math_hard/utils.py → tools/benchmarks/llm_eval_harness/meta_eval/meta_template/math_hard/utils.py


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/meta_template/meta_instruct.yaml → tools/benchmarks/llm_eval_harness/meta_eval/meta_template/meta_instruct.yaml


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/meta_template/meta_pretrain.yaml → tools/benchmarks/llm_eval_harness/meta_eval/meta_template/meta_pretrain.yaml


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/meta_template/mmlu_pro/mmlu_pro_5shot_cot_instruct.yaml → tools/benchmarks/llm_eval_harness/meta_eval/meta_template/mmlu_pro/mmlu_pro_5shot_cot_instruct.yaml


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/meta_template/mmlu_pro/mmlu_pro_5shot_cot_pretrain.yaml → tools/benchmarks/llm_eval_harness/meta_eval/meta_template/mmlu_pro/mmlu_pro_5shot_cot_pretrain.yaml


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/meta_template/mmlu_pro/utils.py → tools/benchmarks/llm_eval_harness/meta_eval/meta_template/mmlu_pro/utils.py


tools/benchmarks/llm_eval_harness/meta_eval_reproduce/prepare_meta_eval.py → tools/benchmarks/llm_eval_harness/meta_eval/prepare_meta_eval.py
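The rename entries above all share one pattern: the `meta_eval_reproduce` directory was moved to `meta_eval`, and git recorded the move as a per-file rename for everything inside it. A hedged sketch of how such a commit could be produced (in a throwaway repo with a placeholder file, not the actual repository history):

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git config user.email "demo@example.com"   # placeholder identity for the demo repo
git config user.name  "Demo"

# Recreate one file from the old layout as a stand-in.
mkdir -p tools/benchmarks/llm_eval_harness/meta_eval_reproduce
touch tools/benchmarks/llm_eval_harness/meta_eval_reproduce/eval_config.yaml
git add -A
git commit -qm "initial layout"

# Move the whole directory; git tracks each contained file as a rename.
git mv tools/benchmarks/llm_eval_harness/meta_eval_reproduce \
       tools/benchmarks/llm_eval_harness/meta_eval
git commit -qm "rename meta_eval_reproduce to meta_eval"

# Show the rename as git records it (R100 = 100% content similarity,
# i.e. a pure move with no edits, matching the +0 −0 entries above).
git show --name-status --format= HEAD
```

Because the moved files are byte-identical, git's rename detection reports them at 100% similarity, which is why the web UI lists them with zero insertions and deletions.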