| Author | SHA1 and message | Commit date |
|---|---|---|
| | dc406b4769 setup meta-eval for benchmark, ray error | 7 months ago |
| | 9ffb292272 added inspect and modified harness | 7 months ago |
| | eea96618cf batching and parallelization, ran on baseline and lite | 7 months ago |
| | becbe77ff3 attempt to fix json output format in eval | 7 months ago |
| | 2776a35314 harness runcode | 8 months ago |
| | 62b53676fb update harness notebook | 8 months ago |
| | e52e1d1ab4 updated prompt migration to use benchmark and also mipro, added meta implementation | 8 months ago |