Justin Lee
|
dc406b4769
setup meta-eval for benchmark, ray error
|
7 months ago |
Justin Lee
|
9ffb292272
added inspect and modified harness
|
7 months ago |
Justin Lee
|
eea96618cf
batching and parallelization, ran on baseline and lite
|
7 months ago |
Justin Lee
|
becbe77ff3
attempt to fix json output format in eval
|
7 months ago |
Justin Lee
|
2776a35314
harness runcode
|
8 months ago |
Justin Lee
|
62b53676fb
update harness notebook
|
8 months ago |
Justin Lee
|
e52e1d1ab4
updated prompt migration to use benchmark and also mipro, added meta implementation
|
8 months ago |