Commit History

Author SHA1 Message Date
  Matthias Reso 47ae6d0326 Remove print as it breaks progress bar and update progress bar description instead 2 years ago
  lchu feaa344af3 resolve conflicts 2 years ago
  Hamid Shojanazeri 75f291fe1c resolved conflicts 2 years ago
  Hamid Shojanazeri 44ef280d31 adding flash attention and xformer memory efficient through PT SDPA 2 years ago
  luoyifan 79d0d4fc4e fix some typos. 2 years ago
  lchu 80a4c36707 further fix #90 2 years ago
  Hamid Shojanazeri 88d3e1febc fix the save_train_param condition 2 years ago
  Hamid Shojanazeri 62be60355a resolving conflicts 2 years ago
  Hamid Shojanazeri 017cadd04b Merge branch 'checkpoint_handler_path_fix' of https://github.com/facebookresearch/llama-recipes into checkpoint_handler_path_fix 2 years ago
  Hamid Shojanazeri 4f70348b94 remove the redundant lr step 2 years ago
  Hamid Shojanazeri 5b916114eb merge main branch 2 years ago
  Hamid Shojanazeri 668c364f6b add rank to save_train_params 2 years ago
  Hamid Shojanazeri 231c9e7da9 adding train_param.yaml saving for fsdp checkpoint loading for inference 2 years ago
  Hamid Shojanazeri 41dd7ff1cb Merge branch 'main' into checkpoint_handler_path_fix 2 years ago
  Hamid Shojanazeri a955ed1999 added checks for dist barrier and commented cuda exapnadable segements and dist_dbug 2 years ago
  Hamid Shojanazeri a2403c7c1a clean up 2 years ago
  Hamid Shojanazeri e9559d2669 fixing the train/eval_loss calcualtion 2 years ago
  Hamid Shojanazeri 4ba4400a75 adding dist barrier before and after checkpointing 2 years ago
  Hamid Shojanazeri a49a2c2804 adding PT cuda allocation expand flag 2 years ago
  Hamid Shojanazeri 442c1ccf7c adding barrier to end of trainer loop 2 years ago
  Hamid Shojanazeri f74d57dc08 printing scores based on fsdp usage or single gpu 2 years ago
  Hamid Shojanazeri 3d887ea483 update with active memory and removing rank0 for eval score 2 years ago
  Hamid Shojanazeri bedb96b78a fixing the full state path in checkpoint handler 2 years ago
  Hamid Shojanazeri 563e572f7c adding active mem stat 2 years ago
  Hamid Shojanazeri bd01f64cbd Merge branch 'main' into fix-cuda_id 2 years ago
  Andrew Gu 71fdc4920a Save memory and fix typos 2 years ago
  Hamid Shojanazeri a7156dfb5d fixing the cuda id 2 years ago
  Hamid Shojanazeri 707af7ea24 adding cuda:0 for non-fsdp situations 2 years ago
  Hamid Shojanazeri 6678be75ad fixing identation 2 years ago
  Hamid Shojanazeri 6a84e9e4d5 fixing scaler for both fsdp and non fsdp 2 years ago