Commit History

Author SHA1 Message Date
  celestinoalan 2a94bfff26 Append epoch rather than best val. loss to val_loss 11 months ago
  celestinoalan d6ae2031c3 Fix fine-tuning training loss accumulation (#725) 1 year ago
  Sanyam Bhutani 5fdeb55058 chore: update train_utils.py (#690) 1 year ago
  Lucas Ventura 2774065891 Improve model checkpoint saving logic (#691) 1 year ago
  Ikko Eltociear Ashimine 139824aafa chore: update train_utils.py 1 year ago
  Kai Wu c18a0d277f changed dataset to ocrvqa 1 year ago
  Kai Wu bd22f407d5 changed to aid2 dataset 1 year ago
  Kai Wu 79dbe05a94 batch fine-tuning lmm working 1 year ago
  Kai Wu 12da109823 Merge branch 'main' into lmm_finetune 1 year ago
  Kai Wu bb990be967 not working, need create dataloader function 1 year ago
  Matthias Reso 778e31e35c Fix checkpoint saving (#650) 1 year ago
  Kai Wu ee204ccb98 working now 1 year ago
  Kai Wu b566582a86 finetune not working with fsdp 1 year ago
  Matthias Reso eca526526c Use new get_model_state_dict api for save_pretrained peft model (#629) 1 year ago
  Matthias Reso 7a8c52cb38 Remove pkg_resources.packaging 1 year ago
  simwiki 66e1867120 Fix save metric FileNotFoundError when finetuning 1 year ago
  Kai Wu 26e877fd42 changed readme, unified the context interface and added get_flops_per_sec() 1 year ago
  Kai Wu d9558c11ca changed context name and add more docs 1 year ago
  Kai Wu 03f1ca7817 fixed some typo to pass spellcheck 1 year ago
  Kai Wu 7b1a9413d2 fixed a typo 1 year ago
  Kai Wu 41434dc825 formatted and removed duplicated or unused function get_total_flops() and byte2mb() 1 year ago
  Kai Wu f2e80bae22 created a FlopMeasure class on top of FlopCounterMode instead of keep of copy of our own tflop_counter.py 1 year ago
  Kai Wu 69e46887b4 handling incorrect profiling early stop caused by max_train_steps and add profiler.step() for each train step 1 year ago
  Kai Wu 34e0bf4c6e second draft of this feature, seems to be working now 1 year ago
  Kai Wu a35519ee90 fixed typo and handling unexpected exit 1 year ago
  Kai Wu 2a5de9b448 first draft of flop counter feature 1 year ago
  Kai Wu e6f69f84ad add max_steps_reached to reduce redundancy 1 year ago
  Kai Wu fa0a389f74 add max_step feature for training and eval 1 year ago
  jpgard 6954b16b3b only save training params on rank 0 1 year ago
  Hamid Shojanazeri 761b7e6e51 adding wandb_run ro eval 1 year ago