fix the idx issue for labels

Kai Wu 8 months ago
parent
commit
abe44c0237
1 changed file with 1 addition and 1 deletion
+ 1 - 1
recipes/quickstart/finetuning/datasets/vqa_dataset.py

@@ -38,7 +38,7 @@ def tokenize_dialog(dialog, images, processor):
             # found prompt header, indicating that this seq should be masked
             labels[last_idx:idx+1] = [-100] * (idx-last_idx+1)
         else:
-            last_idx = idx
+            last_idx = idx+1
         # Lastly mask all the assistant header prompt <|start_header_id|>assistant<|end_header_id|>, which has been tokenized to [128006, 78191, 128007]
     assistant_header_seq = [128006, 78191, 128007]
     labels = replace_target(assistant_header_seq,labels)
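
Note on the change: the one-line fix appears to matter because each end-of-turn index is used as an inclusive slice boundary. A minimal sketch of the effect follows, assuming a simplified masking loop. `mask_prompts`, `contains`, and the `advance` parameter are hypothetical names for illustration, and the end-of-turn ID 128009 and user header IDs [128006, 882, 128007] are assumptions not shown in the diff. Only the assistant header IDs come from the comment above, and the final assistant-header masking step is left out.

```python
EOT = 128009                                 # assumed <|eot_id|> token ID
USER_HEADER = [128006, 882, 128007]          # assumed user header token IDs
ASSISTANT_HEADER = [128006, 78191, 128007]   # taken from the comment in the diff


def contains(seq, pattern):
    """Return True if `pattern` occurs as a contiguous run inside `seq`."""
    n = len(pattern)
    return any(seq[i:i + n] == pattern for i in range(len(seq) - n + 1))


def mask_prompts(tokens, advance):
    """Mask user turns with -100.

    `advance` is what gets added to idx when a non-prompt turn ends:
    0 mimics the old line (last_idx = idx), 1 mimics the new one.
    """
    labels = list(tokens)
    eot_indices = [i for i, t in enumerate(tokens) if t == EOT]
    last_idx = 0
    for idx in eot_indices:
        if contains(labels[last_idx:idx + 1], USER_HEADER):
            # Prompt turn: mask everything from last_idx through this EOT.
            labels[last_idx:idx + 1] = [-100] * (idx - last_idx + 1)
        else:
            # Assistant turn: keep its labels and move the window start.
            last_idx = idx + advance
    return labels


# Toy dialog: user, assistant, user, assistant (content IDs are made up).
tokens = (
    USER_HEADER + [11, 12, EOT]
    + ASSISTANT_HEADER + [21, 22, EOT]
    + USER_HEADER + [13, EOT]
    + ASSISTANT_HEADER + [23, EOT]
)

old = mask_prompts(tokens, advance=0)
fixed = mask_prompts(tokens, advance=1)

# Index 11 is the EOT that closes the first assistant turn.
print(old[11], fixed[11])
```

In this sketch, the old behaviour starts the next window on the assistant's own end-of-turn token, so that token is masked together with the following user turn. With `idx+1` the window starts after it, and the end-of-turn token stays in the labels.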