Browse source

DoubleCheck all runs on Berzelius

zenodia 2 years ago
parent
commit
18f0793867

+ 33 - 33
ai/Megatron/English/Python/jupyter_notebook/Lab2-4_customize_process2mmap.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "crazy-behalf",
+   "id": "encouraging-melissa",
    "metadata": {},
    "source": [
     "## Customize preprocess_data.py\n",
@@ -27,7 +27,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "suburban-coast",
+   "id": "regulation-medium",
    "metadata": {},
    "source": [
     "1. Convert the extracted raw Swedish text from webnyheter2013.txt to webnyheter2013.json."
@@ -36,7 +36,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "parliamentary-accountability",
+   "id": "alone-asian",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -45,7 +45,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "pursuant-ghost",
+   "id": "corporate-honor",
    "metadata": {},
    "source": [
     "Below is the expected outputs :\n",
@@ -58,7 +58,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "insured-excitement",
+   "id": "diagnostic-mercury",
    "metadata": {},
    "source": [
     "2. Generate the mmap format files by default preprocess_data.py as the first step to ensure we have data necessary for the next notebook to run, in case time runs out."
@@ -66,22 +66,22 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
-   "id": "palestinian-locking",
+   "execution_count": null,
+   "id": "unique-assistant",
    "metadata": {},
    "outputs": [],
    "source": [
     "INPUT_JSON_FILE='../dataset/SV/webnyheter2013.json'\n",
-    "OUTPUT_PATH='../dataset/SV/webnyheter2013_56kvocab'\n",
-    "VOCAB_FILE='../dataset/SV/56k/vocab.json'\n",
-    "MERGE_FILE='../dataset/SV/56k/merges.txt'\n",
+    "OUTPUT_PATH='../dataset/SV/webnyheter2013_32kvocab'\n",
+    "VOCAB_FILE='../dataset/SV/32k/vocab.json'\n",
+    "MERGE_FILE='../dataset/SV/32k/merges.txt'\n",
     "NUM_CPUS=16"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "collect-soccer",
+   "id": "affecting-brave",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -99,7 +99,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "italian-mount",
+   "id": "mathematical-crawford",
    "metadata": {},
    "source": [
     "Below is the expected outputs :\n",
@@ -116,7 +116,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "surrounded-clothing",
+   "id": "involved-geneva",
    "metadata": {},
    "source": [
     "Now we get the default mmap files (xxx.bin and xxx.idx ) and therefore guarantee we have the data needed for the next notebook to run disregard whether we finish the mini-challenge or not. \n",
@@ -130,8 +130,8 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 2,
-   "id": "addressed-month",
+   "execution_count": null,
+   "id": "understood-poultry",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -140,7 +140,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "growing-restriction",
+   "id": "north-pension",
    "metadata": {},
    "source": [
     "<a id=\"Custom-Sentence-Splitter\"></a>"
@@ -148,7 +148,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "chemical-selection",
+   "id": "textile-australian",
    "metadata": {},
    "source": [
     "The custom sentence-splitter `cut_sentence_with_quotation_marks` function is provided below for your convenience, please integrate this custom function into `MYpreprocess_data.py`."
@@ -156,8 +156,8 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 3,
-   "id": "vital-latino",
+   "execution_count": null,
+   "id": "swedish-column",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -192,7 +192,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "musical-benjamin",
+   "id": "heated-drunk",
    "metadata": {},
    "source": [
     "<a id=\"Mini-Challenge\"></a>"
@@ -200,7 +200,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "robust-apparel",
+   "id": "normal-rachel",
    "metadata": {},
    "source": [
     "---\n",
@@ -221,8 +221,8 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 2,
-   "id": "decimal-enlargement",
+   "execution_count": null,
+   "id": "adopted-yeast",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -435,7 +435,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "innocent-delight",
+   "id": "accessory-banana",
    "metadata": {},
    "source": [
     "Below cell block specify all the input parameters in order to run `MYpreprocess_data.py`. \n",
@@ -445,8 +445,8 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 11,
-   "id": "geographic-convention",
+   "execution_count": null,
+   "id": "overhead-hydrogen",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -459,7 +459,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "unavailable-steps",
+   "id": "interior-healthcare",
    "metadata": {},
    "source": [
     "Below code block is a ReRun cell to launch `MYpreprocess_data.py` and produce the customSentenceSplit_text_document.bin and customSentenceSplit_text_document.idx files, if the script runs successfully.\n",
@@ -472,7 +472,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "smoking-memory",
+   "id": "compact-access",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -490,7 +490,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "strange-maldives",
+   "id": "automatic-gravity",
    "metadata": {},
    "source": [
     "Check whether these two files : `customSentenceSplit_text_document.bin` and `customSentenceSplit_text_document.idx` files were successfully generated and is in the correct folder under dataset."
@@ -499,7 +499,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "difficult-library",
+   "id": "substantial-spare",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -509,7 +509,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "strategic-confusion",
+   "id": "civic-airplane",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -519,7 +519,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "temporal-spring",
+   "id": "velvet-tennessee",
    "metadata": {},
    "source": [
     "-----\n",
@@ -528,7 +528,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "parental-tourism",
+   "id": "rising-ready",
    "metadata": {},
    "source": [
     "-----\n",

+ 47 - 54
ai/Megatron/English/Python/jupyter_notebook/Lab2-5_run_Megatron_with_varying_config.ipynb

@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "rising-software",
+   "id": "alike-prisoner",
    "metadata": {},
    "source": [
     "## Scale up model size\n",
@@ -37,7 +37,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "historic-eating",
+   "id": "material-finland",
    "metadata": {},
    "source": [
     "\n",
@@ -47,7 +47,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "cleared-toolbox",
+   "id": "driven-drawing",
    "metadata": {},
    "source": [
     "Modify and rerun the code blocks below to obtain a even bigger GPT model. \n",
@@ -59,7 +59,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "large-buying",
+   "id": "proprietary-marketing",
    "metadata": {},
    "source": [
     "<a id=\"MODIFY_CELL\"></a>"
@@ -67,7 +67,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "approved-beatles",
+   "id": "adjustable-engineer",
    "metadata": {},
    "source": [
     "Always clean the checkpoint folder to ensure trainining start from scratch."
@@ -75,71 +75,64 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 1,
-   "id": "attended-vault",
+   "execution_count": null,
+   "id": "other-parts",
    "metadata": {},
    "outputs": [],
    "source": [
-    "!rm -fr ../sv_ckpt/* "
+    "!rm -fr ../sv_ckpt/* \n",
+    "!rm -fr ../dataset/SV/*.npy"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 2,
-   "id": "engaging-ocean",
+   "execution_count": null,
+   "id": "invisible-pepper",
    "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Overwriting ./Megatron-LM/profile_SVGPT_BIG.sh\n"
-     ]
-    }
-   ],
+   "outputs": [],
    "source": [
     "%%writefile ./Megatron-LM/SV_GPT_goingBIG.sh\n",
     "# Copyright (c) 2020 NVIDIA Corporation.  All rights reserved.\n",
+    "# Copyright (c) 2020 NVIDIA Corporation.  All rights reserved.\n",
+    "GPUS_PER_NODE=2 # <--- remember to change the number of GPUs you actually have in your system\n",
+    "# Change for multinode config\n",
     "MASTER_ADDR=localhost\n",
     "MASTER_PORT=6000\n",
     "NNODES=1 #<-- currently we are using 1 node multigpus\n",
     "NODE_RANK=0\n",
-    "WORLD_SIZE=2 \n",
-    "GPUS_PER_NODE=2  \n",
-    "\n",
+    "WORLD_SIZE=2 # <--- remember to change the number of GPUs you actually have in your system\n",
     "\n",
-    "CHECKPOINT_PATH='../sv_ckpt/'\n",
-    "DATA_PATH='../dataset/SV/webnyheter2013_56kvocab_text_document'\n",
-    "VOCAB_FILE='../dataset/SV/56k/vocab.json'\n",
-    "MERGE_FILE='../dataset/SV/56k/merges.txt'\n",
-    "PROFILE_OUTPUT_PATH='../profiles/SV/nsys_sv_' # modify this to your own profile path\n",
-    "\n",
-    "# -------------------- #####  Begin of modifiable block ##### -------------------- \n",
+    "### modify this section to point the file to its own path \n",
+    "CHECKPOINT_PATH='../sv_ckpt/' ## modify this path if you customize it \n",
+    "DATA_PATH='../dataset/SV/webnyheter2013_32kvocab_text_document' ## modify this path if you customize it \n",
+    "VOCAB_FILE='../dataset/SV/32k/vocab.json' ## modify this path if you customize it \n",
+    "MERGE_FILE='../dataset/SV/32k/merges.txt' ## modify this path if you customize it \n",
+    "PROFILE_OUTPUT_PATH='../profiles/SV/nsys_improved2' # modify this to your own profile path\n",
     "\n",
+    "################   Beginning of modifiable section    ####################\n",
     "TENSOR_MP_SIZE=<FILL_IN>\n",
     "PIPELINE_MP_SIZE=<FILL_IN>\n",
-    "LAYERS=<FILL_IN>\n",
-    "HIDDEN_SZ=<FILL_IN>\n",
+    "NUM_LYS=<FILL_IN>\n",
+    "HIDDEN_SIZE=<FILL_IN>\n",
     "NUM_ATTN_HEADS=<FILL_IN>\n",
-    "MICRO_BZ=<FILL_IN>\n",
-    "GLOBAL_BZ=<FILL_IN>\n",
     "SEQ_LEN=<FILL_IN>\n",
     "MAX_POS_EM=<FILL_IN>\n",
+    "MICRO_BZ=<FILL_IN>\n",
+    "GLOBAL_BZ=<FILL_IN>\n",
     "\n",
-    "# -------------------- #####  End of modifiable blocks ##### ------------------------ \n",
+    "##############   end of modifiable sectio, do NOT modify anything below this line    ####################\n",
     "\n",
-    "##################  DO NOT modify anything below this line ##################\n",
     "export OMP_NUM_THREADS=1\n",
     "DISTRIBUTED_ARGS=\"--nproc_per_node $GPUS_PER_NODE --nnodes $NNODES --node_rank $NODE_RANK --master_addr $MASTER_ADDR --master_port $MASTER_PORT\"\n",
     "\n",
-    "## We turn off nsys profiling decoration to avoid the small overhead\n",
+    "## for nsys run\n",
     "#nsys profile --stats=false --force-overwrite=true --duration=300 --trace=cudnn,cuda,osrt,nvtx -o $PROFILE_OUTPUT_PATH \\\n",
     "python -m torch.distributed.launch $DISTRIBUTED_ARGS \\\n",
-    "    ./Megatron-LM/Dlprof_pretrain_gpt.py \\\n",
+    "    ./Megatron-LM/pretrain_gpt.py \\\n",
     "       --tensor-model-parallel-size ${TENSOR_MP_SIZE} \\\n",
     "       --pipeline-model-parallel-size ${PIPELINE_MP_SIZE} \\\n",
-    "       --num-layers ${LAYERS} \\\n",
-    "       --hidden-size ${HIDDEN_SZ} \\\n",
+    "       --num-layers ${NUM_LYS} \\\n",
+    "       --hidden-size ${HIDDEN_SIZE} \\\n",
     "       --num-attention-heads ${NUM_ATTN_HEADS} \\\n",
     "       --micro-batch-size ${MICRO_BZ} \\\n",
     "       --global-batch-size ${GLOBAL_BZ} \\\n",
@@ -148,7 +141,7 @@
     "       --train-samples 100 \\\n",
     "       --save ${CHECKPOINT_PATH} \\\n",
     "       --load ${CHECKPOINT_PATH} \\\n",
-    "       --data-path 1. ${DATA_PATH} \\\n",
+    "       --data-path ${DATA_PATH} \\\n",
     "       --vocab-file ${VOCAB_FILE} \\\n",
     "       --merge-file ${MERGE_FILE} \\\n",
     "       --data-impl mmap \\\n",
@@ -170,12 +163,12 @@
   },
   {
    "cell_type": "markdown",
-   "id": "determined-cliff",
+   "id": "formal-turner",
    "metadata": {},
    "source": [
-    "Check how big is your model. By modify the parameters in the [params_cnt.sh](./params_cnt.sh)\n",
+    "Check how big is your model. By modify the parameters in the [params_cnt.sh](./params_cnt.sh) to match the training parames above.\n",
     "\n",
-    "I got 6.6 Billion :)  what about you ?\n",
+    "I got 1.6 Billion :)  what about you ?\n",
     "\n",
     "Modify the [params count](./params_cnt.sh) accoring to your training configuration.\n",
     "\n",
@@ -185,7 +178,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "green-magic",
+   "id": "welcome-donor",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -194,18 +187,18 @@
   },
   {
    "cell_type": "markdown",
-   "id": "awful-candle",
+   "id": "noticed-trinity",
    "metadata": {},
    "source": [
     "Below is an example of expected outputs:\n",
     "    \n",
-    "        6 <-- One could get different number depend on your training config\n",
-    "        6675628032 <-- One could get different number depend on your training config\n"
+    "        1 <-- One could get different number depend on your training config\n",
+    "        1678049280 <-- One could get different number depend on your training config\n"
    ]
   },
   {
    "cell_type": "markdown",
-   "id": "great-league",
+   "id": "convenient-ontario",
    "metadata": {},
    "source": [
     "Re-run this cell below to get an even bigger GPT model\n",
@@ -220,16 +213,16 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "italian-karma",
+   "id": "representative-kentucky",
    "metadata": {},
    "outputs": [],
    "source": [
-    "!./Megatron-LM/SV_GPT_goingBIG.sh"
+    "!bash ./Megatron-LM/SV_GPT_goingBIG.sh"
    ]
   },
   {
    "cell_type": "markdown",
-   "id": "outstanding-application",
+   "id": "unnecessary-african",
    "metadata": {},
    "source": [
     "Below is an example of expected outputs:\n",
@@ -252,7 +245,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "blessed-grammar",
+   "id": "pretty-handle",
    "metadata": {},
    "source": [
     "---\n",
@@ -263,7 +256,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "velvet-nylon",
+   "id": "caroline-induction",
    "metadata": {},
    "source": [
     "-----\n",
@@ -272,7 +265,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "framed-blood",
+   "id": "ranking-pillow",
    "metadata": {},
    "source": [
     "-----\n",