
Practical Guide to Train Megatron-LM with your own language

This folder contains the material for the bootcamp Practical Guide to Train Your Own GPT Models with Megatron-LM. It covers:

Introduction to Megatron-LM workflow
Customize Megatron-LM for your own language
Advice on cleaning and preparing the data for training
Hands-on practice training your own GPTBPE Tokenizer
Hands-on profiling Megatron-LM GPT training with varying config

Prerequisites

To run this tutorial you will need a machine with at least two NVIDIA GPUs.
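
A quick way to confirm the GPU requirement before starting is to count the devices that `nvidia-smi -L` reports. This is a minimal sketch, assuming the NVIDIA driver is installed so that `nvidia-smi` is on the PATH and that each device is listed on a line starting with `GPU <index>:`. The helper name `count_gpus` is illustrative and not part of the bootcamp material.

```shell
# Count GPUs from `nvidia-smi -L` style output read on stdin.
# Assumes one line per device, each starting with "GPU <index>:".
count_gpus() {
  # grep -c prints the number of matching lines; it exits non-zero when
  # there are no matches, so "|| true" keeps the helper from failing then.
  grep -c '^GPU [0-9]' || true
}

if command -v nvidia-smi >/dev/null 2>&1; then
  found=$(nvidia-smi -L | count_gpus)
  if [ "$found" -ge 2 ]; then
    echo "OK: $found GPUs detected"
  else
    echo "Only $found GPU(s) detected; this tutorial needs at least 2" >&2
  fi
else
  echo "nvidia-smi not found; cannot check GPUs" >&2
fi
```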

Install the latest version of Docker or Singularity.
The base containers required for the lab may require you to create an NGC account and generate an API key (https://docs.nvidia.com/ngc/ngc-catalog-user-guide/index.html#registering-activating-ngc-account).
You will also need to run the command below to obtain the toy data:

git clone https://github.com/gpuhackathons-org/gpubootcamp.git && cd gpubootcamp && git checkout megatron && cd ./ai/Megatron/English/Python/ && mkdir -p ./dataset/SV/ && mkdir -p ./dataset/EN/ && bash ./source_code/download_webnyheter2013.sh
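
After the download finishes, it can help to confirm that the dataset directories exist and to see how many items each one holds. This is a minimal sketch meant to be run from `./ai/Megatron/English/Python/`. The directory names `SV` and `EN` are taken from the setup step; what the download script actually writes into them is an assumption, and `check_dataset_dirs` is a hypothetical helper.

```shell
# Report, for each expected dataset directory under ROOT, whether it exists
# and how many items it contains. What the download script places in these
# directories is an assumption, so the count is informational only.
check_dataset_dirs() {
  root=${1:-./dataset}
  for sub in SV EN; do
    dir="$root/$sub"
    if [ -d "$dir" ]; then
      # ls -A lists every entry except "." and ".."; tr strips any padding
      # that some wc implementations add around the number.
      count=$(ls -A "$dir" | wc -l | tr -d ' ')
      echo "$dir: $count item(s)"
    else
      echo "$dir: missing"
    fi
  done
}

check_dataset_dirs ./dataset
```

A directory reported as missing usually points to a mistyped path in the setup command.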

Tutorial Duration

The total bootcamp material takes approximately 12 hours, including solving the mini-challenge.

Creating containers

To start with, you will have to build a Docker or Singularity container.

Docker Container

To build a docker container, run: sudo docker build --network=host -t <imagename>:<tagnumber> .

For instance: sudo docker build -t myimage:1.0 .

The code labs are written as Jupyter notebooks, and a Dockerfile is provided to simplify deployment. To serve the Docker instance to a student, port 8888 must be exposed from the container. For instance, the following command maps ports 8888 and 8000 inside the container to the same ports on the lab machine:

sudo docker run --rm -it --gpus=all -p 8888:8888 -p 8000:8000 myimage:1.0

When this command is run, you can browse to the serving machine on port 8888 with any web browser to access the labs, and on port 8000 to reach the dlprofviewer server. For instance, if the container is running on the local machine, point the web browser to http://localhost:8888. The --gpus flag makes all NVIDIA GPUs available to the container at runtime. The --rm flag removes the container when it exits. The -it flag attaches an interactive terminal, which allows stopping the Jupyter server with Ctrl-C. This command may be customized for your hosting environment.
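
As one example of such a customization: if port 8888 is already taken on the host, only the host side of the `-p <host>:<container>` mapping needs to change, since Jupyter keeps listening on 8888 inside the container. The sketch below only prints the resulting command so it can be inspected before running. The function name `jupyter_run_cmd` and the host port 9999 are illustrative choices, not part of the bootcamp material.

```shell
# Print (not run) a docker command that maps a chosen host port to the
# Jupyter port 8888 inside the container. Port 8000 is left unchanged.
# Arguments: $1 = host port (illustrative), $2 = image name and tag.
jupyter_run_cmd() {
  host_port=$1
  image=$2
  printf 'sudo docker run --rm -it --gpus=all -p %s:8888 -p 8000:8000 %s\n' \
    "$host_port" "$image"
}

jupyter_run_cmd 9999 myimage:1.0
```

With this mapping the labs would be reached at http://localhost:9999 instead of http://localhost:8888.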

Once inside the container, launch JupyterLab with the following command:

jupyter-lab --no-browser --allow-root --ip=0.0.0.0 --port=8888 --NotebookApp.token="" --NotebookApp.iopub_data_rate_limit=1.0e15 --notebook-dir=/workspace/python/jupyter_notebook/

Then open JupyterLab in a browser at http://localhost:8888 and start working on the lab by clicking on the Start_Here.ipynb notebook.

Singularity Container

To build the singularity container, run: sudo singularity build --sandbox <image_name>.simg Singularity

Then copy the files to your local machine to make sure changes are stored locally: singularity run --writable <image_name>.simg cp -rT /workspace/ ~/workspace

Export the bootcamp Megatron directory:

export SINGULARITY_BINDPATH="<Your_local_Bootcamp_Megatron_Directory>"

Then, run the container: singularity run --nv --writable <image_name>.simg jupyter lab --notebook-dir=/workspace/python/jupyter_notebook/ --port=8000 --ip=0.0.0.0 --no-browser --NotebookApp.token=""