{ "cells": [ { "cell_type": "markdown", "id": "trained-reform", "metadata": {}, "source": [ "## The Bootcamp Compute Environment: a SuperPOD Cluster\n", "\n", "For this bootcamp, we will have access to NVIDIA DGX A100 systems. In general, access to a large compute cluster is highly recommended when training very large language models.\n", "\n" ] }, { "cell_type": "markdown", "id": "enabling-jason", "metadata": {}, "source": [ "## Learning Objectives\n", "This bootcamp is designed to take you quickly through the default Megatron workflow once (Day 2). Thereafter (Day 3), we will focus on the specific needs of a local language, in this case Swedish. We will give recommendations that you can optionally apply to your own workflow, and include some practical scripts to help you kick-start your own journey in training local-language Megatron GPT2/3 models.\n" ] }, { "cell_type": "markdown", "id": "fifth-argument", "metadata": {}, "source": [ "\n", "\n", "### Bootcamp Outline (Day 2)\n", "This is Day 2 of the bootcamp. We focus on familiarizing ourselves with the default Megatron workflow,\n", "given the SuperPOD environment with ? GPUs per attendee.\n", "We will quickly get up and running with the [Megatron repo](https://github.com/NVIDIA/Megatron-LM) and aim to understand how to make good use of GPU performance by experimenting with various Megatron GPT training configurations.\n", "\n", "- [Estimate hours/days needed to execute one end-to-end run per Megatron configuration](./Day2-1_EstimateComputeDaysNeeded.ipynb)\n", "- [Understanding the core of Megatron: mpu](./Day2-2_MegatronFundementals.ipynb)\n", "- [About GPT's tokenizer](./Day2-3_GPT_vocab_merge_files.ipynb)\n", "- [Data preprocessing](./Day2-4_jsonfy_and_process2mmap.ipynb)\n", "- [Megatron runs vs. config](./BootCampDay2-4_Verify_GPT_runs_locally.ipynb)\n", "\n", "\n", "### Tutorial Duration\n", "The lab material will be presented in a 4-hour session. 
A link to the scripts (without the data) will be available for download at the end of the bootcamp.\n", "\n", "### Content Level\n", "Intermediate, Advanced\n", "\n", "### Target Audience and Prerequisites\n", "The target audience for this lab is NLP researchers, data scientists, and NLP engineers who are interested in adopting Megatron to train GPT2/3 models in their own language.\n", "\n", "Basic experience with Python programming is needed. No GPU programming knowledge is required." ] }, { "cell_type": "markdown", "id": "fleet-subject", "metadata": {}, "source": [ "---\n", "## Up Next:\n", "\n", "[Estimate Compute Hours/Days Needed](./Day2-1_EstimateComputeDaysNeeded.ipynb)\n", "\n", "## Back To Start Menu\n", "[Start menu](../Start_Here.ipynb)" ] }, { "cell_type": "markdown", "id": "accessible-palestine", "metadata": {}, "source": [ "---\n", "\n", "## Licensing\n", "\n", "This material is released by OpenACC-Standard.org, in collaboration with NVIDIA Corporation, under the Creative Commons Attribution 4.0 International (CC BY 4.0)." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 5 }