openacc-training-materials

This repository contains mini applications for GPU Bootcamps. The objective of this bootcamp is to introduce the application of Artificial Intelligence (AI) algorithms in science, specifically in High Performance Computing (HPC) simulations. The bootcamp covers the fundamentals of AI and how they can be applied to Computational Fluid Dynamics (CFD):

  • Introduction to AI and Convolutional Neural Networks with Keras
  • Using AI for Steady State Flow using Neural Networks

Target Audience:

This bootcamp is aimed at researchers, graduate students, and developers who are new to the field of Artificial Intelligence and want to learn how it can be applied to simulation domains such as Computational Fluid Dynamics. Basic Python programming knowledge is required.

Tutorial Duration

The bootcamp takes approximately 3 hours overall. An additional mini-challenge is provided at the end of the bootcamp.

Prerequisites:

To run this tutorial you will need a machine with an NVIDIA GPU.

Make sure that Docker or Singularity, whichever you plan to use, is installed with NVIDIA GPU support.
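
As a quick sanity check before building anything, the sketch below looks for the relevant executables on the PATH. The tool names (docker, singularity, nvidia-smi) are assumed to be the usual executable names and are not taken from this repository. Finding them does not by itself prove that GPU support is configured correctly.

```python
import shutil

# Assumed executable names for Docker, Singularity, and the NVIDIA driver utility.
REQUIRED_TOOLS = ("docker", "singularity", "nvidia-smi")


def find_tools(names=REQUIRED_TOOLS):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name:12s} {path or 'not found on PATH'}")
```

Only one of Docker or Singularity needs to be present, so a single "not found" line for the other one is expected.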

Creating containers

To start with, you will have to build a Docker or Singularity container.

Docker Container

To build a Docker container, run: sudo docker build --network=host -t <imagename>:<tagnumber> .

For instance: sudo docker build --network=host -t myimage:1.0 .

To run the container, run: sudo docker run --rm -it --gpus=all --network=host -p 8888:8888 myimage:1.0

The container launches Jupyter Notebook on port 8888 with the command: jupyter notebook --ip 0.0.0.0 --port 8888 --no-browser --allow-root

Then open Jupyter Notebook in a browser at http://localhost:8888 and start working on the lab by clicking the Start_Here.ipynb notebook.
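
If the browser cannot reach the page, it can help to confirm that something is actually listening on port 8888. The following is a minimal sketch using only the Python standard library. The helper name port_is_open is hypothetical, and the host and port defaults simply mirror the URL above.

```python
import socket


def port_is_open(host="localhost", port=8888, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on http://localhost:8888")
    else:
        print("Nothing is listening on port 8888 yet")
```

A successful connection only shows that the port is in use. It does not confirm that the listener is the Jupyter server from this container.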

Singularity Container

To build the Singularity container, run: sudo singularity build <image_name>.simg Singularity

Then copy the files to your local machine so that your changes are stored locally: singularity run <image_name>.simg cp -rT /workspace ~/workspace

Then, run the container: singularity run --nv <image_name>.simg jupyter notebook --notebook-dir=~/workspace/python/jupyter_notebook

Then open Jupyter Notebook in a browser at http://localhost:8888 and start working on the lab by clicking the Start_Here.ipynb notebook.

Troubleshooting

Q. cuDNN failed to initialize or GPU out of memory error

A. This error typically occurs when the Jupyter kernels of previously run notebooks have not been shut down. Make sure that the kernels of all earlier notebooks are shut down (go to the Home tab, click the Running tab, and shut down any notebooks that aren’t being used).

Q. Cannot write to /tmp directory

A. Some notebooks write logs to the /tmp directory. When creating the container, make sure that /tmp is accessible to the container with write permission. Alternatively, change the location of the temporary directory.
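
One way to check for this problem from inside a notebook is sketched below. The helper names and the ~/tmp fallback are hypothetical. Setting TMPDIR only affects code that looks up the temporary directory through Python's tempfile module or the environment, so a notebook that hard-codes /tmp would still need to be edited by hand.

```python
import os
import tempfile


def dir_is_writable(path):
    """Return True if a temporary file can be created inside path."""
    try:
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        # Raised when the directory is missing or not writable.
        return False


def pick_tmp_dir(preferred="/tmp", fallback=None):
    """Return preferred if it is writable, otherwise create and return fallback."""
    if dir_is_writable(preferred):
        return preferred
    # Hypothetical fallback location: a tmp folder in the home directory.
    fallback = fallback or os.path.join(os.path.expanduser("~"), "tmp")
    os.makedirs(fallback, exist_ok=True)
    return fallback


if __name__ == "__main__":
    os.environ["TMPDIR"] = pick_tmp_dir()
    tempfile.tempdir = None  # make tempfile re-read TMPDIR on next use
    print("Using temporary directory:", tempfile.gettempdir())
```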

Q. "ResourceExhaustedError" is observed while running the labs

A. The batch size and network model are currently set to consume 16 GB of GPU memory. To run the labs without any modifications, a GPU with at least 16 GB of memory is recommended. Otherwise, reduce the batch size to lower the memory footprint.
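
Reducing the batch size can be automated by halving it until training fits in memory. The sketch below is framework-agnostic and is not the labs' own code: run_with_batch_backoff and fake_train_step are hypothetical names, and MemoryError stands in for the framework's out-of-memory exception. With TensorFlow one would presumably pass tf.errors.ResourceExhaustedError as oom_errors, and the kernel may still need a restart if GPU memory is not released after a failure.

```python
def run_with_batch_backoff(train_step, batch_size, min_batch_size=1,
                           oom_errors=(MemoryError,)):
    """Call train_step(batch_size), halving batch_size after each memory error.

    Returns the batch size that succeeded. Re-raises the last error once the
    batch size cannot be halved without dropping below min_batch_size.
    """
    while True:
        try:
            train_step(batch_size)
            return batch_size
        except oom_errors:
            if batch_size // 2 < min_batch_size:
                raise
            batch_size //= 2


if __name__ == "__main__":
    def fake_train_step(batch_size):
        # Stand-in for a real training call: pretends that only
        # batches of 16 or fewer fit in GPU memory.
        if batch_size > 16:
            raise MemoryError(f"batch size {batch_size} does not fit")

    print(run_with_batch_backoff(fake_train_step, batch_size=128))  # prints 16
```

Halving keeps the number of retries small. A smaller batch size can change training behavior, so results may differ slightly from those shown in the labs.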

Questions?

  • If you observe any errors, please file an issue on GitHub.
  • You can also join the OpenACC Slack channel for general queries related to hackathons and bootcamps.