John Lockman 9fb395a4f5 Delete k8s-TensorFlow-resnet50-multinode-MPIOperator.yaml 2 years ago
PyTorch e5fce65e42 Update pytorch-deploy.yaml 2 years ago
TensorRT-InferenceServer 4355f72404 Update README.md 4 years ago
login_node_example 1db504b572 Issue#550: Syncing GitHub and GitLab 3 years ago
README.md 44fce1ea2b removed all instances of `master` from scripts and playbooks 4 years ago
device_ip_list.yml ef4b0d045b Create device_ip_list.yml 2 years ago
host_inventory_file 898d4c56db fix host inventory example 3 years ago
host_inventory_file.ini f3823798d5 updated inventory file examples 3 years ago
host_mapping_file_one_touch.csv e3363cdefb Create host_mapping_file_one_touch.csv 3 years ago
host_mapping_file_os_provisioning.csv 77f005f0a1 Create host_mapping_file_os_provisioning.csv 3 years ago
k8s-tensorflow-nvidia-ngc-resnet50-multinode-mpioperator.yaml 6e54ff6045 resolves issue #77 4 years ago
mapping_device_file.csv 9a0c407eae Update and rename mapping_file.csv to mapping_device_file.csv 3 years ago
slurm-TensorFlow-resnet50-multinode-MPI.batch c599545fbf adding k8s and slurm submission examples 4 years ago

README.md

Examples

The K8s and Slurm submission examples are provided for running the ResNet-50 benchmark with TensorFlow on 8 GPUs across two C4140 nodes.

Submitting the example

K8s

kubectl create -f k8s-tensorflow-nvidia-ngc-resnet50-multinode-mpioperator.yaml
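
The submission step above can be wrapped in a small pre-flight check. This is only a sketch: `submit_k8s_example` and the `KUBECTL` override variable are hypothetical names that are not part of this repository, and the helper assumes the job is created in the current namespace.

```shell
# submit_k8s_example: hypothetical helper, not part of this repository.
# Creates the job from a manifest only if the file exists, then lists pods.
# Set KUBECTL=echo to preview the commands without touching a cluster.
submit_k8s_example() {
    manifest="$1"
    if [ ! -f "$manifest" ]; then
        echo "manifest not found: $manifest" >&2
        return 1
    fi
    # Same command as in the example above, with the manifest as a parameter.
    "${KUBECTL:-kubectl}" create -f "$manifest" || return 1
    # Show the pods so the newly created ones can be spotted.
    "${KUBECTL:-kubectl}" get pods
}
```

Call it with the manifest file name used above, for example `KUBECTL=echo submit_k8s_example <manifest>` for a dry preview.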

Slurm

sbatch slurm-TensorFlow-resnet50-multinode-MPI.batch
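
To follow the job after submission, the job ID can be captured and passed to `squeue`. The following is a minimal sketch: `submit_slurm_example` and the `SBATCH`/`SQUEUE` override variables are hypothetical names introduced for illustration. It relies on `sbatch --parsable`, which prints only the job ID, optionally followed by `;` and the cluster name.

```shell
# submit_slurm_example: hypothetical helper, not part of this repository.
# Submits a batch script, prints the job ID, and shows the job in the queue.
submit_slurm_example() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "batch script not found: $script" >&2
        return 1
    fi
    # --parsable prints "jobid" or "jobid;cluster" instead of the usual message.
    jobid=$("${SBATCH:-sbatch}" --parsable "$script") || return 1
    # Strip the optional ";cluster" suffix.
    jobid=${jobid%%;*}
    echo "submitted job $jobid"
    # Show the queue entry for this job only.
    "${SQUEUE:-squeue}" -j "$jobid"
}
```

For example: `submit_slurm_example slurm-TensorFlow-resnet50-multinode-MPI.batch`.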