Install Slurm and Kubernetes, along with all dependencies:

    ansible-playbook -i host_inventory_file omnia.yml

Install Slurm only:

    ansible-playbook -i host_inventory_file omnia.yml --skip-tags "k8s"

Install Kubernetes only:

    ansible-playbook -i host_inventory_file omnia.yml --skip-tags "slurm"

Initialize Kubernetes cluster (packages already installed):

    ansible-playbook -i host_inventory_file omnia.yml --skip-tags "slurm" --tags "init"

### Install Kubeflow

    ansible-playbook -i host_inventory_file platforms/kubeflow.yml

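The install commands above differ only in the playbook and tags passed to `ansible-playbook`. As a convenience, that mapping could be wrapped in a small shell function. This is a minimal sketch and not part of Omnia: the function name `omnia_cmd` and the target names (`all`, `slurm`, `k8s`, `k8s-init`, `kubeflow`) are hypothetical. The function only prints the command, so it can be reviewed before anything is run.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Omnia): print the ansible-playbook
# command for a given inventory file and install target.
omnia_cmd() {
    inventory="$1"
    target="$2"
    case "$target" in
        all)      echo "ansible-playbook -i $inventory omnia.yml" ;;
        slurm)    echo "ansible-playbook -i $inventory omnia.yml --skip-tags \"k8s\"" ;;
        k8s)      echo "ansible-playbook -i $inventory omnia.yml --skip-tags \"slurm\"" ;;
        k8s-init) echo "ansible-playbook -i $inventory omnia.yml --skip-tags \"slurm\" --tags \"init\"" ;;
        kubeflow) echo "ansible-playbook -i $inventory platforms/kubeflow.yml" ;;
        *)        echo "unknown target: $target" >&2; return 1 ;;
    esac
}
```

For example, `omnia_cmd host_inventory_file slurm` prints the Slurm-only command; piping the output to `sh` would run it.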
# Omnia
Omnia is a collection of [Ansible](https://www.ansible.com/) playbooks which perform:
* Installation of [Slurm](https://slurm.schedmd.com/) and/or [Kubernetes](https://kubernetes.io/) on servers already provisioned with a standard [CentOS](https://www.centos.org/) image.
* Installation of auxiliary scripts for administrator functions such as moving nodes between Slurm and Kubernetes personalities.
Omnia playbooks perform several tasks:

`common` playbook handles installation of software
* Add yum repositories:
  - Kubernetes (Google)
  - ELRepo (for Nvidia drivers)
  - EPEL (Extra Packages for Enterprise Linux)
* Install packages from repos:
  - bash-completion
  - docker
  - gcc
  - python-pip
  - kubelet
  - kubeadm
  - kubectl
  - nfs-utils
  - nvidia-detect
  - yum-plugin-versionlock
* Restart and enable system-level services:
  - Docker
  - Kubelet

`computeGPU` playbook installs Nvidia drivers and nvidia-container-runtime-hook
* Add yum repositories:
  - Nvidia (container runtime)
* Install packages from repos:
  - kmod-nvidia
  - nvidia-container-runtime-hook
* Restart and enable system-level services:
  - Docker
  - Kubelet
* Configuration:
  - Enable GPU Device Plugins (nvidia-container-runtime-hook)
  - Modify kubeadm config to allow GPUs as a schedulable resource
* Restart and enable system-level services:
  - Docker
  - Kubelet
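
Once the `computeGPU` playbook has run, one way to confirm that GPUs are schedulable is to look at each node's allocatable resources. The sketch below assumes GPUs are advertised under the resource name `nvidia.com/gpu`; that name is an assumption and the cluster may use a different one. The helper `gpu_nodes` is hypothetical. It filters a two-column node/GPU listing read from standard input, so it can be tried on sample text before being pointed at a cluster.

```shell
#!/bin/sh
# Hypothetical helper: read "NAME GPU" lines (with a header row) on stdin
# and print the names of nodes that report at least one allocatable GPU.
gpu_nodes() {
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ && $2 > 0 { print $1 }'
}

# Against a live cluster (assumes the resource name nvidia.com/gpu):
# kubectl get nodes \
#   -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu' \
#   | gpu_nodes
```

Nodes without GPUs show `<none>` in the GPU column and are filtered out, as are nodes reporting a count of zero.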

`master` playbook
* Install Helm v3
* (Optional) Add firewall rules for Slurm and Kubernetes

Everything from this point on can be called by using the `init` tag:
```
ansible-playbook -i host_inventory_file kubernetes/kubernetes.yml --tags "init"
```

`startmaster` playbook

`startworkers` playbook

`startservices` playbook
* Add `stable` repo to Helm
* Add `jupyterhub` repo to Helm