Ensure that all the prerequisites listed in the PREINSTALL_OMNIA_APPLIANCE file are met before installing the Omnia appliance.
Note: Changing the manager node after Omnia is installed is not supported. If you want to change the manager node, you must redeploy the entire cluster.
Note: The user should have root privileges to perform installations and configurations using Omnia.
Note: If errors occur when any of the following Ansible playbook commands are executed, re-run the commands.
Clone the Omnia repository.
$ git clone https://github.com/dellhpc/omnia.git
Note: After the Omnia repository is cloned, a folder named omnia is created. It is recommended that you do not rename this folder.
Change the directory to omnia/appliance.
To provide passwords for Cobbler and AWX, edit the appliance_config.yml file.
To provide a mapping file for DHCP configuration, set the mapping_file_exits variable in the appliance_config.yml file to true; otherwise, set it to false.
Omnia considers the following usernames as default:
cobbler for Cobbler Server
admin for AWX
slurm for MariaDB
Note: In the appliance_config.yml file, you can also change the NIC for the DHCP server under hpc_nic and the NIC used to connect to the Internet under public_nic. The default values of hpc_nic and public_nic are em1 and em2, respectively.
omnia_config.yml file.
Note: To view the passwords set in appliance_config.yml at a later time, run the following command from omnia -> appliance:
ansible-vault view appliance_config.yml --vault-password-file .vault_key
To view the passwords set in omnia_config.yml at a later time, run the following command:
ansible-vault view omnia_config.yml --vault-password-file .omnia_vault_key
ansible-playbook appliance.yml -e "ansible_python_interpreter=/usr/bin/python2"
Omnia creates a log file, which is available at /var/log/omnia.log.
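Since the playbook commands may need to be re-run after errors, it can help to check the log for failed tasks first. The following is a minimal sketch, assuming the log contains standard Ansible output in which failed or unreachable hosts are reported on lines containing `fatal:`; the function name `count_failures` is hypothetical and not part of Omnia.

```shell
# Count lines that look like Ansible task failures in a log file.
# Assumption: the log holds standard Ansible output ("fatal: [host]: FAILED! ...").
# The function name is hypothetical; only the default path comes from the text.
count_failures() {
    log_file="${1:-/var/log/omnia.log}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read log file: $log_file" >&2
        return 1
    fi
    # grep -c prints the number of matching lines; it exits 1 when that
    # number is 0, so "|| true" keeps a clean log from looking like an error.
    grep -c -E 'fatal:|FAILED!' "$log_file" || true
}
```

Calling `count_failures` with no argument reads /var/log/omnia.log; a result of 0 suggests no task reported a fatal error in the logged runs.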
Provision operating system on the target nodes
Omnia role used: provision
Ports used by Cobbler
To create the Cobbler image, Omnia configures the following:
To access the Cobbler dashboard, enter https://<IP>/cobbler_web where <IP> is the global IP address of the management node. For example, enter https://100.98.24.225/cobbler_web to access the Cobbler dashboard.
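Before opening a browser, the dashboard URL can be probed from the command line. The following is a sketch only: the function names are hypothetical, and the use of `-k` assumes the Cobbler server presents a certificate that curl cannot verify (for example, a self-signed one).

```shell
# Build the Cobbler dashboard URL for a management-node IP.
# The URL format is taken from the text; the function names are hypothetical.
cobbler_dashboard_url() {
    printf 'https://%s/cobbler_web\n' "$1"
}

# Print the HTTP status code returned by a URL.
# -k skips certificate verification (assumption: unverifiable certificate);
# --max-time bounds the wait when the node is unreachable.
http_status() {
    curl -k -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Usage (not run here):
#   http_status "$(cobbler_dashboard_url 100.98.24.225)"
```

A status in the 2xx or 3xx range indicates that the web server is answering; any other result points to a network or service problem worth checking in the log.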
Note: If a mapping file is not provided, host names are assigned in the following format: compute<xxx>-<xxx>, where <xxx>-<xxx> is the last two octets of the host IP address.
After the Cobbler server provisions the operating system on the nodes, IP addresses and host names are assigned by the DHCP service. The host names follow the format compute<xxx>-<xxx>, where <xxx>-<xxx> is the host ID (last two octets) of the host IP address. For example, if the host IP address is 172.17.0.11, the assigned host name is compute0-11.
Note: If a mapping file is provided, the hostnames follow the format provided in the mapping file.
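The default naming rule can be sketched as a small shell function. This only illustrates the format described above and is not how Omnia or Cobbler implements it; the function name `omnia_default_hostname` is hypothetical.

```shell
# Derive the default host name (no mapping file) from a host IP address.
# Illustration only; the function name is hypothetical.
omnia_default_hostname() {
    ip="$1"
    old_ifs="$IFS"
    IFS=.
    # Split the address on dots into the positional parameters $1..$4.
    set -- $ip
    IFS="$old_ifs"
    # Require exactly four dot-separated fields.
    if [ "$#" -ne 4 ]; then
        echo "invalid IPv4 address: $ip" >&2
        return 1
    fi
    # The last two octets form the host ID part of the name.
    printf 'compute%s-%s\n' "$3" "$4"
}
```

For the address used in the example above, `omnia_default_hostname 172.17.0.11` prints `compute0-11`.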
Install and configure Ansible AWX
Omnia role used: web_ui
Port used by AWX is 8081.
AWX repository is cloned from the GitHub path: https://github.com/ansible/awx.git
Omnia performs the following configuration on AWX:
To access the AWX dashboard, enter http://<IP>:8081 where <IP> is the global IP address of the management node. For example, enter http://100.98.24.225:8081 to access the AWX dashboard.
Note: The AWX configurations are automatically performed by Omnia. Dell Technologies recommends that you do not change the default configurations provided by Omnia, as the functionality may be impacted.
Note: Although the AWX UI is accessible, hosts are shown only after a few nodes have been provisioned by Cobbler, which takes approximately 10-15 minutes. If a server is provisioned but no host appears in the AWX UI, run the provision_report.yml playbook from the omnia -> appliance -> tools folder to see which hosts are reachable.
Install Kubernetes and Slurm using AWX UI
Kubernetes and Slurm are installed by deploying the DeployOmnia template on the AWX dashboard.
To install only Kubernetes, enter slurm as a skip tag and select Create "slurm". Similarly, to install only Slurm, select and add the kubernetes skip tag.
Note: To establish the passwordless communication between compute nodes and manager node:
Note: If you want to install both the JupyterHub and Kubeflow playbooks, install the JupyterHub playbook first and then the Kubeflow playbook.
Note: To install JupyterHub and Kubeflow playbooks:
The DeployOmnia template may not run successfully if:
After the DeployOmnia template is executed from the AWX UI, the omnia.yml file installs Kubernetes and Slurm, or either one, as per the selection in the template on the management node. Additionally, appropriate roles are assigned to the compute and manager groups.
The following Kubernetes roles are provided by Omnia when the omnia.yml file is executed:
A directory, /home/k8snfs, is created. Using this directory, compute nodes share the common files.
Note: After Kubernetes is installed and configured, a few Kubernetes and Calico/Flannel related ports are opened on the manager and compute nodes. This is required for Kubernetes Pod-to-Pod and Pod-to-Service communications. Calico/Flannel provides a full networking stack for Kubernetes pods.
The following Slurm roles are provided by Omnia when the omnia.yml file is executed:
tcp_ports: 6817,6818,6819
udp_ports: 6817,6818,6819
Adding a new compute node to the Cluster
If a new node is provisioned through Cobbler, the node address is automatically displayed in the AWX UI. This node does not belong to any group. The user can add the node to the compute group and execute omnia.yml to add the new node to the cluster and update the configurations in the manager node.