
Issue #1012: Visualization Documentation out of date

Signed-off-by: Jon Hass <Jon_Hass@Dell.com>
jonhass 3 years ago
parent
commit
426db75111
33 changed files with 144 additions and 49 deletions
  1. 30 0
      control_plane/roles/control_plane_k8s/files/startup_omnia.yml
  2. BIN
      docs/Telemetry_Visualization/Images/MultiFactorVisualizationDashboard.png
  3. BIN
      docs/Telemetry_Visualization/Images/MultiFactorVisualizationDashboard_Filter.png
  4. BIN
      docs/Telemetry_Visualization/Images/MultiFactorVisualizationDashboard_Interact.png
  5. BIN
      docs/Telemetry_Visualization/Images/ParallelCoordinates_DoubleMetricFiltering.png
  6. BIN
      docs/Telemetry_Visualization/Images/ParallelCoordinates_InitialView_Collapsed.png
  7. BIN
      docs/Telemetry_Visualization/Images/ParallelCoordinates_InitialView_Expanded.png
  8. BIN
      docs/Telemetry_Visualization/Images/ParallelCoordinates_MetricFiltering.png
  9. BIN
      docs/Telemetry_Visualization/Images/ParallelCoordinates_NodeSelection.png
  10. BIN
      docs/Telemetry_Visualization/Images/ParallelCoordinates_Recoloration.png
  11. BIN
      docs/Telemetry_Visualization/Images/ParallelCoordinates_TimeFiltering.png
  12. BIN
      docs/Telemetry_Visualization/Images/ParallelCoordinates_TopLeftPanel_NodeHighlight.png
  13. BIN
      docs/Telemetry_Visualization/Images/PowerMaps_Hover.png
  14. BIN
      docs/Telemetry_Visualization/Images/PowerMaps_HoverJobs.png
  15. BIN
      docs/Telemetry_Visualization/Images/PowerMaps_InitialView.png
  16. BIN
      docs/Telemetry_Visualization/Images/PowerMaps_SelectMetric.png
  17. BIN
      docs/Telemetry_Visualization/Images/PowerMaps_Zoom.png
  18. BIN
      docs/Telemetry_Visualization/Images/SankeyLayout_EditMode.png
  19. BIN
      docs/Telemetry_Visualization/Images/SankeyLayout_HoverFreeze.png
  20. BIN
      docs/Telemetry_Visualization/Images/SankeyLayout_InitialView.png
  21. BIN
      docs/Telemetry_Visualization/Images/SankeyLayout_LeftPanel.png
  22. BIN
      docs/Telemetry_Visualization/Images/SankeyLayout_Zoom.png
  23. BIN
      docs/Telemetry_Visualization/Images/SpiralLayout_EditBehaviourPanel.png
  24. BIN
      docs/Telemetry_Visualization/Images/SpiralLayout_EditPanel.png
  25. BIN
      docs/Telemetry_Visualization/Images/SpiralLayout_HeatMaps.png
  26. BIN
      docs/Telemetry_Visualization/Images/SpiralLayout_InitialView.png
  27. BIN
      docs/Telemetry_Visualization/Images/SpiralLayout_SelectMetric.png
  28. 4 21
      docs/Telemetry_Visualization/TELEMETRY.md
  29. 13 28
      docs/Telemetry_Visualization/VISUALIZATION.md
  30. 33 0
      docs/Telemetry_Visualization/Visualizations/ParallelCoordinates.md
  31. 18 0
      docs/Telemetry_Visualization/Visualizations/PowerMaps.md
  32. 24 0
      docs/Telemetry_Visualization/Visualizations/SankeyLayout.md
  33. 22 0
      docs/Telemetry_Visualization/Visualizations/SpiralLayout.md

+ 30 - 0
control_plane/roles/control_plane_k8s/files/startup_omnia.yml

@@ -27,6 +27,7 @@
     cobbler_kickstart_file: rocky8.ks
     management_network_namespace: network-config
     management_network_pod: mngmnt-network-container
+    infiniband_pod: infiniband-container
     file_perm: '0775'
     mount_dir: /mnt/temp/
   tasks:
@@ -119,6 +120,10 @@
       retries: "{{ max_retries }}"
       until: "'master' in k8s_nodes.stdout"
 
+    - name: Restart coredns pod
+      command: kubectl rollout restart deployment.apps/coredns -n kube-system
+      changed_when: true
+
     - block:
         - name: Check mngmnt_network pod status
           command: kubectl get pods -n {{ management_network_namespace }}
@@ -145,6 +150,31 @@
           when: management_network_pod in mngmnt_network_pod_status.stdout
       when: device_config_support
 
+    - block:
+        - name: Check mngmnt_network pod status
+          command: kubectl get pods -n {{ management_network_namespace }}
+          changed_when: false
+          register: mngmnt_network_pod_status
+          failed_when: false
+
+        - name: Wait for infiniband pod to come to ready state
+          command: kubectl wait --for=condition=ready -n {{ management_network_namespace }} pod -l app=infiniband
+          changed_when: false
+          when: infiniband_pod in mngmnt_network_pod_status.stdout
+
+        - name: Get infiniband pod name
+          command: 'kubectl get pod -n {{ management_network_namespace }} -l app=infiniband -o jsonpath="{.items[0].metadata.name}"'
+          changed_when: false
+          register: infiniband_pod_name
+          when: infiniband_pod in mngmnt_network_pod_status.stdout
+
+        - name: Configuring infiniband container
+          command: 'kubectl exec --stdin --tty -n {{ management_network_namespace }} {{ infiniband_pod_name.stdout }} \
+            -- ansible-playbook /root/omnia/control_plane/roles/control_plane_ib/files/infiniband_container_configure.yml'
+          changed_when: false
+          when: infiniband_pod in mngmnt_network_pod_status.stdout
+      when: ib_switch_support
+
     - name: Check cobbler pod status
       command: kubectl get pods -n {{ cobbler_namespace }}
       changed_when: false
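
The InfiniBand block added in this hunk runs three `kubectl` calls, each guarded by whether the pod name appears in the `kubectl get pods` output. As a rough illustration only (not part of the commit), the same sequence can be sketched in Python. The namespace, pod name, label, and playbook path are copied from the diff; the function names are hypothetical, and nothing here actually invokes `kubectl`.

```python
from typing import List

# Values copied from the vars and tasks in the diff above.
NAMESPACE = "network-config"             # management_network_namespace
INFINIBAND_POD = "infiniband-container"  # infiniband_pod
PLAYBOOK = (
    "/root/omnia/control_plane/roles/control_plane_ib/files/"
    "infiniband_container_configure.yml"
)


def infiniband_pod_present(get_pods_stdout: str) -> bool:
    """Mirror the `when: infiniband_pod in mngmnt_network_pod_status.stdout` guard."""
    return INFINIBAND_POD in get_pods_stdout


def wait_command() -> List[str]:
    """Argument list for the 'Wait for infiniband pod to come to ready state' task."""
    return ["kubectl", "wait", "--for=condition=ready",
            "-n", NAMESPACE, "pod", "-l", "app=infiniband"]


def get_name_command() -> List[str]:
    """Argument list for the 'Get infiniband pod name' task."""
    return ["kubectl", "get", "pod", "-n", NAMESPACE, "-l", "app=infiniband",
            "-o", "jsonpath={.items[0].metadata.name}"]


def configure_command(pod_name: str) -> List[str]:
    """Argument list for the 'Configuring infiniband container' task."""
    return ["kubectl", "exec", "--stdin", "--tty", "-n", NAMESPACE, pod_name,
            "--", "ansible-playbook", PLAYBOOK]
```

Building the commands as argument lists keeps the pod name and playbook path as separate arguments, so no shell quoting or line continuation is involved.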

BIN
docs/Telemetry_Visualization/Images/MultiFactorVisualizationDashboard.png


BIN
docs/Telemetry_Visualization/Images/MultiFactorVisualizationDashboard_Filter.png


BIN
docs/Telemetry_Visualization/Images/MultiFactorVisualizationDashboard_Interact.png


BIN
docs/Telemetry_Visualization/Images/ParallelCoordinates_DoubleMetricFiltering.png


BIN
docs/Telemetry_Visualization/Images/ParallelCoordinates_InitialView_Collapsed.png


BIN
docs/Telemetry_Visualization/Images/ParallelCoordinates_InitialView_Expanded.png


BIN
docs/Telemetry_Visualization/Images/ParallelCoordinates_MetricFiltering.png


BIN
docs/Telemetry_Visualization/Images/ParallelCoordinates_NodeSelection.png


BIN
docs/Telemetry_Visualization/Images/ParallelCoordinates_Recoloration.png


BIN
docs/Telemetry_Visualization/Images/ParallelCoordinates_TimeFiltering.png


BIN
docs/Telemetry_Visualization/Images/ParallelCoordinates_TopLeftPanel_NodeHighlight.png


BIN
docs/Telemetry_Visualization/Images/PowerMaps_Hover.png


BIN
docs/Telemetry_Visualization/Images/PowerMaps_HoverJobs.png


BIN
docs/Telemetry_Visualization/Images/PowerMaps_InitialView.png


BIN
docs/Telemetry_Visualization/Images/PowerMaps_SelectMetric.png


BIN
docs/Telemetry_Visualization/Images/PowerMaps_Zoom.png


BIN
docs/Telemetry_Visualization/Images/SankeyLayout_EditMode.png


BIN
docs/Telemetry_Visualization/Images/SankeyLayout_HoverFreeze.png


BIN
docs/Telemetry_Visualization/Images/SankeyLayout_InitialView.png


BIN
docs/Telemetry_Visualization/Images/SankeyLayout_LeftPanel.png


BIN
docs/Telemetry_Visualization/Images/SankeyLayout_Zoom.png


BIN
docs/Telemetry_Visualization/Images/SpiralLayout_EditBehaviourPanel.png


BIN
docs/Telemetry_Visualization/Images/SpiralLayout_EditPanel.png


BIN
docs/Telemetry_Visualization/Images/SpiralLayout_HeatMaps.png


BIN
docs/Telemetry_Visualization/Images/SpiralLayout_InitialView.png


BIN
docs/Telemetry_Visualization/Images/SpiralLayout_SelectMetric.png


+ 4 - 21
docs/Telemetry_Visualization/TELEMETRY.md

@@ -7,26 +7,9 @@ A lot of these metrics are collected using iDRAC telemetry. iDRAC telemetry allo
 ## Prerequisites
 
 1. To set up Grafana, ensure that `control_plane/input_params/login_vars.yml` is updated with the Grafana Username and Password.
-2. All parameters in `telemetry/input_params/telemetry_login_vars.yml` need to be filled in:
-
-| Parameter Name        | Default Value | Information |
-|-----------------------|---------------|-------------|
-| timescaledb_user      | 		        |  Username used for connecting to timescale db. Minimum Length: 2 characters.          |
-| timescaledb_password  | 		        |  Password used for connecting to timescale db. Minimum Length: 2 characters.           |
-| mysqldb_user          | 		        |  Username used for connecting to mysql db. Minimum Length: 2 characters.         |
-| mysqldb_password      | 		        |  Password used for connecting to mysql db. Minimum Length: 2 characters.            |
-| mysqldb_root_password | 		        |  Password used for connecting to mysql db for root user. Minimum Length: 2 characters.         |
-
-3. All parameters in `telemetry/input_params/telemetry_base_vars.yml` need to be filled in:
-
-| Parameter Name          | Default Value     | Information |
-|-------------------------|-------------------|-------------|
-| idrac_telemetry_support | true              | This variable is used to enable iDRAC telemetry support and visualizations. Accepted Values: true/false            |
-| slurm_telemetry_support | true              | This variable is used to enable slurm telemetry support and visualizations. Slurm Telemetry support can only be activated when idrac_telemetry_support is set to true. Accepted Values: True/False.        |
-| timescaledb_name        | telemetry_metrics | Postgres DB with timescale extension is used for storing iDRAC and slurm telemetry metrics.            |
-| mysqldb_name			  | idrac_telemetrysource_services_db | MySQL DB is used to store IPs and credentials of iDRACs having datacenter license           |
-
-3. Find the IP of the Grafana UI using:
+2. All [parameters](../Input_Parameter_Guide/Telemetry_Visualization_Parameters/telemetry_login_vars.md) in `telemetry/input_params/telemetry_login_vars.yml` need to be filled in.
+3. All [parameters](../Input_Parameter_Guide/Telemetry_Visualization_Parameters/telemetry_base_vars.md) in `telemetry/input_params/telemetry_base_vars.yml` need to be filled in.
+4. Find the IP of the Grafana UI using:
  
 `kubectl get svc -n grafana`
 
@@ -60,4 +43,4 @@ Use any one of the following browsers to access the Grafana UI (https://< Grafan
 ## Adding a New Node to Telemetry
 After initiation, new nodes can be added to telemetry by running the following commands from `omnia/telemetry`:
 		
-` ansible-playbook add_idrac_node.yml`
+`ansible-playbook add_idrac_node.yml`

+ 13 - 28
docs/Telemetry_Visualization/VISUALIZATION.md

@@ -8,37 +8,22 @@ Once `control_plane.yml` is executed and Grafana is set up, use `telemetry.yml`
 
 ## All your data in a glance
 
-Using the following graphs, data can be visualized to gather correlational information. These graphs refresh every 5 seconds (Except SankeyViewer). 
+Using the following graphs, data can be visualized to gather correlational information.
+1. [Parallel Coordinates](Visualizations/ParallelCoordinates.md)
+2. [Sankey Layout](Visualizations/SankeyLayout.md)
+3. [Spiral Layout](Visualizations/SpiralLayout.md)
+4. [Power Map](Visualizations/PowerMaps.md)
 
->> __Note:__ The timestamps used for the time metric are based on the `timezone` set in `control_plane/input_params/base_vars.yml`. 
+>> __Note:__ The timestamps used for the time metric are based on the `timezone` set in `control_plane/input_params/base_vars.yml`.  In the event of a mismatch between the timezone on the browser being used to access Grafana UI and the timezone in `base_vars.yml`, the time range being used to filter information on the Grafana UI will have to be adjusted per the timezone in `base_vars.yml`.
 
-1. [Parallel Coordinates](https://idatavisualizationlab.github.io/HPCC/#ParallelCoordinates) <br>
-Parallel coordinates are a great way to capture a systems status. It shows all ranges of individual metrics like CPU temp, Fan Speed, Memory Usage etc. The graph can be narrowed by time or metric ranges to get specific correlations such as CPU Temp vs Fan Speed etc.
+### The Multi-factor Visualization Dashboard
+The Multi-factor Visualization Dashboard has 4 interactive visualization panels that allow you to see all the graphs mentioned above in a single view.
+![Multi Factor Visualization Dashboard](Images/MultiFactorVisualizationDashboard.png)
 
-![Parallel Coordinates](Images/ParallelCoordinates.png)
+Using the Node and User dropdowns on the left, nodes and users can be filtered to collect data within a given time-frame (Select the time frame on the top-right of the view).
+![Multi Factor Visualization ](Images/MultiFactorVisualizationDashboard_Filter.png)
 
-<br>
+To interact with a specific panel, click on the __Panel Name__ and then select the __View__ option from the dropdown menu.
+![img.png](Images/MultiFactorVisualizationDashboard_Interact.png)
 
-2. [Spiral Layout](https://idatavisualizationlab.github.io/HPCC/#Spiral_Layout) <br>
-Spiral Layouts are best for viewing the change in a single metric over time. It is often used to check trends in metrics over a business day. Data visualized in this graph can be sorted using other metrics like Job IDs etc to understand the pattern of utilization on your devices.
-
-![Spiral Layout](Images/Spirallayout.gif)
-
-<br>
-
-3. [Sankey Viewer](https://idatavisualizationlab.github.io/HPCC/#SankeyViewer) <br>
-Sankey Viewers are perfect for viewing utilization by nodes/users/jobs. It provides point in time information for quick troubleshooting.
-
->> __Note:__ Due to the tremendous data processing undertaken by SankeyViewer, the graph does not auto-refresh. It can be manually refreshed by refreshing the internet tab or by clicking the refresh button on the top-right corner of the page.
-
-![Sankey Viewer](Images/SankeyViewer.png)
-
-<br>
-
-4. [Power Map](https://idatavisualizationlab.github.io/HPCC/#PowerMap) <br>
-Power Maps are an excellent way to see utilization along the axis of time for different nodes/users/jobs. Hovering over the graph allows the user to narrow down information by Job/User or Node.
-
-![Power Map](Images/PowerMap.png)
-
-<br>
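
The updated note in this hunk says the Grafana time range has to be adjusted when the browser timezone differs from the `timezone` in `base_vars.yml`. A minimal sketch of that adjustment, assuming both zones are given as IANA names and using only the Python standard library, might look like the following. The function name and the example zones are hypothetical and do not come from Omnia.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def to_base_vars_time(browser_local: datetime, browser_tz: str,
                      base_vars_tz: str) -> datetime:
    """Re-express a wall-clock time chosen in the browser's zone
    as the equivalent wall-clock time in the base_vars.yml zone."""
    aware = browser_local.replace(tzinfo=ZoneInfo(browser_tz))
    return aware.astimezone(ZoneInfo(base_vars_tz))


# Hypothetical example: the browser is on UTC and base_vars.yml uses Asia/Kolkata.
start = to_base_vars_time(datetime(2022, 6, 1, 9, 0), "UTC", "Asia/Kolkata")
end = to_base_vars_time(datetime(2022, 6, 1, 10, 0), "UTC", "Asia/Kolkata")
```

Applying the same conversion to both ends of the range shifts the start and end together, so the duration of the range is unchanged.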
 

File diff suppressed because it is too large
+ 33 - 0
docs/Telemetry_Visualization/Visualizations/ParallelCoordinates.md


+ 18 - 0
docs/Telemetry_Visualization/Visualizations/PowerMaps.md

@@ -0,0 +1,18 @@
+# Power Maps
+A PowerMap diagram is a visualization used to depict the relationship between Users, Jobs, and Computes. It can be used to identify heavy or malfunctioning jobs that could be choking resources. This graph requires that both iDRAC and Slurm telemetry are enabled.
+
+![img.png](../Images/PowerMaps_InitialView.png)
+>> __Note:__ In the above image, the arrow on the left can be used to expand the left panel and customize the graph
+
+![img.png](../Images/PowerMaps_SelectMetric.png)
+>> __Note:__ In the above image, the left panel is used to select the metric __Memory Power__ as the metric to build the power map on. The panel can also be used to change the threshold setting. The threshold is a value (often the mean or median value) based on which the graph points are colored. <br> For example: The threshold above is set to 193.33. Values above the threshold are colored in orange whereas the values below are colored in blue.
+
+![img.png](../Images/PowerMaps_Hover.png)
+>> __Note:__ In the above image, clicking or hovering over a specific node highlights the node, the jobs associated and the relevant users within the specified time range.
+
+![img.png](../Images/PowerMaps_HoverJobs.png)
+>> __Note:__ In the above image, clicking or hovering over a specific job highlights the nodes and users associated with the job.
+
+![img.png](../Images/PowerMaps_Zoom.png)
+>> __Note:__ In the above image, the view has been repositioned by clicking and dragging. The view can also be zoomed into by scrolling forwards. Scroll backwards to zoom out.
+
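
The threshold rule described in the PowerMaps notes (values above the threshold in orange, values below in blue) can be illustrated with a short Python sketch. This is not the panel's implementation: the node names and readings are made up, and treating a value exactly equal to the threshold as blue is an assumption, since the document does not specify that case.

```python
from typing import Dict


def color_by_threshold(values: Dict[str, float], threshold: float) -> Dict[str, str]:
    """Map each reading to 'orange' if it is above the threshold, else 'blue'.

    Assumption: a value equal to the threshold is colored blue.
    """
    return {
        name: ("orange" if value > threshold else "blue")
        for name, value in values.items()
    }


# Hypothetical Memory Power readings; 193.33 is the threshold from the note above.
readings = {"node-a": 210.0, "node-b": 150.5, "node-c": 193.33}
colors = color_by_threshold(readings, threshold=193.33)
```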

File diff suppressed because it is too large
+ 24 - 0
docs/Telemetry_Visualization/Visualizations/SankeyLayout.md


+ 22 - 0
docs/Telemetry_Visualization/Visualizations/SpiralLayout.md

@@ -0,0 +1,22 @@
+# Spiral Layout
+Spiral Layouts are best for viewing the change in a single metric over time. The spiral organization of node representation can represent a large number (100s to 1000s) of compute nodes in a compact visual. Nodes can be ordered on the spiral by rank per metric value or by metric value.  Hovering over a node will display a heatmap of the node metric value over the dataset time-range.
+
+![img.png](../Images/SpiralLayout_InitialView.png)
+>> __Note:__ In the above image, the spiral visualization displays compute nodes on a spiral graphing layout. This example orders the compute nodes by __Power Consumption__ at the time indicated by the time range slider.
+
+![img.png](../Images/SpiralLayout_SelectMetric.png)
+>> __Note:__ In the above image, all compute nodes are arranged on the spiral graph by their ranking order. The dropdown on the left is used to select what metric is shown.
+
+![img.png](../Images/SpiralLayout_HeatMaps.png)
+>> __Note:__ In the above image, a heat map of the metric for that node is displayed for the data set time range selected. Hovering over a node in the graph displays node information on the right. Click on the graph to toggle between freezing and un-freezing the graph.
+
+![img.png](../Images/SpiralLayout_EditPanel.png)
+>> __Note:__ In the above image, behaviour of the Spiral Layout view can be updated using the __Edit__ option from the highlighted dropdown.
+
+![img.png](../Images/SpiralLayout_EditBehaviourPanel.png)
+>>__Note:__ In the above image, the edit panel offers the option to:
+>> 1. Change the order type
+>> 2. Change the number of rings displayed
+>> 3. Change the Node size on the graph
+
+
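
The Spiral Layout text says nodes can be ordered on the spiral by rank per metric value. One way to picture that, purely as an illustrative sketch, is to rank the nodes and place them at increasing angles along an Archimedean spiral. The actual Grafana panel may use a different layout; the function name, the `spacing` and `step` parameters, and the choice to put the highest value at the center are all assumptions.

```python
import math
from typing import Dict, List, Tuple


def spiral_positions(metric_by_node: Dict[str, float],
                     spacing: float = 1.0,
                     step: float = 0.5) -> List[Tuple[str, float, float]]:
    """Rank nodes by metric value (highest first) and place them along an
    Archimedean spiral, where the radius grows linearly with the angle."""
    ranked = sorted(metric_by_node, key=metric_by_node.get, reverse=True)
    positions = []
    for rank, node in enumerate(ranked):
        theta = rank * step        # angle in radians for this rank
        radius = spacing * theta   # Archimedean spiral: r = spacing * theta
        positions.append((node, radius * math.cos(theta), radius * math.sin(theta)))
    return positions


# Hypothetical power consumption readings per compute node.
layout = spiral_positions({"node-1": 120.0, "node-2": 310.0, "node-3": 205.0})
```

With this ordering, the top-ranked node sits at the center and lower-ranked nodes wind outward.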