|
@@ -1 +1,27 @@
|
|
|
-Some instructions for running Nvidia's TensorRT Inference Server
|
|
|
+How to Run NVIDIA's TensorRT Inference Server
|
|
|
+
|
|
|
+Clone the repo
|
|
|
+
|
|
|
+````git clone https://github.com/NVIDIA/tensorrt-inference-server.git````
|
|
|
+
|
|
|
+Download models
|
|
|
+
|
|
|
+````cd tensorrt-inference-server/docs/examples/````
|
|
|
+````./fetch_models.sh````
|
|
|
+
|
|
|
+Copy the models to the shared NFS location
|
|
|
+
|
|
|
+````cp -rp model_repository ensemble_model_repository /home/k8sSHARE````
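The copy step above fails confusingly if `fetch_models.sh` has not produced both repositories yet. A minimal sketch of a guarded copy follows, assuming a POSIX shell; the function name `copy_models` is hypothetical and not part of the inference server repo, and the destination `/home/k8sSHARE` is the path used above.

```shell
# copy_models SRC_DIR DEST_DIR
# Hypothetical helper: copies both example model repositories,
# but only after checking that each one exists under SRC_DIR.
copy_models() {
  cm_src=$1
  cm_dest=$2
  for cm_repo in model_repository ensemble_model_repository; do
    if [ ! -d "$cm_src/$cm_repo" ]; then
      echo "missing $cm_src/$cm_repo; run ./fetch_models.sh first" >&2
      return 1
    fi
  done
  # -r copies recursively, -p preserves mode, ownership and timestamps
  mkdir -p "$cm_dest" &&
    cp -rp "$cm_src/model_repository" "$cm_src/ensemble_model_repository" "$cm_dest"
}

# Usage, from tensorrt-inference-server/docs/examples/:
#   copy_models . /home/k8sSHARE
```

The check runs before anything is written, so a missing repository leaves the shared location untouched.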
|
|
|
+
|
|
|
+Deploy Prometheus and Grafana
|
|
|
+
|
|
|
+Prometheus collects the inference server's metrics, and Grafana displays them. Install the prometheus-operator chart to deploy both. Setting serviceMonitorSelectorNilUsesHelmValues to false lets Prometheus discover ServiceMonitors outside its own Helm release, which it needs in order to find the inference server metrics in the example release deployed below:
|
|
|
+
|
|
|
+````helm install --name example-metrics --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false stable/prometheus-operator````
|
|
|
+
|
|
|
+Set up port forwarding to the Grafana service for local access:
|
|
|
+
|
|
|
+````kubectl port-forward service/example-metrics-grafana 8080:80````
|
|
|
+
|
|
|
+Open localhost:8080 in your browser to reach the Grafana login page, and log in with these credentials:
|
|
|
+````username=admin password=prom-operator````
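If the login page does not load, the Grafana pod may still be starting. A minimal sketch of a readiness check to run from a second terminal follows, assuming a POSIX shell and that `curl` is installed. The `retry` helper is hypothetical; `/api/health` is Grafana's health endpoint, and port 8080 is the local port from the port-forward above.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or ATTEMPTS is used up.
retry() {
  rt_attempts=$1
  rt_delay=$2
  shift 2
  rt_i=1
  while [ "$rt_i" -le "$rt_attempts" ]; do
    # Discard the command's output; only its exit status matters here
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    rt_i=$((rt_i + 1))
    sleep "$rt_delay"
  done
  return 1
}

# Usage, with the port-forward running in another terminal:
#   retry 10 2 curl -fsS http://localhost:8080/api/health && echo "Grafana is up"
```

Passing the command as arguments rather than as a string keeps the quoting simple and lets the same helper wrap any other check.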
|