# How to Run Nvidia's TensorRT Inference Server
## Clone the repo

```sh
git clone https://github.com/NVIDIA/tensorrt-inference-server.git
```
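Optionally, the steps below can be pinned to a specific server release; the repository carries per-release branches (named like `r19.10`, shown here only as an example) that you can list and check out after cloning:

```sh
# List the available release branches; stay on the default branch if you
# just want to follow the examples as-is.
git -C tensorrt-inference-server branch -r
```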
## Download models

```sh
cd tensorrt-inference-server/docs/examples/
./fetch_models.sh
```
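The script populates the two repository directories that are copied in the next step. As a quick sanity check, each model should follow the standard model-repository layout, with a `config.pbtxt` and at least one numbered version subdirectory holding the serialized model:

```sh
# Inspect the downloaded repositories; every model directory should contain
# a config.pbtxt plus a version subdirectory (e.g. 1/) with the model file.
ls -R model_repository ensemble_model_repository
```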
## Copy models to shared NFS location

```sh
cp -rp model_repository ensemble_model_repository /home/k8sSHARE
```
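How the cluster consumes `/home/k8sSHARE` depends on the inference server deployment used later. One common option, shown here only as a sketch, is an NFS-backed PersistentVolume pointing at the shared path; the NFS server address and capacity below are placeholders, and the deployment may instead mount the path directly (e.g. via hostPath):

```sh
# Sketch only: expose the shared model directory to the cluster as a
# read-only NFS PersistentVolume. Replace the placeholder server address.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: triton-model-store
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadOnlyMany
  nfs:
    server: nfs-server.example.local   # placeholder NFS server address
    path: /home/k8sSHARE/model_repository
EOF
```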
## Deploy Prometheus and Grafana

Prometheus collects metrics for viewing in Grafana. Install the prometheus-operator chart to provide these components. The serviceMonitorSelectorNilUsesHelmValues flag is needed so that Prometheus can find the inference server metrics in the example release deployed below:

```sh
helm install --name example-metrics \
  --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false \
  stable/prometheus-operator
```
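Before continuing, it is worth confirming that the release came up. Exact pod names vary with the chart version, but everything should reach the Running state, and the Grafana service used in the next step should be listed:

```sh
# Wait for the Prometheus/Grafana pods to become Running, then confirm the
# Grafana service created by the example-metrics release exists.
kubectl get pods
kubectl get svc | grep example-metrics
```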
Set up a port-forward to the Grafana service for local access:

```sh
kubectl port-forward service/example-metrics-grafana 8080:80
```
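With the port-forward running in another terminal, a quick way to check that Grafana is reachable is its standard health endpoint, which should return a small JSON document:

```sh
# Confirm Grafana answers through the forwarded port before opening the UI.
curl -s localhost:8080/api/health
```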
Navigate in your browser to localhost:8080 for the Grafana login page.

Log in with username `admin` and password `prom-operator`.