John Lockman 4355f72404 Update README.md		5 years ago

Run NVIDIA's TensorRT Inference Server on Omnia

Clone the repo

git clone https://github.com/NVIDIA/tensorrt-inference-server.git

Download models

cd tensorrt-inference-server/docs/examples/
./fetch_models.sh

Copy models to shared NFS location

cp -rp model_repository ensemble_model_repository /home/k8sSHARE

Fix permissions on model files

chmod -R a+r /home/k8sSHARE/model_repository
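
To confirm that the permissions fix took effect, one option is to list anything under the shared directory that still lacks a read bit for user, group, or other. This is a minimal sketch: the function name `list_not_world_readable` is hypothetical (it is not part of the TensorRT Inference Server repo), and it assumes a `find` that supports `-perm -MODE`.

```shell
#!/bin/sh
# Hypothetical helper: print every file or directory under $1 that is
# NOT readable by user, group, and other.
list_not_world_readable() {
    # -perm -444 matches entries with all three read bits set;
    # "!" inverts the match, so only the problem entries are printed.
    find "$1" ! -perm -444 -print
}
```

Running `list_not_world_readable /home/k8sSHARE/model_repository` should print nothing once every entry is readable.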

Deploy Prometheus and Grafana

Prometheus collects metrics for viewing in Grafana. Install the prometheus-operator for these components. The serviceMonitorSelectorNilUsesHelmValues flag is needed so that Prometheus can find the inference server metrics in the example release deployed below:

helm install --name example-metrics --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false stable/prometheus-operator

Set up port forwarding to the Grafana service for local access

kubectl port-forward service/example-metrics-grafana 8080:80

Navigate in your browser to localhost:8080 for the Grafana login page. Log in with username admin and password prom-operator.

Set up the TensorRT Inference Server deployment

Change to the Helm chart directory

cd ~/tensorrt-inference-server/deploy/single_server/

Modify values.yaml, changing modelRepositoryPath to the model directory on the NFS mount:

image:
  imageName: nvcr.io/nvidia/tensorrtserver:20.01-py3
  pullPolicy: IfNotPresent
  #modelRepositoryPath: gs://tensorrt-inference-server-repository/model_repository
  modelRepositoryPath: /data/model_repository
  numGpus: 1
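
Because the old `gs://` path is left in place as a comment, it is easy to edit the wrong line. As a quick check, the sketch below prints only the active (uncommented) `modelRepositoryPath` value. The function name `active_model_repo_path` is hypothetical, and the code assumes the simple one-line `key: value` layout shown above; it is not a general YAML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the uncommented modelRepositoryPath value
# from the values.yaml file given as $1.
active_model_repo_path() {
    # A commented line has "#modelRepositoryPath:" as its first field,
    # so the exact comparison below skips it.
    awk '$1 == "modelRepositoryPath:" { print $2 }' "$1"
}
```

With the values.yaml excerpt above, `active_model_repo_path values.yaml` would print `/data/model_repository`.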

Modify templates/deployment.yaml to add the local NFS mount via the volumeMounts and volumes entries:

...

spec:
  containers:
    - name: {{ .Chart.Name }}
      image: "{{ .Values.image.imageName }}"
      imagePullPolicy: {{ .Values.image.pullPolicy }}
      volumeMounts:
        - mountPath: /data/
          name: work-volume

...

  volumes:
    - name: work-volume
      hostPath:
        # directory locally mounted on host
        path: /home/k8sSHARE
        type: Directory

Deploy the inference server

cd ~/tensorrt-inference-server/deploy/single_server/
helm install --name example .

Verify deployment

helm ls
NAME            REVISION    UPDATED                     STATUS      CHART                           APP VERSION NAMESPACE
example         1           Wed Feb 26 15:46:18 2020    DEPLOYED    tensorrt-inference-server-1.0.0 1.0         default  
example-metrics 1           Tue Feb 25 17:45:54 2020    DEPLOYED    prometheus-operator-8.9.2       0.36.0      default

kubectl get pods
NAME                                                     READY   STATUS    RESTARTS   AGE
example-tensorrt-inference-server-f45d865dc-62c46        1/1     Running   0          53m

kubectl get svc
NAME                                        TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)                                        AGE
...
example-tensorrt-inference-server           LoadBalancer   10.150.77.138    192.168.60.150   8000:31165/TCP,8001:31408/TCP,8002:30566/TCP   53m
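
The EXTERNAL-IP column in the service listing above is the address the client uses later. To avoid copying it by hand, the following sketch extracts it from the `kubectl get svc` table. The function name `service_external_ip` is hypothetical, and the code assumes the default column order shown above (NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, ...).

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get svc` output on stdin and print the
# EXTERNAL-IP (4th column) of the service whose NAME equals $1.
service_external_ip() {
    awk -v name="$1" '$1 == name { print $4 }'
}

# Example usage (requires access to the cluster):
#   kubectl get svc | service_external_ip example-tensorrt-inference-server
```

For the listing above this would print `192.168.60.150`.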

Set up NGC login secret for nvcr.io

kubectl create secret docker-registry <your-secret-name> --docker-server=<your-registry-server> --docker-username=<your-registry-username> --docker-password=<your-registry-apikey> --docker-email=<your-email>

Parameter description:

docker-registry – the name you will use for this secret
docker-server – nvcr.io is the container registry for NGC
docker-username – for nvcr.io this is '$oauthtoken' (including quotes)
docker-password – this is the API Key you obtained earlier
docker-email – your NGC email address

Example (you will need to generate your own oauth token):

kubectl create secret docker-registry ngc-secret --docker-server=nvcr.io --docker-username='$oauthtoken' --docker-password=clkaw309f3jfaJ002EIVCJAC0Cpcklajser90wezxc98wdn09ICJA09xjc09j09JV00JV0JVCLR0WQE8ACZz --docker-email=john@example.com

Verify your secret has been stored:

kubectl get secrets
NAME                                                          TYPE                                  DATA   AGE
...
ngc-secret                                                    kubernetes.io/dockerconfigjson        1      106m
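
For context, a docker-registry secret holds a Docker config JSON whose `auth` entry is the Base64 encoding of `username:password`. The sketch below reproduces that encoding for the NGC case, so the result can be compared against what is stored in the secret. The function name `ngc_auth_field` is hypothetical, and the code assumes a `base64` utility is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: print the Base64 "auth" value for the NGC API key in $1.
ngc_auth_field() {
    # Single quotes keep $oauthtoken literal; it is the fixed nvcr.io username.
    # tr strips the line wrapping that some base64 implementations add.
    printf '%s:%s' '$oauthtoken' "$1" | base64 | tr -d '\n'
}
```

Passing your own API key as the only argument prints the expected `auth` string.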

Run TensorRT Client

kubectl apply -f trt-client.yaml

Verify it is running:

kubectl get pod tensorrt-client
NAME              READY   STATUS    RESTARTS   AGE
tensorrt-client   1/1     Running   0          5m

Run the inception test using the client Pod. The TensorRT Inference Server IP address is the EXTERNAL-IP shown for the service in the output of kubectl get svc.

kubectl exec -it tensorrt-client -- /bin/bash -c "image_client -u 192.168.60.150:8000 -m resnet50_netdef -s INCEPTION images/mug.jpg"
Request 0, batch size 1
Image 'images/mug.jpg':

504 (COFFEE MUG) = 0.723992




Run the inception test with batch size 2 and print the top 3 classifications

kubectl exec -it tensorrt-client -- /bin/bash -c "image_client -u 192.168.60.150:8000 -m resnet50_netdef -s INCEPTION images/ -c 3 -b 2"
Request 0, batch size 2
Image 'images//mug.jpg':

504 (COFFEE MUG) = 0.723992
968 (CUP) = 0.270953
967 (ESPRESSO) = 0.00115996


Image 'images//mug.jpg':

504 (COFFEE MUG) = 0.723992
968 (CUP) = 0.270953
967 (ESPRESSO) = 0.00115996
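
As a simple smoke test, the client output above can be reduced to the top predicted label and compared against the expected class. This is a rough sketch: the function name `top_label` is hypothetical, and it assumes classification lines keep the `INDEX (LABEL) = SCORE` shape shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read image_client output on stdin and print the label
# of the first classification line, e.g.
# "504 (COFFEE MUG) = 0.723992" yields "COFFEE MUG".
top_label() {
    # Lines such as "Request 0, batch size 1" do not match the pattern
    # and are dropped; head keeps only the first match.
    sed -n 's/^ *[0-9][0-9]* (\(.*\)) = .*/\1/p' | head -n 1
}
```

When piping from `kubectl exec`, dropping the `-it` flags avoids terminal control characters in the captured output.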
