{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "[Home Page](Start_Here.ipynb)\n", "[Previous Notebook](Introduction_to_Deepstream_and_Gstreamer.ipynb)\n", "[1](Introduction_to_Deepstream_and_Gstreamer.ipynb#)\n", "[2]\n", "[3](Introduction_to_Multi-DNN_pipeline.ipynb)\n", "[4](Multi-stream_pipeline.ipynb)\n", "[5](Multi-stream_Multi_DNN.ipynb)\n", "[Next Notebook](Introduction_to_Multi-DNN_pipeline.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Getting started with Deepstream pipeline\n", "\n", "In this notebook, you will get started with DeepStream's Python bindings and workflow, and build a 4-class object detection pipeline.\n", "\n", "\n", "**Contents of this Notebook :**\n", "\n", "- [NVIDIA DeepStream Plugins](#NVIDIA-DeepStream-Plugins) \n", "    - [Nvinfer](#Nvinfer)\n", "    - [Nvvidconv](#Nvvidconv)\n", "    - [Nvosd](#Nvosd)\n", "- [Building the pipeline](#Building-the-pipeline)\n", "- [Understanding the configuration file](#Understanding-the-configuration-file)\n", "- [Working with the Metadata](#Working-with-the-Metadata) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will be building a 4-class object detection pipeline as shown in the illustration below. \n", "\n", "![Test1](images/test1.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The pipeline uses several DeepStream plugins. Let us look at each of them to understand what they do.\n", "\n", "## NVIDIA DeepStream Plugins\n", "\n", "### Nvinfer\n", "\n", "The nvinfer plugin provides [TensorRT](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html)-based inference for detection and tracking. 
The low-level library (libnvds_infer) operates on either float RGB or BGR planar data with dimensions of Network Height and Network Width. The plugin accepts NV12/RGBA data from upstream components like the decoder, muxer, and dewarper.\n", "The Gst-nvinfer plugin also performs preprocessing operations such as format conversion, scaling, and mean subtraction, and produces the final float RGB/BGR planar data, which is passed to the low-level library. The low-level library uses the TensorRT engine for inferencing. It outputs each classified object’s class and each detected object’s bounding boxes (Bboxes) after clustering.\n", "\n", "![NVINFER](images/nvinfer.png)\n", "\n", "### Nvvidconv \n", "\n", "The nvvidconv plugin performs color format conversions, which are required to make the data ready for the nvosd plugin.\n", "\n", "![NVVIDCONV](images/nvvidconv.png)\n", "\n", "\n", "### Nvosd\n", "\n", "The nvosd plugin draws bounding boxes, text, and RoI (Regions of Interest) polygons (Polygons are presented as a set of lines). The plugin accepts an RGBA buffer with attached metadata from the upstream component. It\n", "draws bounding boxes, which may be shaded depending on the configuration (e.g. width, color, and opacity) of a given bounding box. It also draws text and RoI polygons at specified locations in the frame. Text and polygon parameters are configurable through metadata.\n", "\n", "![NVOSD](images/nvosd.png)\n", "\n", "\n", "With these plugins in mind, let us start building the pipeline." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Building the pipeline \n", "\n", "![Test1](images/test1.png)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Import Required Libraries \n", "import sys\n", "sys.path.append('../source_code')\n", "import gi\n", "import time\n", "gi.require_version('Gst', '1.0')\n", "from gi.repository import GObject, Gst\n", "from common.bus_call import bus_call\n", "import pyds\n", "\n", "# Defining the Class Labels\n", "PGIE_CLASS_ID_VEHICLE = 0\n", "PGIE_CLASS_ID_BICYCLE = 1\n", "PGIE_CLASS_ID_PERSON = 2\n", "PGIE_CLASS_ID_ROADSIGN = 3\n", "\n", "# Defining the input and output video files \n", "INPUT_VIDEO_NAME = '/opt/nvidia/deepstream/deepstream-5.0/samples/streams/sample_720p.h264'\n", "OUTPUT_VIDEO_NAME = \"../source_code/N1/ds_out.mp4\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We define a function `make_elm_or_print_err()` to create our elements and report an error if creation fails.\n", "\n", "Elements are created using the `Gst.ElementFactory.make()` function, which is part of the GStreamer library." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## Make Element or Print Error and any other detail\n", "def make_elm_or_print_err(factoryname, name, printedname, detail=\"\"):\n", "    print(\"Creating\", printedname)\n", "    elm = Gst.ElementFactory.make(factoryname, name)\n", "    if not elm:\n", "        sys.stderr.write(\"Unable to create \" + printedname + \" \\n\")\n", "        if detail:\n", "            sys.stderr.write(detail)\n", "    return elm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Initialise GStreamer and Create an Empty Pipeline" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Standard GStreamer initialization\n", "GObject.threads_init()\n", "Gst.init(None)\n", "\n", "\n", "# Create gstreamer elements\n", "# Create Pipeline element that will form a connection of other elements\n", "print(\"Creating Pipeline \\n \")\n", "pipeline = Gst.Pipeline()\n", "\n", "if not pipeline:\n", "    sys.stderr.write(\" Unable to create Pipeline \\n\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Create Elements that are required for our pipeline " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "########### Create Elements required for the Pipeline ########### \n", "# Source element for reading from the file\n", "source = make_elm_or_print_err(\"filesrc\", \"file-source\",\"Source\")\n", "# The input file is an elementary H.264 stream, so we need an h264 parser\n", "h264parser = make_elm_or_print_err(\"h264parse\", \"h264-parser\",\"h264 parse\")\n", "# Use nvv4l2decoder for hardware-accelerated decoding on the GPU\n", "decoder = make_elm_or_print_err(\"nvv4l2decoder\", \"nvv4l2-decoder\",\"Nvv4l2 Decoder\")\n", "# Create nvstreammux instance to form batches from one or more sources.\n", "streammux = make_elm_or_print_err(\"nvstreammux\", \"Stream-muxer\",'NvStreamMux')\n", "# Use nvinfer to run 
inferencing on the decoder's output; inference behaviour is set through the config file\n", "pgie = make_elm_or_print_err(\"nvinfer\", \"primary-inference\" ,\"pgie\")\n", "# Use convertor to convert from NV12 to RGBA as required by nvosd\n", "nvvidconv = make_elm_or_print_err(\"nvvideoconvert\", \"convertor\",\"nvvidconv\")\n", "# Create OSD to draw on the converted RGBA buffer\n", "nvosd = make_elm_or_print_err(\"nvdsosd\", \"onscreendisplay\",\"nvosd\")\n", "# Queue to buffer the OSD output before it is encoded and saved\n", "queue = make_elm_or_print_err(\"queue\", \"queue\", \"Queue\")\n", "# Use a second convertor to convert the OSD output to a format the encoder accepts\n", "nvvidconv2 = make_elm_or_print_err(\"nvvideoconvert\", \"convertor2\",\"nvvidconv2\")\n", "# Use an encoder so that the output can be saved as a video file\n", "encoder = make_elm_or_print_err(\"avenc_mpeg4\", \"encoder\", \"Encoder\")\n", "# Parse output from Encoder \n", "codeparser = make_elm_or_print_err(\"mpeg4videoparse\", \"mpeg4-parser\", 'Code Parser')\n", "# Create a container\n", "container = make_elm_or_print_err(\"qtmux\", \"qtmux\", \"Container\")\n", "# Create Sink for storing the output \n", "sink = make_elm_or_print_err(\"filesink\", \"filesink\", \"Sink\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have created the elements, we can set various properties for our pipeline. \n", "\n", "### Understanding the configuration file \n", "\n", "We set the `config-file-path` property of our nvinfer (inference plugin) so that it points to the file `dstest1_pgie_config.txt`.\n", "\n", "You can have a look at the [file](../source_code/N1/dstest1_pgie_config.txt).\n", "\n", "Here are some parts of the configuration file : \n", "\n", "```\n", "# Copyright (c) 2020 NVIDIA Corporation. 
All rights reserved.\n", "#\n", "# NVIDIA Corporation and its licensors retain all intellectual property\n", "# and proprietary rights in and to this software, related documentation\n", "# and any modifications thereto. Any use, reproduction, disclosure or\n", "# distribution of this software and related documentation without an express\n", "# license agreement from NVIDIA Corporation is strictly prohibited.\n", "\n", "[property]\n", "gpu-id=0\n", "net-scale-factor=0.0039215697906911373\n", "model-file=/opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/resnet10.caffemodel\n", "proto-file=/opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/resnet10.prototxt\n", "#model-engine-file=/opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_fp32.engine\n", "labelfile-path=/opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/labels.txt\n", "int8-calib-file=/opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/cal_trt.bin\n", "force-implicit-batch-dim=1\n", "batch-size=1\n", "network-mode=1\n", "process-mode=1\n", "model-color-format=0\n", "num-detected-classes=4\n", "interval=0\n", "gie-unique-id=1\n", "output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid\n", "\n", "[class-attrs-all]\n", "pre-cluster-threshold=0.2\n", "eps=0.2\n", "group-threshold=1\n", "```\n", "\n", "Here we define all the parameters of our model. In this example we use model-file `resnet10`. `Nvinfer` creates a TensorRT engine specific to the host GPU to accelerate its inference performance." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "############ Set properties for the Elements ############\n", "print(\"Playing file \",INPUT_VIDEO_NAME)\n", "# Set Input File Name \n", "source.set_property('location', INPUT_VIDEO_NAME)\n", "# Set Input Width , Height and Batch Size \n", "streammux.set_property('width', 1920)\n", "streammux.set_property('height', 1080)\n", "streammux.set_property('batch-size', 1)\n", "# Timeout in microseconds to wait after the first buffer is available \n", "# to push the batch even if a complete batch is not formed.\n", "streammux.set_property('batched-push-timeout', 4000000)\n", "# Set Configuration file for nvinfer \n", "pgie.set_property('config-file-path', \"../source_code/N1/dstest1_pgie_config.txt\")\n", "# Set Encoder bitrate for output video\n", "encoder.set_property(\"bitrate\", 2000000)\n", "# Set Output file name and disable sync and async\n", "sink.set_property(\"location\", OUTPUT_VIDEO_NAME)\n", "sink.set_property(\"sync\", 0)\n", "sink.set_property(\"async\", 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now add all the elements to the pipeline, link them in the required order, and set up the GStreamer bus so that its messages are fed to an event loop. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "########## Add and Link Elements in the Pipeline ########## \n", "\n", "print(\"Adding elements to Pipeline \\n\")\n", "\n", "pipeline.add(source)\n", "pipeline.add(h264parser)\n", "pipeline.add(decoder)\n", "pipeline.add(streammux)\n", "pipeline.add(pgie)\n", "pipeline.add(nvvidconv)\n", "pipeline.add(nvosd)\n", "pipeline.add(queue)\n", "pipeline.add(nvvidconv2)\n", "pipeline.add(encoder)\n", "pipeline.add(codeparser)\n", "pipeline.add(container)\n", "pipeline.add(sink)\n", "\n", "# We now link the elements together \n", "# file-source -> h264-parser -> nvv4l2-decoder -> streammux -> nvinfer -> nvvidconv -> nvosd ->\n", "# queue -> nvvidconv2 -> encoder -> parser -> container -> sink -> output-file\n", "print(\"Linking elements in the Pipeline \\n\")\n", "source.link(h264parser)\n", "h264parser.link(decoder)\n", "\n", "##### Creating Sink pad and source pads and linking them together \n", "\n", "# Request a sink pad from Streammux \n", "sinkpad = streammux.get_request_pad(\"sink_0\")\n", "if not sinkpad:\n", "    sys.stderr.write(\" Unable to get the sink pad of streammux \\n\")\n", "# Get the source pad of the Decoder \n", "srcpad = decoder.get_static_pad(\"src\")\n", "if not srcpad:\n", "    sys.stderr.write(\" Unable to get source pad of decoder \\n\")\n", " \n", "srcpad.link(sinkpad)\n", "streammux.link(pgie)\n", "pgie.link(nvvidconv)\n", "nvvidconv.link(nvosd)\n", "nvosd.link(queue)\n", "queue.link(nvvidconv2)\n", "nvvidconv2.link(encoder)\n", "encoder.link(codeparser)\n", "codeparser.link(container)\n", "container.link(sink)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# create an event loop and feed gstreamer bus messages to it\n", "loop = GObject.MainLoop()\n", "bus = pipeline.get_bus()\n", "bus.add_signal_watch()\n", "bus.connect (\"message\", bus_call, loop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Working with the 
Metadata \n", "\n", "Our pipeline now carries the metadata forward, but we have not done anything with it so far. As shown in the pipeline diagram above, we will now create a callback function that writes relevant data onto the frame, and attach it as a probe to the sink pad of the `nvosd` element so that it is called for every buffer." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "############## Working with the Metadata ################\n", "\n", "def osd_sink_pad_buffer_probe(pad,info,u_data):\n", "    \n", "    # Initializing object counter with 0.\n", "    obj_counter = {\n", "        PGIE_CLASS_ID_VEHICLE:0,\n", "        PGIE_CLASS_ID_PERSON:0,\n", "        PGIE_CLASS_ID_BICYCLE:0,\n", "        PGIE_CLASS_ID_ROADSIGN:0\n", "    }\n", "    # Set frame_number & rectangles to draw as 0 \n", "    frame_number=0\n", "    num_rects=0\n", "    \n", "    gst_buffer = info.get_buffer()\n", "    if not gst_buffer:\n", "        print(\"Unable to get GstBuffer \")\n", "        return\n", "\n", "    # Retrieve batch metadata from the gst_buffer\n", "    # Note that pyds.gst_buffer_get_nvds_batch_meta() expects the\n", "    # C address of gst_buffer as input, which is obtained with hash(gst_buffer)\n", "    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))\n", "    l_frame = batch_meta.frame_meta_list\n", "    while l_frame is not None:\n", "        try:\n", "            # Note that l_frame.data needs a cast to pyds.NvDsFrameMeta\n", "            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)\n", "        except StopIteration:\n", "            break\n", "        \n", "        # Get frame number, number of rectangles to draw and object metadata\n", "        frame_number=frame_meta.frame_num\n", "        num_rects = frame_meta.num_obj_meta\n", "        l_obj=frame_meta.obj_meta_list\n", "        \n", "        while l_obj is not None:\n", "            try:\n", "                # Casting l_obj.data to pyds.NvDsObjectMeta\n", "                obj_meta=pyds.NvDsObjectMeta.cast(l_obj.data)\n", "            except StopIteration:\n", "                break\n", "            # Increment the counter for the object's class and set the box border color\n", "            # (red, green, blue, alpha): the values below select blue\n", "            obj_counter[obj_meta.class_id] += 1\n", "            
obj_meta.rect_params.border_color.set(0.0, 0.0, 1.0, 0.0)\n", "            try: \n", "                l_obj=l_obj.next\n", "            except StopIteration:\n", "                break\n", "        ################## Setting Metadata Display configuration ############### \n", "        # Acquiring a display meta object.\n", "        display_meta=pyds.nvds_acquire_display_meta_from_pool(batch_meta)\n", "        display_meta.num_labels = 1\n", "        py_nvosd_text_params = display_meta.text_params[0]\n", "        # Setting display text to be shown on screen\n", "        py_nvosd_text_params.display_text = \"Frame Number={} Number of Objects={} Vehicle_count={} Person_count={}\".format(frame_number, num_rects, obj_counter[PGIE_CLASS_ID_VEHICLE], obj_counter[PGIE_CLASS_ID_PERSON])\n", "        # Now set the offsets where the string should appear\n", "        py_nvosd_text_params.x_offset = 10\n", "        py_nvosd_text_params.y_offset = 12\n", "        # Font , font-color and font-size\n", "        py_nvosd_text_params.font_params.font_name = \"Serif\"\n", "        py_nvosd_text_params.font_params.font_size = 10\n", "        # Set(red, green, blue, alpha); Set to White\n", "        py_nvosd_text_params.font_params.font_color.set(1.0, 1.0, 1.0, 1.0)\n", "        # Text background color\n", "        py_nvosd_text_params.set_bg_clr = 1\n", "        # Set(red, green, blue, alpha); set to Black\n", "        py_nvosd_text_params.text_bg_clr.set(0.0, 0.0, 0.0, 1.0)\n", "        # Using pyds.get_string() to get display_text as string to print in notebook\n", "        print(pyds.get_string(py_nvosd_text_params.display_text))\n", "        pyds.nvds_add_display_meta_to_frame(frame_meta, display_meta)\n", "        \n", "        ############################################################################\n", "        \n", "        try:\n", "            l_frame=l_frame.next\n", "        except StopIteration:\n", "            break\n", "    return Gst.PadProbeReturn.OK" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's add a probe to be informed of the generated metadata. We add the probe to the sink pad \n", "# of the osd element, since by that time the buffer will have received all the 
metadata.\n", "\n", "osdsinkpad = nvosd.get_static_pad(\"sink\")\n", "if not osdsinkpad:\n", "    sys.stderr.write(\" Unable to get sink pad of nvosd \\n\")\n", " \n", "osdsinkpad.add_probe(Gst.PadProbeType.BUFFER, osd_sink_pad_buffer_probe, 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With everything defined, we can start playback and listen for events." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# start play back and listen to events\n", "print(\"Starting pipeline \\n\")\n", "start_time = time.time()\n", "pipeline.set_state(Gst.State.PLAYING)\n", "try:\n", "    loop.run()\n", "except:\n", "    pass\n", "# cleanup\n", "pipeline.set_state(Gst.State.NULL)\n", "print(\"--- %s seconds ---\" % (time.time() - start_time))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Convert video profile to be compatible with Jupyter notebook\n", "!ffmpeg -loglevel panic -y -an -i ../source_code/N1/ds_out.mp4 -vcodec libx264 -pix_fmt yuv420p -profile:v baseline -level 3 ../source_code/N1/output.mp4" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Display the Output\n", "from IPython.display import HTML\n", "HTML(\"\"\"\n", "