This repository contains examples, scripts, and code for image classification using TensorFlow models converted to TensorRT. On the Jetson TX2, the converted models run roughly 2x to 6x faster in half precision than the original TensorFlow models in float precision, as the timings below show.
The table below lists the input size, inference latency, input and output tensor names, and preprocessing function for each pretrained model ported from the TensorFlow slim model zoo.
Model | Input Size | TensorRT (TX2 / Half) | TensorRT (TX2 / Float) | TensorFlow (TX2 / Float) | Input Name | Output Name | Preprocessing Fn. |
---|---|---|---|---|---|---|---|
inception_v1 | 224x224 | 7.98ms | 12.8ms | 27.6ms | input | InceptionV1/Logits/SpatialSqueeze | inception |
inception_v3 | 299x299 | 26.3ms | 46.1ms | 98.4ms | input | InceptionV3/Logits/SpatialSqueeze | inception |
inception_v4 | 299x299 | 52.1ms | 88.2ms | 176ms | input | InceptionV4/Logits/Logits/BiasAdd | inception |
inception_resnet_v2 | 299x299 | 53.0ms | 98.7ms | 168ms | input | InceptionResnetV2/Logits/Logits/BiasAdd | inception |
resnet_v1_50 | 224x224 | 15.7ms | 27.1ms | 63.9ms | input | resnet_v1_50/SpatialSqueeze | vgg |
resnet_v1_101 | 224x224 | 29.9ms | 51.8ms | 107ms | input | resnet_v1_101/SpatialSqueeze | vgg |
resnet_v1_152 | 224x224 | 42.6ms | 78.2ms | 157ms | input | resnet_v1_152/SpatialSqueeze | vgg |
resnet_v2_50 | 299x299 | 27.5ms | 44.4ms | 92.2ms | input | resnet_v2_50/SpatialSqueeze | inception |
resnet_v2_101 | 299x299 | 49.2ms | 83.1ms | 160ms | input | resnet_v2_101/SpatialSqueeze | inception |
resnet_v2_152 | 299x299 | 74.6ms | 124ms | 230ms | input | resnet_v2_152/SpatialSqueeze | inception |
mobilenet_v1_0p25_128 | 128x128 | 2.67ms | 2.65ms | 15.7ms | input | MobilenetV1/Logits/SpatialSqueeze | inception |
mobilenet_v1_0p5_160 | 160x160 | 3.95ms | 4.00ms | 16.9ms | input | MobilenetV1/Logits/SpatialSqueeze | inception |
mobilenet_v1_1p0_224 | 224x224 | 12.9ms | 12.9ms | 24.4ms | input | MobilenetV1/Logits/SpatialSqueeze | inception |
vgg_16 | 224x224 | 38.2ms | 79.2ms | 171ms | input | vgg_16/fc8/BiasAdd | vgg |
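To make the gains in the table concrete, the following minimal sketch computes the speedup of each TensorRT configuration relative to TensorFlow. The latencies are copied from four rows of the table; the `LATENCY_MS` and `speedups` names are illustrative and do not come from the repository.

```python
# Latencies in milliseconds, copied from four rows of the table above.
# Tuple order: (TensorRT half, TensorRT float, TensorFlow float).
LATENCY_MS = {
    "inception_v1": (7.98, 12.8, 27.6),
    "resnet_v1_50": (15.7, 27.1, 63.9),
    "mobilenet_v1_1p0_224": (12.9, 12.9, 24.4),
    "vgg_16": (38.2, 79.2, 171.0),
}


def speedups(latencies):
    """Return {model: (half_speedup, float_speedup)} relative to TensorFlow."""
    result = {}
    for model, (trt_half, trt_float, tf_float) in latencies.items():
        result[model] = (tf_float / trt_half, tf_float / trt_float)
    return result


if __name__ == "__main__":
    for model, (half, full) in speedups(LATENCY_MS).items():
        print(f"{model:24s} half: {half:.2f}x  float: {full:.2f}x")
```

Note that the MobileNet rows gain little from half precision: their half and float TensorRT latencies are nearly identical.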
The recorded times include data transfer to the GPU, network execution, and data transfer back from the GPU. They do not include preprocessing. See scripts/test_tf.py, scripts/test_trt.py, and src/test/test_trt.cu for implementation details. To reproduce the timings, run:
python scripts/test_tf.py
python scripts/test_trt.py
The timing results will be located in data/test_output_tf.txt and data/test_output_trt.txt. Note that you must download and convert the models (as in the quick start) prior to running the benchmark scripts.
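The measurement boundary described above (transfer in, execution, transfer out, with preprocessing excluded) can be sketched as a small timing helper. This is not the code used by the benchmark scripts; `mean_latency_ms` and its `run_once` argument are hypothetical names, and `run_once` is assumed to take an already preprocessed input, copy it to the GPU, run the network, copy the result back, and block until the result is on the host.

```python
import time


def mean_latency_ms(run_once, num_runs=50, num_warmup=5):
    """Average wall-clock time of run_once() in milliseconds.

    run_once is a hypothetical callable that performs one full inference
    (host-to-device copy, execution, device-to-host copy) and returns only
    after the result is available on the host.
    """
    for _ in range(num_warmup):
        run_once()  # warm-up iterations are executed but not timed
    start = time.perf_counter()
    for _ in range(num_runs):
        run_once()
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / num_runs
```

Preprocessing stays outside `run_once`, so it does not contribute to the reported latency.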
Run the following bash script to download all of the pretrained models.
source scripts/download_models.sh
If there are any models you don't want to use, simply remove their URL and name from the model lists in scripts/download_models.sh.
Next, because the TensorFlow models are provided in checkpoint format, we must convert them to frozen graphs for optimization with TensorRT. Run the scripts/models_to_frozen_graphs.py script.
python scripts/models_to_frozen_graphs.py
If you removed any models in the previous step, you must add 'exclude': True to the corresponding item in the NETS dictionary located in scripts/model_meta.py.
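As an illustration of the exclusion flag, the sketch below shows how an excluded entry might look and how a script could skip it. The keys other than 'exclude' and the `active_nets` helper are assumptions made for this example; the actual layout of NETS in scripts/model_meta.py may differ.

```python
# Hypothetical excerpt. Only the 'exclude' key is taken from the text above;
# the other keys are placeholders for whatever metadata the real file holds.
NETS = {
    "inception_v1": {
        "input_name": "input",
        "output_name": "InceptionV1/Logits/SpatialSqueeze",
    },
    "vgg_16": {
        "input_name": "input",
        "output_name": "vgg_16/fc8/BiasAdd",
        "exclude": True,  # Python's boolean literal is True, not true
    },
}


def active_nets(nets):
    """Return the entries that are not marked with 'exclude': True."""
    return {
        name: meta
        for name, meta in nets.items()
        if not meta.get("exclude", False)
    }


print(sorted(active_nets(NETS)))
```

Entries without an 'exclude' key are treated as included in this sketch.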
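Next, convert the frozen graphs to TensorRT plans by running the scripts/frozen_graphs_to_plans.py script.
python scripts/frozen_graphs_to_plans.py
Finally, classify an image with one of the generated plans. The arguments are the image, the plan file, the label file, and the model's input name, output name, and preprocessing function as listed in the table above.
./build/examples/classify_image/classify_image data/images/gordon_setter.jpg data/plans/inception_v1.plan data/imagenet_labels_1001.txt input InceptionV1/Logits/SpatialSqueeze inception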
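The classify_image invocation above takes its last three arguments from the model table. The sketch below assembles the same command line for any row. The `build_classify_command` helper is hypothetical, and the data/plans/<model>.plan naming is an assumption inferred from the inception_v1 example, not something the repository is known to guarantee for every model.

```python
import shlex


def build_classify_command(
    model,
    output_name,
    preprocessing,
    image="data/images/gordon_setter.jpg",
    input_name="input",
    labels="data/imagenet_labels_1001.txt",
    binary="./build/examples/classify_image/classify_image",
):
    """Return the classify_image argument list for one row of the table."""
    # Assumed naming scheme, inferred from data/plans/inception_v1.plan.
    plan = f"data/plans/{model}.plan"
    return [binary, image, plan, labels, input_name, output_name, preprocessing]


# Example: resnet_v1_50 uses the 'vgg' preprocessing function in the table.
cmd = build_classify_command("resnet_v1_50", "resnet_v1_50/SpatialSqueeze", "vgg")
print(shlex.join(cmd))
```

The resulting list can be passed directly to `subprocess.run` from the repository root once the corresponding plan exists.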