Additional inception fixes

Neal Wu, 8 years ago
Parent
Current commit
c539b46db8

+ 13 - 24
inception/README.md

@@ -111,15 +111,12 @@ ready to train or evaluate with the ImageNet data set.
 intensive task and depending on your compute setup may take several days or even
 weeks.
 
-*Before proceeding* please read the [Convolutional Neural Networks]
-(https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial in
-particular focus on [Training a Model Using Multiple GPU Cards]
-(https://www.tensorflow.org/tutorials/deep_cnn/index.html#training-a-model-using-multiple-gpu-cards)
-. The model training method is nearly identical to that described in the
+*Before proceeding* please read the [Convolutional Neural Networks](https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial; in
+particular, focus on [Training a Model Using Multiple GPU Cards](https://www.tensorflow.org/tutorials/deep_cnn/index.html#launching_and_training_the_model_on_multiple_gpu_cards). The model training method is nearly identical to that described in the
 CIFAR-10 multi-GPU model training. Briefly, the model training
 
-*   Places an individual model replica on each GPU. Split the batch across the
-    GPUs.
+*   Places an individual model replica on each GPU.
+*   Splits the batch across the GPUs.
 *   Updates model parameters synchronously by waiting for all GPUs to finish
     processing a batch of data.
 
@@ -245,11 +242,9 @@ We term each machine that maintains model parameters a `ps`, short for
 `ps` as the model parameters may be sharded across multiple machines.
 
 Variables may be updated with synchronous or asynchronous gradient updates. One
-may construct a an [`Optimizer`]
-(https://www.tensorflow.org/api_docs/python/train.html#optimizers) in TensorFlow
-that constructs the necessary graph for either case diagrammed below from
-TensorFlow [Whitepaper]
-(http://download.tensorflow.org/paper/whitepaper2015.pdf):
+may construct a an [`Optimizer`](https://www.tensorflow.org/api_docs/python/train.html#optimizers) in TensorFlow
+that constructs the necessary graph for either case diagrammed below from the
+TensorFlow [Whitepaper](http://download.tensorflow.org/paper/whitepaper2015.pdf):
 
 <div style="width:40%; margin:auto; margin-bottom:10px; margin-top:20px;">
   <img style="width:100%"
@@ -380,10 +375,8 @@ training Inception in a distributed manner.
 Evaluating an Inception v3 model on the ImageNet 2012 validation data set
 requires running a separate binary.
 
-The evaluation procedure is nearly identical to [Evaluating a Model]
-(https://www.tensorflow.org/tutorials/deep_cnn/index.html#evaluating-a-model)
-described in the [Convolutional Neural Network]
-(https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial.
+The evaluation procedure is nearly identical to [Evaluating a Model](https://www.tensorflow.org/tutorials/deep_cnn/index.html#evaluating_a_model)
+described in the [Convolutional Neural Network](https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial.
 
 **WARNING** Be careful not to run the evaluation and training binary on the same
 GPU or else you might run out of memory. Consider running the evaluation on a
@@ -438,8 +431,7 @@ daisy, dandelion, roses, sunflowers, tulips
 There is a single automated script that downloads the data set and converts it
 to the TFRecord format. Much like the ImageNet data set, each record in the
 TFRecord format is a serialized `tf.Example` proto whose entries include a
-JPEG-encoded string and an integer label. Please see [`parse_example_proto`]
-(inception/image_processing.py) for details.
+JPEG-encoded string and an integer label. Please see [`parse_example_proto`](inception/image_processing.py) for details.
 
 The script just takes a few minutes to run depending your network connection
 speed for downloading and processing the images. Your hard disk requires 200MB
@@ -471,14 +463,12 @@ and `validation-?????-of-00002`, respectively.
 **NOTE** If you wish to prepare a custom image data set for transfer learning,
 you will need to invoke [`build_image_data.py`](inception/data/build_image_data.py) on
 your custom data set. Please see the associated options and assumptions behind
-this script by reading the comments section of [`build_image_data.py`]
-(inception/data/build_image_data.py). Also, if your custom data has a different
+this script by reading the comments section of [`build_image_data.py`](inception/data/build_image_data.py). Also, if your custom data has a different
 number of examples or classes, you need to change the appropriate values in
 [`imagenet_data.py`](inception/imagenet_data.py).
 
 The second piece you will need is a trained Inception v3 image model. You have
-the option of either training one yourself (See [How to Train from Scratch]
-(#how-to-train-from-scratch) for details) or you can download a pre-trained
+the option of either training one yourself (See [How to Train from Scratch](#how-to-train-from-scratch) for details) or you can download a pre-trained
 model like so:
 
 ```shell
@@ -806,8 +796,7 @@ comments in [`image_processing.py`](inception/image_processing.py) for more deta
 #### The model runs out of CPU memory.
 
 In lieu of buying more CPU memory, an easy fix is to decrease
-`--input_queue_memory_factor`. See [Adjusting Memory Demands]
-(#adjusting-memory-demands).
+`--input_queue_memory_factor`. See [Adjusting Memory Demands](#adjusting-memory-demands).
 
 #### The model runs out of GPU memory.
 

+ 4 - 5
inception/inception/data/build_image_data.py

@@ -32,7 +32,7 @@ a sharded data set consisting of TFRecord files
   train_directory/train-00000-of-01024
   train_directory/train-00001-of-01024
   ...
-  train_directory/train-00127-of-01024
+  train_directory/train-01023-of-01024
 
 and
 
@@ -50,7 +50,7 @@ contains the following fields:
   image/width: integer, image width in pixels
   image/colorspace: string, specifying the colorspace, always 'RGB'
   image/channels: integer, specifying the number of channels, always 3
-  image/format: string, specifying the format, always'JPEG'
+  image/format: string, specifying the format, always 'JPEG'
 
   image/filename: string containing the basename of the image file
             e.g. 'n01440764_10026.JPEG' or 'ILSVRC2012_val_00000293.JPEG'
@@ -60,7 +60,7 @@ contains the following fields:
   image/class/text: string specifying the human-readable version of the label
     e.g. 'dog'
 
-If you data set involves bounding boxes, please look at build_imagenet_data.py.
+If your data set involves bounding boxes, please look at build_imagenet_data.py.
 """
 from __future__ import absolute_import
 from __future__ import division
@@ -72,7 +72,6 @@ import random
 import sys
 import threading
 
-
 import numpy as np
 import tensorflow as tf
 
@@ -306,7 +305,7 @@ def _process_image_files(name, filenames, texts, labels, num_shards):
   spacing = np.linspace(0, len(filenames), FLAGS.num_threads + 1).astype(np.int)
   ranges = []
   for i in range(len(spacing) - 1):
-    ranges.append([spacing[i], spacing[i+1]])
+    ranges.append([spacing[i], spacing[i + 1]])
 
   # Launch a thread for each batch.
   print('Launching %d threads for spacings: %s' % (FLAGS.num_threads, ranges))
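
Review note on the range computation in this hunk: a minimal sketch of how `np.linspace` splits the file list into one contiguous `[start, end)` range per thread. `thread_ranges`, `num_files` and `num_threads` are hypothetical stand-ins, not names from the script. The sketch passes the builtin `int` as the dtype because the `np.int` alias used in the context line above was removed in NumPy 1.24.

```python
import numpy as np


def thread_ranges(num_files, num_threads):
    """Split range(num_files) into num_threads contiguous [start, end) ranges."""
    # num_threads + 1 evenly spaced boundaries, truncated to integer indices.
    spacing = np.linspace(0, num_files, num_threads + 1).astype(int)
    # Pair each boundary with its successor, as the loop in the hunk does.
    return [[int(spacing[i]), int(spacing[i + 1])]
            for i in range(len(spacing) - 1)]


ranges = thread_ranges(num_files=10, num_threads=4)
print(ranges)  # [[0, 2], [2, 5], [5, 7], [7, 10]]
```

Because the boundaries are truncated rather than rounded, range sizes can differ by one file, but the ranges stay contiguous and together cover every index exactly once.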

+ 4 - 5
inception/inception/data/build_imagenet_data.py

@@ -36,7 +36,7 @@ a sharded data set consisting of 1024 and 128 TFRecord files, respectively.
   train_directory/train-00000-of-01024
   train_directory/train-00001-of-01024
   ...
-  train_directory/train-00127-of-01024
+  train_directory/train-01023-of-01024
 
 and
 
@@ -54,7 +54,7 @@ serialized Example proto. The Example proto contains the following fields:
   image/width: integer, image width in pixels
   image/colorspace: string, specifying the colorspace, always 'RGB'
   image/channels: integer, specifying the number of channels, always 3
-  image/format: string, specifying the format, always'JPEG'
+  image/format: string, specifying the format, always 'JPEG'
 
   image/filename: string containing the basename of the image file
             e.g. 'n01440764_10026.JPEG' or 'ILSVRC2012_val_00000293.JPEG'
@@ -80,7 +80,7 @@ serialized Example proto. The Example proto contains the following fields:
 Note that the length of xmin is identical to the length of xmax, ymin and ymax
 for each example.
 
-Running this script using 16 threads may take around ~2.5 hours on a HP Z420.
+Running this script using 16 threads may take around ~2.5 hours on an HP Z420.
 """
 from __future__ import absolute_import
 from __future__ import division
@@ -92,7 +92,6 @@ import random
 import sys
 import threading
 
-
 import numpy as np
 import tensorflow as tf
 
@@ -435,7 +434,7 @@ def _process_image_files(name, filenames, synsets, labels, humans,
   ranges = []
   threads = []
   for i in range(len(spacing) - 1):
-    ranges.append([spacing[i], spacing[i+1]])
+    ranges.append([spacing[i], spacing[i + 1]])
 
   # Launch a thread for each batch.
   print('Launching %d threads for spacings: %s' % (FLAGS.num_threads, ranges))
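
Review note on the docstring fix in this file: shards are numbered from zero, so the last of 1024 training shards is `train-01023-of-01024`, which is what the corrected line now shows. A minimal sketch of that naming scheme follows. `shard_filenames` is a hypothetical helper, and the zero-padded `%.5d` format is an assumption inferred from the names in the docstring, not a confirmed quote of the script.

```python
def shard_filenames(name, num_shards):
    """Return zero-based, zero-padded shard names such as train-00000-of-01024."""
    # Assumed format: five-digit shard index and five-digit shard count.
    return ['%s-%.5d-of-%.5d' % (name, shard, num_shards)
            for shard in range(num_shards)]


train = shard_filenames('train', 1024)
validation = shard_filenames('validation', 128)
print(train[0], train[-1])  # train-00000-of-01024 train-01023-of-01024
print(validation[-1])       # validation-00127-of-00128
```

Under this scheme the old `train-00127-of-01024` would be the 128th training shard rather than the last one.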

+ 1 - 1
inception/inception/data/download_and_preprocess_flowers.sh

@@ -35,7 +35,7 @@
 set -e
 
 if [ -z "$1" ]; then
-  echo "usage download_and_preprocess_flowers.sh [data dir]"
+  echo "Usage: download_and_preprocess_flowers.sh [data dir]"
   exit
 fi
 

+ 1 - 1
inception/inception/data/download_and_preprocess_flowers_mac.sh

@@ -35,7 +35,7 @@
 set -e
 
 if [ -z "$1" ]; then
-  echo "usage download_and_preprocess_flowers.sh [data dir]"
+  echo "Usage: download_and_preprocess_flowers.sh [data dir]"
   exit
 fi
 

+ 2 - 2
inception/inception/data/download_and_preprocess_imagenet.sh

@@ -49,7 +49,7 @@
 set -e
 
 if [ -z "$1" ]; then
-  echo "usage download_and_preprocess_imagenet.sh [data dir]"
+  echo "Usage: download_and_preprocess_imagenet.sh [data dir]"
   exit
 fi
 
@@ -84,7 +84,7 @@ BOUNDING_BOX_FILE="${SCRATCH_DIR}/imagenet_2012_bounding_boxes.csv"
 BOUNDING_BOX_DIR="${SCRATCH_DIR}bounding_boxes/"
 
 "${BOUNDING_BOX_SCRIPT}" "${BOUNDING_BOX_DIR}" "${LABELS_FILE}" \
- | sort >"${BOUNDING_BOX_FILE}"
+ | sort > "${BOUNDING_BOX_FILE}"
 echo "Finished downloading and preprocessing the ImageNet data."
 
 # Build the TFRecords version of the ImageNet data.

+ 1 - 1
inception/inception/data/download_imagenet.sh

@@ -24,7 +24,7 @@
 # downloading the raw images.
 #
 # usage:
-#  ./download_imagenet.sh [dirname]
+#  ./download_imagenet.sh [dir name] [synsets file]
 set -e
 
 if [ "x$IMAGENET_ACCESS_KEY" == x -o "x$IMAGENET_USERNAME" == x ]; then