Additional inception fixes

Neal Wu 8 years ago
commit c539b46db8

+ 13 - 24
inception/README.md

@@ -111,15 +111,12 @@ ready to train or evaluate with the ImageNet data set.
 intensive task and depending on your compute setup may take several days or even
 weeks.
 
-*Before proceeding* please read the [Convolutional Neural Networks]
-(https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial in
-particular focus on [Training a Model Using Multiple GPU Cards]
-(https://www.tensorflow.org/tutorials/deep_cnn/index.html#training-a-model-using-multiple-gpu-cards)
-. The model training method is nearly identical to that described in the
+*Before proceeding* please read the [Convolutional Neural Networks](https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial; in
+particular, focus on [Training a Model Using Multiple GPU Cards](https://www.tensorflow.org/tutorials/deep_cnn/index.html#launching_and_training_the_model_on_multiple_gpu_cards). The model training method is nearly identical to that described in the
 CIFAR-10 multi-GPU model training. Briefly, the model training
 
-*   Places an individual model replica on each GPU. Split the batch across the
-    GPUs.
+*   Places an individual model replica on each GPU.
+*   Splits the batch across the GPUs.
 *   Updates model parameters synchronously by waiting for all GPUs to finish
     processing a batch of data.
 
@@ -245,11 +242,9 @@ We term each machine that maintains model parameters a `ps`, short for
 `ps` as the model parameters may be sharded across multiple machines.
 
 Variables may be updated with synchronous or asynchronous gradient updates. One
-may construct a an [`Optimizer`]
-(https://www.tensorflow.org/api_docs/python/train.html#optimizers) in TensorFlow
-that constructs the necessary graph for either case diagrammed below from
-TensorFlow [Whitepaper]
-(http://download.tensorflow.org/paper/whitepaper2015.pdf):
+may construct a an [`Optimizer`](https://www.tensorflow.org/api_docs/python/train.html#optimizers) in TensorFlow
+that constructs the necessary graph for either case diagrammed below from the
+TensorFlow [Whitepaper](http://download.tensorflow.org/paper/whitepaper2015.pdf):
 
 <div style="width:40%; margin:auto; margin-bottom:10px; margin-top:20px;">
   <img style="width:100%"
@@ -380,10 +375,8 @@ training Inception in a distributed manner.
 Evaluating an Inception v3 model on the ImageNet 2012 validation data set
 requires running a separate binary.
 
-The evaluation procedure is nearly identical to [Evaluating a Model]
-(https://www.tensorflow.org/tutorials/deep_cnn/index.html#evaluating-a-model)
-described in the [Convolutional Neural Network]
-(https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial.
+The evaluation procedure is nearly identical to [Evaluating a Model](https://www.tensorflow.org/tutorials/deep_cnn/index.html#evaluating_a_model)
+described in the [Convolutional Neural Network](https://www.tensorflow.org/tutorials/deep_cnn/index.html) tutorial.
 
 **WARNING** Be careful not to run the evaluation and training binary on the same
 GPU or else you might run out of memory. Consider running the evaluation on a
@@ -438,8 +431,7 @@ daisy, dandelion, roses, sunflowers, tulips
 There is a single automated script that downloads the data set and converts it
 to the TFRecord format. Much like the ImageNet data set, each record in the
 TFRecord format is a serialized `tf.Example` proto whose entries include a
-JPEG-encoded string and an integer label. Please see [`parse_example_proto`]
-(inception/image_processing.py) for details.
+JPEG-encoded string and an integer label. Please see [`parse_example_proto`](inception/image_processing.py) for details.
 
 The script just takes a few minutes to run depending your network connection
 speed for downloading and processing the images. Your hard disk requires 200MB
@@ -471,14 +463,12 @@ and `validation-?????-of-00002`, respectively.
 **NOTE** If you wish to prepare a custom image data set for transfer learning,
 you will need to invoke [`build_image_data.py`](inception/data/build_image_data.py) on
 your custom data set. Please see the associated options and assumptions behind
-this script by reading the comments section of [`build_image_data.py`]
-(inception/data/build_image_data.py). Also, if your custom data has a different
+this script by reading the comments section of [`build_image_data.py`](inception/data/build_image_data.py). Also, if your custom data has a different
 number of examples or classes, you need to change the appropriate values in
 [`imagenet_data.py`](inception/imagenet_data.py).
 
 The second piece you will need is a trained Inception v3 image model. You have
-the option of either training one yourself (See [How to Train from Scratch]
-(#how-to-train-from-scratch) for details) or you can download a pre-trained
+the option of either training one yourself (See [How to Train from Scratch](#how-to-train-from-scratch) for details) or you can download a pre-trained
 model like so:
 
 ```shell
@@ -806,8 +796,7 @@ comments in [`image_processing.py`](inception/image_processing.py) for more deta
 #### The model runs out of CPU memory.
 
 In lieu of buying more CPU memory, an easy fix is to decrease
-`--input_queue_memory_factor`. See [Adjusting Memory Demands]
-(#adjusting-memory-demands).
+`--input_queue_memory_factor`. See [Adjusting Memory Demands](#adjusting-memory-demands).
 
 #### The model runs out of GPU memory.
 

+ 4 - 5
inception/inception/data/build_image_data.py

@@ -32,7 +32,7 @@ a sharded data set consisting of TFRecord files
   train_directory/train-00000-of-01024
   train_directory/train-00001-of-01024
   ...
-  train_directory/train-00127-of-01024
+  train_directory/train-01023-of-01024
 
 and
 
@@ -50,7 +50,7 @@ contains the following fields:
   image/width: integer, image width in pixels
   image/colorspace: string, specifying the colorspace, always 'RGB'
   image/channels: integer, specifying the number of channels, always 3
-  image/format: string, specifying the format, always'JPEG'
+  image/format: string, specifying the format, always 'JPEG'
 
   image/filename: string containing the basename of the image file
             e.g. 'n01440764_10026.JPEG' or 'ILSVRC2012_val_00000293.JPEG'
@@ -60,7 +60,7 @@ contains the following fields:
   image/class/text: string specifying the human-readable version of the label
     e.g. 'dog'
 
-If you data set involves bounding boxes, please look at build_imagenet_data.py.
+If your data set involves bounding boxes, please look at build_imagenet_data.py.
 """
 from __future__ import absolute_import
 from __future__ import division
@@ -72,7 +72,6 @@ import random
 import sys
 import threading
 
-
 import numpy as np
 import tensorflow as tf
 
@@ -306,7 +305,7 @@ def _process_image_files(name, filenames, texts, labels, num_shards):
   spacing = np.linspace(0, len(filenames), FLAGS.num_threads + 1).astype(np.int)
   ranges = []
   for i in range(len(spacing) - 1):
-    ranges.append([spacing[i], spacing[i+1]])
+    ranges.append([spacing[i], spacing[i + 1]])
 
   # Launch a thread for each batch.
   print('Launching %d threads for spacings: %s' % (FLAGS.num_threads, ranges))
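
Note on the docstring fix in this file: with 1024 training shards and zero-based numbering, the last shard index is 1023, so the final file name is `train-01023-of-01024` rather than `train-00127-of-01024`. A minimal sketch of that naming scheme follows, assuming five-digit zero padding as the listed file names suggest. `shard_filenames` is a hypothetical helper for illustration, not a function taken from the script.

```python
def shard_filenames(name, num_shards):
    """Return zero-based, zero-padded shard file names for one data split.

    Hypothetical helper: the padding width (5) is inferred from the file
    names shown in the docstring, not read from the script itself.
    """
    return ['%s-%05d-of-%05d' % (name, shard, num_shards)
            for shard in range(num_shards)]


train_files = shard_filenames('train', 1024)
print(train_files[0])   # train-00000-of-01024
print(train_files[-1])  # train-01023-of-01024
```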

+ 4 - 5
inception/inception/data/build_imagenet_data.py

@@ -36,7 +36,7 @@ a sharded data set consisting of 1024 and 128 TFRecord files, respectively.
   train_directory/train-00000-of-01024
   train_directory/train-00001-of-01024
   ...
-  train_directory/train-00127-of-01024
+  train_directory/train-01023-of-01024
 
 and
 
@@ -54,7 +54,7 @@ serialized Example proto. The Example proto contains the following fields:
   image/width: integer, image width in pixels
   image/colorspace: string, specifying the colorspace, always 'RGB'
   image/channels: integer, specifying the number of channels, always 3
-  image/format: string, specifying the format, always'JPEG'
+  image/format: string, specifying the format, always 'JPEG'
 
   image/filename: string containing the basename of the image file
             e.g. 'n01440764_10026.JPEG' or 'ILSVRC2012_val_00000293.JPEG'
@@ -80,7 +80,7 @@ serialized Example proto. The Example proto contains the following fields:
 Note that the length of xmin is identical to the length of xmax, ymin and ymax
 for each example.
 
-Running this script using 16 threads may take around ~2.5 hours on a HP Z420.
+Running this script using 16 threads may take around ~2.5 hours on an HP Z420.
 """
 from __future__ import absolute_import
 from __future__ import division
@@ -92,7 +92,6 @@ import random
 import sys
 import threading
 
-
 import numpy as np
 import tensorflow as tf
 
@@ -435,7 +434,7 @@ def _process_image_files(name, filenames, synsets, labels, humans,
   ranges = []
   threads = []
   for i in range(len(spacing) - 1):
-    ranges.append([spacing[i], spacing[i+1]])
+    ranges.append([spacing[i], spacing[i + 1]])
 
   # Launch a thread for each batch.
   print('Launching %d threads for spacings: %s' % (FLAGS.num_threads, ranges))
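
Note on the loop touched in this hunk: it turns the `spacing` boundaries into one index range per thread, and consecutive ranges share a boundary. The sketch below reproduces that partitioning in plain Python, assuming integer arithmetic is an acceptable stand-in for `np.linspace(...).astype(np.int)`. `thread_ranges` is a hypothetical name, and in rare cases its boundaries could differ from NumPy's by floating-point rounding.

```python
def thread_ranges(num_files, num_threads):
    """Split indices 0..num_files into num_threads contiguous [start, end] pairs.

    Hypothetical helper that mirrors the idea of the diffed loop; it is not
    the script's own code.
    """
    # Evenly spaced boundaries, truncated to integers.
    spacing = [i * num_files // num_threads for i in range(num_threads + 1)]
    # Pair each boundary with the next one, as the loop in the hunk does.
    return [[spacing[i], spacing[i + 1]] for i in range(len(spacing) - 1)]


print(thread_ranges(10, 4))  # [[0, 2], [2, 5], [5, 7], [7, 10]]
```

The `np.int` alias used in the surrounding context line was removed in NumPy 1.24, so running the original script on a current NumPy would need the built-in `int` there.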

+ 1 - 1
inception/inception/data/download_and_preprocess_flowers.sh

@@ -35,7 +35,7 @@
 set -e
 
 if [ -z "$1" ]; then
-  echo "usage download_and_preprocess_flowers.sh [data dir]"
+  echo "Usage: download_and_preprocess_flowers.sh [data dir]"
   exit
 fi
 

+ 1 - 1
inception/inception/data/download_and_preprocess_flowers_mac.sh

@@ -35,7 +35,7 @@
 set -e
 
 if [ -z "$1" ]; then
-  echo "usage download_and_preprocess_flowers.sh [data dir]"
+  echo "Usage: download_and_preprocess_flowers.sh [data dir]"
   exit
 fi
 

+ 2 - 2
inception/inception/data/download_and_preprocess_imagenet.sh

@@ -49,7 +49,7 @@
 set -e
 
 if [ -z "$1" ]; then
-  echo "usage download_and_preprocess_imagenet.sh [data dir]"
+  echo "Usage: download_and_preprocess_imagenet.sh [data dir]"
   exit
 fi
 
@@ -84,7 +84,7 @@ BOUNDING_BOX_FILE="${SCRATCH_DIR}/imagenet_2012_bounding_boxes.csv"
 BOUNDING_BOX_DIR="${SCRATCH_DIR}bounding_boxes/"
 
 "${BOUNDING_BOX_SCRIPT}" "${BOUNDING_BOX_DIR}" "${LABELS_FILE}" \
- | sort >"${BOUNDING_BOX_FILE}"
+ | sort > "${BOUNDING_BOX_FILE}"
 echo "Finished downloading and preprocessing the ImageNet data."
 
 # Build the TFRecords version of the ImageNet data.

+ 1 - 1
inception/inception/data/download_imagenet.sh

@@ -24,7 +24,7 @@
 # downloading the raw images.
 #
 # usage:
-#  ./download_imagenet.sh [dirname]
+#  ./download_imagenet.sh [dir name] [synsets file]
 set -e
 
 if [ "x$IMAGENET_ACCESS_KEY" == x -o "x$IMAGENET_USERNAME" == x ]; then