|
@@ -0,0 +1,538 @@
|
|
|
+/* Copyright 2016 The TensorFlow Authors. All Rights Reserved.
|
|
|
+
|
|
|
+Licensed under the Apache License, Version 2.0 (the "License");
|
|
|
+you may not use this file except in compliance with the License.
|
|
|
+You may obtain a copy of the License at
|
|
|
+
|
|
|
+ http://www.apache.org/licenses/LICENSE-2.0
|
|
|
+
|
|
|
+Unless required by applicable law or agreed to in writing, software
|
|
|
+distributed under the License is distributed on an "AS IS" BASIS,
|
|
|
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
|
+See the License for the specific language governing permissions and
|
|
|
+limitations under the License.
|
|
|
+==============================================================================*/
|
|
|
+
|
|
|
+// OpKernels for LSTM neural networks:
|
|
|
+//
|
|
|
+// LSTM: VariableLSTMOp (VariableLSTMGradOp)
|
|
|
+//
|
|
|
+// where the ops in parentheses compute the gradients for the corresponding
+// forward ops.
|
|
|
+
|
|
|
+#define EIGEN_USE_THREADS
|
|
|
+
|
|
|
+#include <vector>
|
|
|
+#ifdef GOOGLE_INCLUDES
|
|
|
+#include "third_party/eigen3/Eigen/Core"
|
|
|
+#include "third_party/tensorflow/core/framework/op.h"
|
|
|
+#include "third_party/tensorflow/core/framework/op_kernel.h"
|
|
|
+#include "third_party/tensorflow/core/framework/tensor.h"
|
|
|
+#else
|
|
|
+#include "Eigen/Core"
|
|
|
+#include "tensorflow/core/framework/op.h"
|
|
|
+#include "tensorflow/core/framework/op_kernel.h"
|
|
|
+#include "tensorflow/core/framework/tensor.h"
|
|
|
+#endif // GOOGLE_INCLUDES
|
|
|
+
|
|
|
+namespace tensorflow {
|
|
|
+
|
|
|
+using Eigen::array;
|
|
|
+using Eigen::DenseIndex;
|
|
|
+using IndexPair = Eigen::IndexPair<int>;
|
|
|
+
|
|
|
+Status AreDimsEqual(int dim1, int dim2, const string& message) {
|
|
|
+ if (dim1 != dim2) {
|
|
|
+ return errors::InvalidArgument(message, ": ", dim1, " vs. ", dim2);
|
|
|
+ }
|
|
|
+ return Status::OK();
|
|
|
+}
|
|
|
+
|
|
|
+// ------------------------------- VariableLSTMOp -----------------------------
|
|
|
+
|
|
|
+// Kernel to compute the forward propagation of a Long Short-Term Memory
|
|
|
+// network. See the doc of the op below for more detail.
|
|
|
+class VariableLSTMOp : public OpKernel {
|
|
|
+ public:
|
|
|
+ explicit VariableLSTMOp(OpKernelConstruction* ctx) : OpKernel(ctx) {
|
|
|
+ OP_REQUIRES_OK(ctx, ctx->GetAttr("clip", &clip_));
|
|
|
+ OP_REQUIRES(
|
|
|
+ ctx, clip_ >= 0.0,
|
|
|
+ errors::InvalidArgument("clip_ needs to be equal or greator than 0"));
|
|
|
+ }
|
|
|
+
|
|
|
+ void Compute(OpKernelContext* ctx) override {
|
|
|
+ // Inputs.
|
|
|
+ const auto input = ctx->input(0).tensor<float, 4>();
|
|
|
+ const auto initial_state = ctx->input(1).tensor<float, 2>();
|
|
|
+ const auto initial_memory = ctx->input(2).tensor<float, 2>();
|
|
|
+ const auto w_m_m = ctx->input(3).tensor<float, 3>();
|
|
|
+ const int batch_size = input.dimension(0);
|
|
|
+ const int seq_len = input.dimension(1);
|
|
|
+ const int output_dim = input.dimension(3);
|
|
|
+
|
|
|
+ // Sanity checks.
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(4, input.dimension(2), "Input num"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(batch_size, initial_state.dimension(0),
|
|
|
+ "State batch"));
|
|
|
+ OP_REQUIRES_OK(
|
|
|
+ ctx, AreDimsEqual(output_dim, initial_state.dimension(1), "State dim"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(batch_size, initial_memory.dimension(0),
|
|
|
+ "Memory batch"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(output_dim, initial_memory.dimension(1),
|
|
|
+ "Memory dim"));
|
|
|
+ OP_REQUIRES_OK(
|
|
|
+ ctx, AreDimsEqual(output_dim, w_m_m.dimension(0), "Weight dim 0"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(4, w_m_m.dimension(1), "Weight dim 1"));
|
|
|
+ OP_REQUIRES_OK(
|
|
|
+ ctx, AreDimsEqual(output_dim, w_m_m.dimension(2), "Weight dim 2"));
|
|
|
+
|
|
|
+ // Outputs.
|
|
|
+ Tensor* act_tensor = nullptr;
|
|
|
+ OP_REQUIRES_OK(ctx, ctx->allocate_output(
|
|
|
+ 0, {batch_size, seq_len, output_dim}, &act_tensor));
|
|
|
+ auto act = act_tensor->tensor<float, 3>();
|
|
|
+ act.setZero();
|
|
|
+
|
|
|
+ Tensor* gate_raw_act_tensor = nullptr;
|
|
|
+ OP_REQUIRES_OK(ctx,
|
|
|
+ ctx->allocate_output(1, {batch_size, seq_len, 4, output_dim},
|
|
|
+ &gate_raw_act_tensor));
|
|
|
+ auto gate_raw_act = gate_raw_act_tensor->tensor<float, 4>();
|
|
|
+ gate_raw_act.setZero();
|
|
|
+
|
|
|
+ Tensor* memory_tensor = nullptr;
|
|
|
+ OP_REQUIRES_OK(ctx,
|
|
|
+ ctx->allocate_output(2, {batch_size, seq_len, output_dim},
|
|
|
+ &memory_tensor));
|
|
|
+ auto memory = memory_tensor->tensor<float, 3>();
|
|
|
+ memory.setZero();
|
|
|
+
|
|
|
+ // Const and scratch tensors.
|
|
|
+ Tensor ones_tensor;
|
|
|
+ OP_REQUIRES_OK(ctx, ctx->allocate_temp(DT_FLOAT, {batch_size, output_dim},
|
|
|
+ &ones_tensor));
|
|
|
+ auto ones = ones_tensor.tensor<float, 2>();
|
|
|
+ ones.setConstant(1.0);
|
|
|
+
|
|
|
+ Tensor state_tensor;
|
|
|
+ OP_REQUIRES_OK(ctx, ctx->allocate_temp(DT_FLOAT, {batch_size, output_dim},
|
|
|
+ &state_tensor));
|
|
|
+ auto state = state_tensor.tensor<float, 2>();
|
|
|
+ state = initial_state;
|
|
|
+
|
|
|
+ Tensor scratch_tensor;
|
|
|
+ OP_REQUIRES_OK(ctx,
|
|
|
+ ctx->allocate_temp(DT_FLOAT, {batch_size, 4, output_dim},
|
|
|
+ &scratch_tensor));
|
|
|
+ auto scratch = scratch_tensor.tensor<float, 3>();
|
|
|
+ scratch.setZero();
|
|
|
+
|
|
|
+ // Uses the most efficient order for the contraction depending on the batch
|
|
|
+ // size.
|
|
|
+
|
|
|
+    // This is the code shared by both cases. Implicit capture in lambda
|
|
|
+    // functions is generally discouraged, but what is captured and done here
|
|
|
+    // should be clear.
|
|
|
+ auto Forward = [&](int i) {
|
|
|
+      // Each pre-activation value is stored in the following order (see the
|
|
|
+      // doc of the op for the meaning):
|
|
|
+ //
|
|
|
+ // i: 0
|
|
|
+ // j: 1
|
|
|
+ // f: 2
|
|
|
+ // o: 3
|
|
|
+
|
|
|
+ // Adds one to the pre-activation values of the forget gate. This is a
|
|
|
+ // heuristic to make the training easier.
|
|
|
+ scratch.chip(2, 1) += ones;
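+      // Saves the raw pre-activation values (with the forget-gate bias added)
+      // so that the gradient op can recompute the gate activations.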
|
|
|
+
|
|
|
+ gate_raw_act.chip(i, 1) = scratch;
|
|
|
+
|
|
|
+ // c_t = f_t * c_{t-1} + i_t * j_t
|
|
|
+ if (i == 0) {
|
|
|
+ state = initial_memory * scratch.chip(2, 1).sigmoid();
|
|
|
+ } else {
|
|
|
+ state = memory.chip(i - 1, 1) * scratch.chip(2, 1).sigmoid();
|
|
|
+ }
|
|
|
+ state += scratch.chip(0, 1).sigmoid() * scratch.chip(1, 1).tanh();
|
|
|
+
|
|
|
+ if (clip_ > 0.0) {
|
|
|
+ // Clips the values if required.
|
|
|
+ state = state.cwiseMax(-clip_).cwiseMin(clip_);
|
|
|
+ }
|
|
|
+
|
|
|
+ memory.chip(i, 1) = state;
|
|
|
+
|
|
|
+ // h_t = o_t * tanh(c_t)
|
|
|
+ state = scratch.chip(3, 1).sigmoid() * state.tanh();
|
|
|
+
|
|
|
+ act.chip(i, 1) = state;
|
|
|
+ };
|
|
|
+ if (batch_size == 1) {
|
|
|
+      // Reshapes the weight tensor so the contraction can be computed as a
|
|
|
+      // single matrix multiplication, which is more efficient.
|
|
|
+ auto w_m_m_r =
|
|
|
+ w_m_m.reshape(array<DenseIndex, 2>{output_dim, 4 * output_dim});
|
|
|
+ // Dimensions for the contraction.
|
|
|
+ const array<IndexPair, 1> m_m_dim = {IndexPair(1, 0)};
|
|
|
+ for (int i = 0; i < seq_len; ++i) {
|
|
|
+ // Computes the pre-activation value of the input and each gate.
|
|
|
+ scratch = input.chip(i, 1) +
|
|
|
+ state.contract(w_m_m_r, m_m_dim)
|
|
|
+ .reshape(array<DenseIndex, 3>{batch_size, 4, output_dim});
|
|
|
+ Forward(i);
|
|
|
+ }
|
|
|
+ } else {
|
|
|
+ // Shuffles the dimensions of the weight tensor to be efficient when used
|
|
|
+ // in the left-hand side. Allocates memory for the shuffled tensor for
|
|
|
+ // efficiency.
|
|
|
+ Tensor w_m_m_s_tensor;
|
|
|
+ OP_REQUIRES_OK(ctx,
|
|
|
+ ctx->allocate_temp(DT_FLOAT, {output_dim * 4, output_dim},
|
|
|
+ &w_m_m_s_tensor));
|
|
|
+ auto w_m_m_s = w_m_m_s_tensor.tensor<float, 2>();
|
|
|
+ w_m_m_s = w_m_m.shuffle(array<int, 3>{2, 1, 0})
|
|
|
+ .reshape(array<DenseIndex, 2>{output_dim * 4, output_dim});
|
|
|
+ // Dimensions for the contraction.
|
|
|
+ const array<IndexPair, 1> m_m_dim = {IndexPair(1, 1)};
|
|
|
+ for (int i = 0; i < seq_len; ++i) {
|
|
|
+ // Computes the pre-activation value of the input and each gate.
|
|
|
+ scratch = input.chip(i, 1) +
|
|
|
+ w_m_m_s.contract(state, m_m_dim)
|
|
|
+ .reshape(array<DenseIndex, 3>{output_dim, 4, batch_size})
|
|
|
+ .shuffle(array<int, 3>{2, 1, 0});
|
|
|
+ Forward(i);
|
|
|
+ }
|
|
|
+ }
|
|
|
+ }
|
|
|
+
|
|
|
+ private:
|
|
|
+ // Threshold to clip the values of memory cells.
|
|
|
+ float clip_ = 0;
|
|
|
+};
|
|
|
+
|
|
|
+REGISTER_KERNEL_BUILDER(Name("VariableLSTM").Device(DEVICE_CPU),
|
|
|
+ VariableLSTMOp);
|
|
|
+REGISTER_OP("VariableLSTM")
|
|
|
+ .Attr("clip: float = 0.0")
|
|
|
+ .Input("input: float32")
|
|
|
+ .Input("initial_state: float32")
|
|
|
+ .Input("initial_memory: float32")
|
|
|
+ .Input("w_m_m: float32")
|
|
|
+ .Output("activation: float32")
|
|
|
+ .Output("gate_raw_act: float32")
|
|
|
+ .Output("memory: float32")
|
|
|
+ .Doc(R"doc(
|
|
|
+Computes the forward propagation of a Long Short-Term Memory Network.
|
|
|
+
|
|
|
+It computes the following equations recursively for `0 < t <= T`:
|
|
|
+
|
|
|
+ i_t = sigmoid(a_{i,t})
|
|
|
+ j_t = tanh(a_{j,t})
|
|
|
+ f_t = sigmoid(a_{f,t} + 1.0)
|
|
|
+ o_t = sigmoid(a_{o,t})
|
|
|
+ c_t = f_t * c_{t-1} + i_t * j_t
|
|
|
+ c'_t = min(max(c_t, -clip), clip) if clip > 0 else c_t
|
|
|
+ h_t = o_t * tanh(c'_t)
|
|
|
+
|
|
|
+where
|
|
|
+
|
|
|
+ a_{l,t} = w_{l,m,m} * h_{t-1} + x'_{l,t}
|
|
|
+
|
|
|
+where
|
|
|
+
|
|
|
+ x'_{l,t} = w_{l,m,i} * x_{t}.
|
|
|
+
|
|
|
+`input` corresponds to the concatenation of `X'_i`, `X'_j`, `X'_f`, and `X'_o`
|
|
|
+where `X'_l = (x'_{l,1}, x'_{l,2}, ..., x'_{l,T})`, `initial_state` corresponds
|
|
|
+to `h_{0}`, `initial_memory` corresponds to `c_{0}`, and `w_m_m` corresponds to
|
|
|
+`w_{l,m,m}`. `X'_l` (the transformed input) is computed outside of the op in
|
|
|
+advance, so w_{l,m,i} is not passed in to the op.
|
|
|
+
|
|
|
+`activation` corresponds to `H = (h_1, h_2, ..., h_T)`, `gate_raw_act`
|
|
|
+corresponds to the concatenation of `A_i`, `A_j`, `A_f` and `A_o`, and `memory`
|
|
|
+corresponds to `C = (c_1, c_2, ..., c_T)`.
|
|
|
+
|
|
|
+All sequences in the batch are propagated to the end and are assumed to have
|
|
|
+the same length.
|
|
|
+
|
|
|
+input: 4-D with shape `[batch_size, seq_len, 4, num_nodes]`
|
|
|
+initial_state: 2-D with shape `[batch_size, num_nodes]`
|
|
|
+initial_memory: 2-D with shape `[batch_size, num_nodes]`
|
|
|
+w_m_m: 3-D with shape `[num_nodes, 4, num_nodes]`
|
|
|
+activation: 3-D with shape `[batch_size, seq_len, num_nodes]`
|
|
|
+gate_raw_act: 4-D with shape `[batch_size, seq_len, 4, num_nodes]`
|
|
|
+memory: 3-D with shape `[batch_size, seq_len, num_nodes]`
|
|
|
+)doc");
|
|
|
+
|
|
|
+// ----------------------------- VariableLSTMGradOp ----------------------------
|
|
|
+
|
|
|
+// Kernel to compute the gradient of VariableLSTMOp.
|
|
|
+class VariableLSTMGradOp : public OpKernel {
|
|
|
+ public:
|
|
|
+ explicit VariableLSTMGradOp(OpKernelConstruction* ctx) : OpKernel(ctx) {}
|
|
|
+
|
|
|
+ void Compute(OpKernelContext* ctx) override {
|
|
|
+ // Inputs.
|
|
|
+ const auto initial_state = ctx->input(0).tensor<float, 2>();
|
|
|
+ const auto initial_memory = ctx->input(1).tensor<float, 2>();
|
|
|
+ const auto w_m_m = ctx->input(2).tensor<float, 3>();
|
|
|
+ const auto act = ctx->input(3).tensor<float, 3>();
|
|
|
+ const auto gate_raw_act = ctx->input(4).tensor<float, 4>();
|
|
|
+ const auto memory = ctx->input(5).tensor<float, 3>();
|
|
|
+ const auto act_grad = ctx->input(6).tensor<float, 3>();
|
|
|
+ const auto gate_raw_act_grad = ctx->input(7).tensor<float, 4>();
|
|
|
+ const auto memory_grad = ctx->input(8).tensor<float, 3>();
|
|
|
+ const int batch_size = act.dimension(0);
|
|
|
+ const int seq_len = act.dimension(1);
|
|
|
+ const int output_dim = act.dimension(2);
|
|
|
+
|
|
|
+ // Sanity checks.
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(batch_size, initial_state.dimension(0),
|
|
|
+ "State batch"));
|
|
|
+ OP_REQUIRES_OK(
|
|
|
+ ctx, AreDimsEqual(output_dim, initial_state.dimension(1), "State dim"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(batch_size, initial_memory.dimension(0),
|
|
|
+ "Memory batch"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(output_dim, initial_memory.dimension(1),
|
|
|
+ "Memory dim"));
|
|
|
+ OP_REQUIRES_OK(
|
|
|
+ ctx, AreDimsEqual(output_dim, w_m_m.dimension(0), "Weight dim 0"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(4, w_m_m.dimension(1), "Weight dim 1"));
|
|
|
+ OP_REQUIRES_OK(
|
|
|
+ ctx, AreDimsEqual(output_dim, w_m_m.dimension(2), "Weight dim 2"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(batch_size, gate_raw_act.dimension(0),
|
|
|
+ "Gate raw activation batch"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(seq_len, gate_raw_act.dimension(1),
|
|
|
+ "Gate raw activation len"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(4, gate_raw_act.dimension(2),
|
|
|
+ "Gate raw activation num"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(output_dim, gate_raw_act.dimension(3),
|
|
|
+ "Gate raw activation dim"));
|
|
|
+ OP_REQUIRES_OK(
|
|
|
+ ctx, AreDimsEqual(batch_size, memory.dimension(0), "Memory batch"));
|
|
|
+ OP_REQUIRES_OK(ctx,
|
|
|
+ AreDimsEqual(seq_len, memory.dimension(1), "Memory len"));
|
|
|
+ OP_REQUIRES_OK(ctx,
|
|
|
+ AreDimsEqual(output_dim, memory.dimension(2), "Memory dim"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(batch_size, act_grad.dimension(0),
|
|
|
+ "Activation gradient batch"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(seq_len, act_grad.dimension(1),
|
|
|
+ "Activation gradient len"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(output_dim, act_grad.dimension(2),
|
|
|
+ "Activation gradient dim"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(batch_size, gate_raw_act_grad.dimension(0),
|
|
|
+                                     "Gate raw activation gradient batch"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(seq_len, gate_raw_act_grad.dimension(1),
|
|
|
+                                     "Gate raw activation gradient len"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(4, gate_raw_act_grad.dimension(2),
|
|
|
+                                     "Gate raw activation gradient num"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(output_dim, gate_raw_act_grad.dimension(3),
|
|
|
+                                     "Gate raw activation gradient dim"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(batch_size, memory_grad.dimension(0),
|
|
|
+ "Memory gradient batch"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(seq_len, memory_grad.dimension(1),
|
|
|
+ "Memory gradient len"));
|
|
|
+ OP_REQUIRES_OK(ctx, AreDimsEqual(output_dim, memory_grad.dimension(2),
|
|
|
+ "Memory gradient dim"));
|
|
|
+
|
|
|
+ // Outputs.
|
|
|
+ std::vector<Tensor*> collections(4, nullptr);
|
|
|
+ OP_REQUIRES_OK(ctx,
|
|
|
+ ctx->allocate_output(0, {batch_size, seq_len, 4, output_dim},
|
|
|
+ &collections[0]));
|
|
|
+ auto input_grad = collections[0]->tensor<float, 4>();
|
|
|
+ input_grad.setZero();
|
|
|
+
|
|
|
+ OP_REQUIRES_OK(ctx, ctx->allocate_output(1, {batch_size, output_dim},
|
|
|
+ &collections[1]));
|
|
|
+ auto init_state_grad = collections[1]->tensor<float, 2>();
|
|
|
+ init_state_grad.setZero();
|
|
|
+
|
|
|
+ OP_REQUIRES_OK(ctx, ctx->allocate_output(2, {batch_size, output_dim},
|
|
|
+ &collections[2]));
|
|
|
+ auto init_memory_grad = collections[2]->tensor<float, 2>();
|
|
|
+ init_memory_grad.setZero();
|
|
|
+
|
|
|
+ OP_REQUIRES_OK(ctx, ctx->allocate_output(3, {output_dim, 4, output_dim},
|
|
|
+ &collections[3]));
|
|
|
+ auto w_m_m_grad = collections[3]->tensor<float, 3>();
|
|
|
+ w_m_m_grad.setZero();
|
|
|
+
|
|
|
+ // Const and scratch tensors.
|
|
|
+ Tensor ones_tensor;
|
|
|
+ OP_REQUIRES_OK(ctx, ctx->allocate_temp(DT_FLOAT, {batch_size, output_dim},
|
|
|
+ &ones_tensor));
|
|
|
+ auto ones = ones_tensor.tensor<float, 2>();
|
|
|
+ ones.setConstant(1.0);
|
|
|
+
|
|
|
+ Tensor scratch_tensor;
|
|
|
+ OP_REQUIRES_OK(ctx,
|
|
|
+ ctx->allocate_temp(DT_FLOAT, {batch_size, 4, output_dim},
|
|
|
+ &scratch_tensor));
|
|
|
+ auto scratch = scratch_tensor.tensor<float, 3>();
|
|
|
+ scratch.setZero();
|
|
|
+
|
|
|
+ Tensor tmp1_tensor;
|
|
|
+ OP_REQUIRES_OK(ctx, ctx->allocate_temp(DT_FLOAT, {batch_size, output_dim},
|
|
|
+ &tmp1_tensor));
|
|
|
+ auto tmp1 = tmp1_tensor.tensor<float, 2>();
|
|
|
+ tmp1.setZero();
|
|
|
+
|
|
|
+ Tensor tmp2_tensor;
|
|
|
+ OP_REQUIRES_OK(ctx, ctx->allocate_temp(DT_FLOAT, {batch_size, output_dim},
|
|
|
+ &tmp2_tensor));
|
|
|
+ auto tmp2 = tmp2_tensor.tensor<float, 2>();
|
|
|
+ tmp2.setZero();
|
|
|
+
|
|
|
+ // Uses the most efficient order for the contraction depending on the batch
|
|
|
+ // size.
|
|
|
+
|
|
|
+ // Shuffles the dimensions of the weight tensor to be efficient when used in
|
|
|
+ // the left-hand side. Allocates memory for the shuffled tensor for
|
|
|
+ // efficiency.
|
|
|
+ Tensor w_m_m_s_tensor;
|
|
|
+ OP_REQUIRES_OK(ctx,
|
|
|
+ ctx->allocate_temp(DT_FLOAT, {4, output_dim, output_dim},
|
|
|
+ &w_m_m_s_tensor));
|
|
|
+ auto w_m_m_s = w_m_m_s_tensor.tensor<float, 3>();
|
|
|
+ if (batch_size == 1) {
|
|
|
+      // Fills the shuffled weight tensor only when it is actually used.
|
|
|
+ w_m_m_s = w_m_m.shuffle(array<int, 3>{1, 2, 0});
|
|
|
+ }
|
|
|
+
|
|
|
+ // Dimensions for the contraction with the weight tensor.
|
|
|
+ const array<IndexPair, 1> m_m_dim =
|
|
|
+ batch_size == 1 ? array<IndexPair, 1>{IndexPair(1, 0)}
|
|
|
+ : array<IndexPair, 1>{IndexPair(1, 1)};
|
|
|
+ // Dimensions for the contraction of the batch dimensions.
|
|
|
+ const array<IndexPair, 1> b_b_dim = {IndexPair(0, 0)};
|
|
|
+ for (int i = seq_len - 1; i >= 0; --i) {
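+      // For i < seq_len - 1, 'scratch' holds the gradients with respect to
+      // the pre-activation gate values of step i + 1, filled at the end of
+      // the previous iteration. It is used to back-propagate into h_i and to
+      // accumulate the weight gradient.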
|
|
|
+ if (i == seq_len - 1) {
|
|
|
+ init_state_grad = act_grad.chip(i, 1);
|
|
|
+ } else {
|
|
|
+ w_m_m_grad +=
|
|
|
+ act.chip(i, 1)
|
|
|
+ .contract(scratch.reshape(
|
|
|
+ array<DenseIndex, 2>{batch_size, 4 * output_dim}),
|
|
|
+ b_b_dim)
|
|
|
+ .reshape(array<DenseIndex, 3>{output_dim, 4, output_dim});
|
|
|
+ if (batch_size == 1) {
|
|
|
+ init_state_grad.device(ctx->eigen_cpu_device()) =
|
|
|
+ scratch.chip(0, 1).contract(w_m_m_s.chip(0, 0), m_m_dim) +
|
|
|
+ scratch.chip(1, 1).contract(w_m_m_s.chip(1, 0), m_m_dim) +
|
|
|
+ scratch.chip(2, 1).contract(w_m_m_s.chip(2, 0), m_m_dim) +
|
|
|
+ scratch.chip(3, 1).contract(w_m_m_s.chip(3, 0), m_m_dim);
|
|
|
+ } else {
|
|
|
+ init_state_grad.device(ctx->eigen_cpu_device()) =
|
|
|
+ (w_m_m.chip(0, 1).contract(scratch.chip(0, 1), m_m_dim) +
|
|
|
+ w_m_m.chip(1, 1).contract(scratch.chip(1, 1), m_m_dim) +
|
|
|
+ w_m_m.chip(2, 1).contract(scratch.chip(2, 1), m_m_dim) +
|
|
|
+ w_m_m.chip(3, 1).contract(scratch.chip(3, 1), m_m_dim))
|
|
|
+ .shuffle(array<int, 2>{1, 0});
|
|
|
+ }
|
|
|
+ init_state_grad += act_grad.chip(i, 1);
|
|
|
+ }
|
|
|
+
|
|
|
+ auto gate_raw_act_t = gate_raw_act.chip(i, 1);
|
|
|
+ auto gate_raw_act_grad_t = gate_raw_act_grad.chip(i, 1);
|
|
|
+
|
|
|
+ // Output gate.
|
|
|
+ tmp1 = memory.chip(i, 1);
|
|
|
+      tmp1 = tmp1.tanh();  // tanh(c_t)
|
|
|
+ tmp2 = gate_raw_act_t.chip(3, 1).sigmoid(); // o_t
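+      // d h_t / d a_{o,t} = tanh(c_t) * o_t * (1 - o_t)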
|
|
|
+ scratch.chip(3, 1) = init_state_grad * tmp1 * tmp2 * (ones - tmp2) +
|
|
|
+ gate_raw_act_grad_t.chip(3, 1);
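+      // d h_t / d c_t = o_t * (1 - tanh(c_t)^2), plus the incoming gradient
+      // on the memory output.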
|
|
|
+
|
|
|
+ init_memory_grad += init_state_grad * tmp2 * (ones - tmp1.square()) +
|
|
|
+ memory_grad.chip(i, 1);
|
|
|
+
|
|
|
+ // Input gate.
|
|
|
+ tmp1 = gate_raw_act_t.chip(0, 1).sigmoid(); // i_t
|
|
|
+ tmp2 = gate_raw_act_t.chip(1, 1);
|
|
|
+ tmp2 = tmp2.tanh(); // j_t
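+      // d c_t / d a_{i,t} = j_t * i_t * (1 - i_t)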
|
|
|
+ scratch.chip(0, 1) = init_memory_grad * tmp2 * tmp1 * (ones - tmp1) +
|
|
|
+ gate_raw_act_grad_t.chip(0, 1);
|
|
|
+
|
|
|
+ // Input.
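+      // d c_t / d a_{j,t} = i_t * (1 - j_t^2)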
|
|
|
+ scratch.chip(1, 1) = init_memory_grad * tmp1 * (ones - tmp2.square()) +
|
|
|
+ gate_raw_act_grad_t.chip(1, 1);
|
|
|
+
|
|
|
+ // Forget gate.
|
|
|
+ tmp1 = gate_raw_act_t.chip(2, 1).sigmoid(); // f_t
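+      // d c_t / d a_{f,t} = c_{t-1} * f_t * (1 - f_t)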
|
|
|
+ if (i == 0) {
|
|
|
+ scratch.chip(2, 1) =
|
|
|
+ init_memory_grad * initial_memory * tmp1 * (ones - tmp1) +
|
|
|
+ gate_raw_act_grad_t.chip(2, 1);
|
|
|
+ } else {
|
|
|
+ scratch.chip(2, 1) =
|
|
|
+ init_memory_grad * memory.chip(i - 1, 1) * tmp1 * (ones - tmp1) +
|
|
|
+ gate_raw_act_grad_t.chip(2, 1);
|
|
|
+ }
|
|
|
+
|
|
|
+ // Memory.
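+      // d c_t / d c_{t-1} = f_t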
|
|
|
+ init_memory_grad *= tmp1;
|
|
|
+
|
|
|
+ input_grad.chip(i, 1) = scratch;
|
|
|
+ }
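+    // Accumulates the contribution of the initial state h_0 to the weight
+    // gradient and back-propagates the pre-activation gradients of the first
+    // time step into the initial state.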
|
|
|
+ w_m_m_grad += initial_state
|
|
|
+ .contract(scratch.reshape(array<DenseIndex, 2>{
|
|
|
+ batch_size, 4 * output_dim}),
|
|
|
+ b_b_dim)
|
|
|
+ .reshape(array<DenseIndex, 3>{output_dim, 4, output_dim});
|
|
|
+ if (batch_size == 1) {
|
|
|
+ init_state_grad.device(ctx->eigen_cpu_device()) =
|
|
|
+ (scratch.chip(0, 1).contract(w_m_m_s.chip(0, 0), m_m_dim) +
|
|
|
+ scratch.chip(1, 1).contract(w_m_m_s.chip(1, 0), m_m_dim) +
|
|
|
+ scratch.chip(2, 1).contract(w_m_m_s.chip(2, 0), m_m_dim) +
|
|
|
+ scratch.chip(3, 1).contract(w_m_m_s.chip(3, 0), m_m_dim));
|
|
|
+ } else {
|
|
|
+ init_state_grad.device(ctx->eigen_cpu_device()) =
|
|
|
+ (w_m_m.chip(0, 1).contract(scratch.chip(0, 1), m_m_dim) +
|
|
|
+ w_m_m.chip(1, 1).contract(scratch.chip(1, 1), m_m_dim) +
|
|
|
+ w_m_m.chip(2, 1).contract(scratch.chip(2, 1), m_m_dim) +
|
|
|
+ w_m_m.chip(3, 1).contract(scratch.chip(3, 1), m_m_dim))
|
|
|
+ .shuffle(array<int, 2>{1, 0});
|
|
|
+ }
|
|
|
+ }
|
|
|
+};
|
|
|
+
|
|
|
+REGISTER_KERNEL_BUILDER(Name("VariableLSTMGrad").Device(DEVICE_CPU),
|
|
|
+ VariableLSTMGradOp);
|
|
|
+
|
|
|
+REGISTER_OP("VariableLSTMGrad")
|
|
|
+ .Input("initial_state: float32")
|
|
|
+ .Input("initial_memory: float32")
|
|
|
+ .Input("w_m_m: float32")
|
|
|
+ .Input("activation: float32")
|
|
|
+ .Input("gate_raw_act: float32")
|
|
|
+ .Input("memory: float32")
|
|
|
+ .Input("act_grad: float32")
|
|
|
+ .Input("gate_raw_act_grad: float32")
|
|
|
+ .Input("memory_grad: float32")
|
|
|
+ .Output("input_grad: float32")
|
|
|
+ .Output("initial_state_grad: float32")
|
|
|
+ .Output("initial_memory_grad: float32")
|
|
|
+ .Output("w_m_m_grad: float32")
|
|
|
+ .Doc(R"doc(
|
|
|
+Computes the gradient for VariableLSTM.
|
|
|
+
|
|
|
+This is to be used in conjunction with VariableLSTM. It ignores the clipping
|
|
|
+used in the forward pass.
|
|
|
+
|
|
|
+initial_state: 2-D with shape `[batch_size, num_nodes]`
|
|
|
+initial_memory: 2-D with shape `[batch_size, num_nodes]`
|
|
|
+w_m_m: 3-D with shape `[num_nodes, 4, num_nodes]`
|
|
|
+activation: 3-D with shape `[batch_size, seq_len, num_nodes]`
|
|
|
+gate_raw_act: 4-D with shape `[batch_size, seq_len, 4, num_nodes]`
|
|
|
+memory: 3-D with shape `[batch_size, seq_len, num_nodes]`
|
|
|
+act_grad: 3-D with shape `[batch_size, seq_len, num_nodes]`
|
|
|
+gate_raw_act_grad: 4-D with shape `[batch_size, seq_len, 4, num_nodes]`
|
|
|
+memory_grad: 3-D with shape `[batch_size, seq_len, num_nodes]`
|
|
|
+input_grad: 4-D with shape `[batch_size, seq_len, 4, num_nodes]`
|
|
|
+initial_state_grad: 2-D with shape `[batch_size, num_nodes]`
|
|
|
+initial_memory_grad: 2-D with shape `[batch_size, num_nodes]`
|
|
|
+w_m_m_grad: 3-D with shape `[num_nodes, 4, num_nodes]`
|
|
|
+)doc");
|
|
|
+
|
|
|
+} // namespace tensorflow
|