
RNN and Cells (contrib)

[TOC]

Module for constructing RNN Cells and additional RNN operations.

Base interface for all RNN Cells

`class tf.contrib.rnn.RNNCell` {#RNNCell}

Abstract object representing an RNN cell.

The definition of cell in this package differs from the definition used in the literature. In the literature, cell refers to an object with a single scalar output. The definition in this package refers to a horizontal array of such units.

An RNN cell, in the most abstract setting, is anything that has a state and performs some operation that takes a matrix of inputs. This operation results in an output matrix with self.output_size columns. If self.state_size is an integer, this operation also results in a new state matrix with self.state_size columns. If self.state_size is a tuple of integers, then it results in a tuple of len(state_size) state matrices, each with a column size corresponding to values in state_size.
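The tuple-state case of this contract can be illustrated with a small NumPy sketch. It is not TensorFlow code: `ToyTupleCell`, its weights, and the sizes used are hypothetical and chosen only to show the shapes involved.

```python
import numpy as np


class ToyTupleCell:
    """Hypothetical cell with a tuple state; illustrates the contract only."""

    state_size = (4, 2)   # two state matrices with 4 and 2 columns
    output_size = 4

    def __init__(self, input_size, seed=0):
        rng = np.random.default_rng(seed)
        self._w_a = rng.standard_normal((input_size + 4, 4)) * 0.1
        self._w_b = rng.standard_normal((input_size + 2, 2)) * 0.1

    def __call__(self, inputs, state):
        a, b = state  # shapes [batch_size, 4] and [batch_size, 2]
        new_a = np.tanh(np.concatenate([inputs, a], axis=1) @ self._w_a)
        new_b = np.tanh(np.concatenate([inputs, b], axis=1) @ self._w_b)
        return new_a, (new_a, new_b)  # (output, new state tuple)


cell = ToyTupleCell(input_size=3)
batch_size = 5
inputs = np.ones((batch_size, 3))
state = tuple(np.zeros((batch_size, s)) for s in cell.state_size)
output, new_state = cell(inputs, state)
```

The output has `output_size` columns, and the new state is a tuple with one matrix per entry of `state_size`.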

This module provides a number of basic, commonly used RNN cells, such as LSTM (Long Short Term Memory) or GRU (Gated Recurrent Unit), and a number of operators that allow adding dropout, projections, or embeddings for inputs. Constructing multi-layer cells is supported by the class MultiRNNCell, or by calling the rnn ops several times. Every RNNCell must have the properties below and implement `__call__` with the following signature.

`tf.contrib.rnn.RNNCell.__call__(inputs, state, scope=None)` {#RNNCell.call}

Run this RNN cell on inputs, starting from the given state.

Args:

inputs: 2-D tensor with shape [batch_size x input_size].
state: if self.state_size is an integer, this should be a 2-D Tensor with shape [batch_size x self.state_size]. Otherwise, if self.state_size is a tuple of integers, this should be a tuple with shapes [batch_size x s] for s in self.state_size.
scope: VariableScope for the created subgraph; defaults to class name.

Returns:

A pair containing:

Output: A 2-D tensor with shape [batch_size x self.output_size].
New state: Either a single 2-D tensor, or a tuple of tensors matching the arity and shapes of state.

`tf.contrib.rnn.RNNCell.output_size` {#RNNCell.output_size}

Integer or TensorShape: size of outputs produced by this cell.

`tf.contrib.rnn.RNNCell.state_size` {#RNNCell.state_size}

size(s) of state(s) used by this cell.

It can be represented by an Integer, a TensorShape or a tuple of Integers or TensorShapes.

`tf.contrib.rnn.RNNCell.zero_state(batch_size, dtype)` {#RNNCell.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
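A minimal NumPy sketch of this behavior for integer sizes follows. The helper `zero_state` below is a hypothetical stand-in, not the library method, and it does not handle TensorShape entries.

```python
import numpy as np


def zero_state(state_size, batch_size, dtype=np.float32):
    """Return zeros matching state_size: an int or a nested tuple/list of ints."""
    if isinstance(state_size, (list, tuple)):
        # Recurse so the result has the same nesting as state_size.
        return type(state_size)(zero_state(s, batch_size, dtype) for s in state_size)
    return np.zeros((batch_size, state_size), dtype=dtype)


single = zero_state(8, batch_size=3)
nested = zero_state((8, (4, 2)), batch_size=3)
```

Here `single` is one `[3 x 8]` matrix, while `nested` mirrors the structure `(8, (4, 2))` with a zero matrix at every leaf.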

RNN Cells for use with TensorFlow's core RNN methods

`class tf.contrib.rnn.BasicRNNCell` {#BasicRNNCell}

The most basic RNN cell.

`tf.contrib.rnn.BasicRNNCell.__call__(inputs, state, scope=None)` {#BasicRNNCell.call}

Most basic RNN: output = new_state = act(W * input + U * state + B).
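The update can be written out as a short NumPy sketch. It assumes the batch-major convention used in this document (rows are batch entries), so the products appear as `inputs @ W` and `state @ U`. The weight values and sizes are arbitrary.

```python
import numpy as np


def basic_rnn_step(inputs, state, W, U, B, act=np.tanh):
    """One step: output = new_state = act(inputs @ W + state @ U + B)."""
    new_state = act(inputs @ W + state @ U + B)
    return new_state, new_state


rng = np.random.default_rng(0)
batch_size, input_size, num_units = 2, 3, 4
W = rng.standard_normal((input_size, num_units))
U = rng.standard_normal((num_units, num_units))
B = np.zeros(num_units)
output, new_state = basic_rnn_step(
    np.ones((batch_size, input_size)), np.zeros((batch_size, num_units)), W, U, B)
```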

`tf.contrib.rnn.BasicRNNCell.__init__(num_units, input_size=None, activation=tanh)` {#BasicRNNCell.init}

`tf.contrib.rnn.BasicRNNCell.output_size` {#BasicRNNCell.output_size}

`tf.contrib.rnn.BasicRNNCell.state_size` {#BasicRNNCell.state_size}

`tf.contrib.rnn.BasicRNNCell.zero_state(batch_size, dtype)` {#BasicRNNCell.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

`class tf.contrib.rnn.BasicLSTMCell` {#BasicLSTMCell}

Basic LSTM recurrent network cell.

The implementation is based on: http://arxiv.org/abs/1409.2329.

We add forget_bias (default: 1) to the biases of the forget gate in order to reduce the scale of forgetting at the beginning of training.

It does not allow cell clipping or a projection layer, and it does not use peep-hole connections: it is the basic baseline.

For advanced models, please use the full LSTMCell that follows.
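The effect of forget_bias can be seen in a NumPy sketch of one LSTM step. This is an illustration under stated assumptions, not the library code: the gate order `i, j, f, o` inside the fused weight matrix and the helper names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def basic_lstm_step(inputs, state, W, b, forget_bias=1.0):
    """One LSTM step; state is a (c, h) pair. Gate order i, j, f, o is assumed."""
    c, h = state
    gates = np.concatenate([inputs, h], axis=1) @ W + b
    i, j, f, o = np.split(gates, 4, axis=1)
    # forget_bias shifts the forget gate towards 1 (remember) early in training.
    new_c = c * sigmoid(f + forget_bias) + sigmoid(i) * np.tanh(j)
    new_h = np.tanh(new_c) * sigmoid(o)
    return new_h, (new_c, new_h)


batch_size, input_size, num_units = 2, 3, 4
W = np.zeros((input_size + num_units, 4 * num_units))  # zero weights for clarity
b = np.zeros(4 * num_units)
x = np.ones((batch_size, input_size))
c0 = np.ones((batch_size, num_units))
h0 = np.zeros((batch_size, num_units))
_, (c_biased, _) = basic_lstm_step(x, (c0, h0), W, b)
_, (c_plain, _) = basic_lstm_step(x, (c0, h0), W, b, forget_bias=0.0)
```

With all weights at zero, the fraction of the old cell state that survives is `sigmoid(forget_bias)`: roughly 0.73 for the default of 1, versus 0.5 when the bias is 0.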

`tf.contrib.rnn.BasicLSTMCell.__call__(inputs, state, scope=None)` {#BasicLSTMCell.call}

Long short-term memory cell (LSTM).

`tf.contrib.rnn.BasicLSTMCell.__init__(num_units, forget_bias=1.0, input_size=None, state_is_tuple=True, activation=tanh)` {#BasicLSTMCell.init}

Initialize the basic LSTM cell.

Args:

num_units: int, The number of units in the LSTM cell.
forget_bias: float, The bias added to forget gates (see above).
input_size: Deprecated and unused.
state_is_tuple: If True, accepted and returned states are 2-tuples of the c_state and m_state. If False, they are concatenated along the column axis. The latter behavior will soon be deprecated.
activation: Activation function of the inner states.

`tf.contrib.rnn.BasicLSTMCell.output_size` {#BasicLSTMCell.output_size}

`tf.contrib.rnn.BasicLSTMCell.state_size` {#BasicLSTMCell.state_size}

`tf.contrib.rnn.BasicLSTMCell.zero_state(batch_size, dtype)` {#BasicLSTMCell.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

`class tf.contrib.rnn.GRUCell` {#GRUCell}

Gated Recurrent Unit cell (cf. http://arxiv.org/abs/1406.1078).

`tf.contrib.rnn.GRUCell.__call__(inputs, state, scope=None)` {#GRUCell.call}

Gated recurrent unit (GRU) with num_units cells.

`tf.contrib.rnn.GRUCell.__init__(num_units, input_size=None, activation=tanh)` {#GRUCell.init}

`tf.contrib.rnn.GRUCell.output_size` {#GRUCell.output_size}

`tf.contrib.rnn.GRUCell.state_size` {#GRUCell.state_size}

`tf.contrib.rnn.GRUCell.zero_state(batch_size, dtype)` {#GRUCell.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

`class tf.contrib.rnn.LSTMCell` {#LSTMCell}

Long short-term memory unit (LSTM) recurrent network cell.

The default non-peephole implementation is based on:

http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf

S. Hochreiter and J. Schmidhuber. "Long Short-Term Memory". Neural Computation, 9(8):1735-1780, 1997.

The peephole implementation is based on:

https://research.google.com/pubs/archive/43905.pdf

Hasim Sak, Andrew Senior, and Francoise Beaufays. "Long short-term memory recurrent neural network architectures for large scale acoustic modeling." INTERSPEECH, 2014.

The class uses optional peep-hole connections, optional cell clipping, and an optional projection layer.

`tf.contrib.rnn.LSTMCell.__call__(inputs, state, scope=None)` {#LSTMCell.call}

Run one step of LSTM.

Args:

inputs: input Tensor, 2-D, batch x input_size.
state: if state_is_tuple is False, this must be a state Tensor, 2-D, batch x state_size. If state_is_tuple is True, this must be a tuple of state Tensors, both 2-D, with column sizes c_state and m_state.
scope: VariableScope for the created subgraph; defaults to "lstm_cell".

Returns:

A tuple containing:

A 2-D, [batch x output_dim], Tensor representing the output of the LSTM after reading inputs when previous state was state. Here output_dim is: num_proj if num_proj was set, num_units otherwise.
Tensor(s) representing the new state of LSTM after reading inputs when the previous state was state. Same type and shape(s) as state.

Raises:

ValueError: If input size cannot be inferred from inputs via static shape inference.

`tf.contrib.rnn.LSTMCell.__init__(num_units, input_size=None, use_peepholes=False, cell_clip=None, initializer=None, num_proj=None, proj_clip=None, num_unit_shards=None, num_proj_shards=None, forget_bias=1.0, state_is_tuple=True, activation=tanh)` {#LSTMCell.init}

Initialize the parameters for an LSTM cell.

Args:

num_units: int, The number of units in the LSTM cell
input_size: Deprecated and unused.
use_peepholes: bool, set True to enable diagonal/peephole connections.
cell_clip: (optional) A float value, if provided the cell state is clipped by this value prior to the cell output activation.
initializer: (optional) The initializer to use for the weight and projection matrices.
num_proj: (optional) int, The output dimensionality for the projection matrices. If None, no projection is performed.
proj_clip: (optional) A float value. If num_proj > 0 and proj_clip is provided, then the projected values are clipped elementwise to within [-proj_clip, proj_clip].
num_unit_shards: Deprecated, will be removed by Jan. 2017. Use a variable_scope partitioner instead.
num_proj_shards: Deprecated, will be removed by Jan. 2017. Use a variable_scope partitioner instead.
forget_bias: Biases of the forget gate are initialized by default to 1 in order to reduce the scale of forgetting at the beginning of the training.
state_is_tuple: If True, accepted and returned states are 2-tuples of the c_state and m_state. If False, they are concatenated along the column axis. This latter behavior will soon be deprecated.
activation: Activation function of the inner states.
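How cell_clip, num_proj, and proj_clip interact can be sketched in NumPy. This is a simplified illustration rather than the library implementation: `clip_and_project`, `gate_o` (the already-activated output gate), and `W_proj` are hypothetical names, and `tanh` stands in for the configured activation.

```python
import numpy as np


def clip_and_project(new_c, gate_o, W_proj=None, cell_clip=None, proj_clip=None):
    """Apply optional cell clipping, then the optional output projection."""
    if cell_clip is not None:
        # Clip the cell state before the cell output activation.
        new_c = np.clip(new_c, -cell_clip, cell_clip)
    m = gate_o * np.tanh(new_c)
    if W_proj is not None:
        m = m @ W_proj  # [batch, num_units] -> [batch, num_proj]
        if proj_clip is not None:
            m = np.clip(m, -proj_clip, proj_clip)
    return new_c, m


new_c = np.array([[5.0, -5.0, 0.5, 0.0]])
gate_o = np.ones((1, 4))
W_proj = np.ones((4, 2))  # num_units=4, num_proj=2
c, m = clip_and_project(new_c, gate_o, W_proj, cell_clip=3.0, proj_clip=0.25)
```

The cell state is limited to `[-3, 3]`, and the projected output has `num_proj` columns with every element inside `[-0.25, 0.25]`.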

`tf.contrib.rnn.LSTMCell.output_size` {#LSTMCell.output_size}

`tf.contrib.rnn.LSTMCell.state_size` {#LSTMCell.state_size}

`tf.contrib.rnn.LSTMCell.zero_state(batch_size, dtype)` {#LSTMCell.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

Classes storing split `RNNCell` state

`class tf.contrib.rnn.LSTMStateTuple` {#LSTMStateTuple}

Tuple used by LSTM Cells for state_size, zero_state, and output state.

Stores two elements: (c, h), in that order.

Only used when state_is_tuple=True.
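The field order matters when a state is unpacked. The sketch below uses a plain `collections.namedtuple` as a hypothetical stand-in (`StateTuple`) for the class, with Python lists in place of tensors.

```python
import collections

# Hypothetical stand-in for LSTMStateTuple: a namedtuple with fields c and h.
StateTuple = collections.namedtuple("StateTuple", ("c", "h"))

state = StateTuple(c=[[0.0, 0.0]], h=[[1.0, 1.0]])
c, h = state  # unpacks in (c, h) order
```

Positional access follows the same order: `state[0]` is the `c` field and `state[1]` is the `h` field.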

`tf.contrib.rnn.LSTMStateTuple.__getnewargs__()` {#LSTMStateTuple.getnewargs}

Return self as a plain tuple. Used by copy and pickle.

`tf.contrib.rnn.LSTMStateTuple.__getstate__()` {#LSTMStateTuple.getstate}

Exclude the OrderedDict from pickling

`tf.contrib.rnn.LSTMStateTuple.__new__(_cls, c, h)` {#LSTMStateTuple.new}

Create new instance of LSTMStateTuple(c, h)

`tf.contrib.rnn.LSTMStateTuple.__repr__()` {#LSTMStateTuple.repr}

Return a nicely formatted representation string

`tf.contrib.rnn.LSTMStateTuple.c` {#LSTMStateTuple.c}

Alias for field number 0

`tf.contrib.rnn.LSTMStateTuple.dtype` {#LSTMStateTuple.dtype}

`tf.contrib.rnn.LSTMStateTuple.h` {#LSTMStateTuple.h}

Alias for field number 1

RNN Cell wrappers (RNNCells that wrap other RNNCells)

`class tf.contrib.rnn.MultiRNNCell` {#MultiRNNCell}

RNN cell composed sequentially of multiple simple cells.

`tf.contrib.rnn.MultiRNNCell.__call__(inputs, state, scope=None)` {#MultiRNNCell.call}

Run this multi-layer cell on inputs, starting from state.

`tf.contrib.rnn.MultiRNNCell.__init__(cells, state_is_tuple=True)` {#MultiRNNCell.init}

Create a RNN cell composed sequentially of a number of RNNCells.

Args:

cells: list of RNNCells that will be composed in this order.
state_is_tuple: If True, accepted and returned states are n-tuples, where n = len(cells). If False, the states are all concatenated along the column axis. This latter behavior will soon be deprecated.

Raises:

ValueError: if cells is empty (not allowed), or at least one of the cells returns a state tuple but the flag state_is_tuple is False.
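Sequential composition with an n-tuple state can be sketched as follows. The helpers `multi_cell_step` and `make_toy_cell` are hypothetical and written in NumPy; they only illustrate how each layer's output feeds the next while every layer keeps its own state.

```python
import numpy as np


def multi_cell_step(cells, inputs, states):
    """Run cells in order; each cell's output feeds the next. states is an n-tuple."""
    new_states = []
    current = inputs
    for cell, state in zip(cells, states):
        current, new_state = cell(current, state)
        new_states.append(new_state)
    return current, tuple(new_states)


def make_toy_cell(input_size, num_units, seed):
    """Hypothetical tanh cell used only to exercise the stacking logic."""
    w = np.random.default_rng(seed).standard_normal((input_size + num_units, num_units))

    def cell(inputs, state):
        new_state = np.tanh(np.concatenate([inputs, state], axis=1) @ w)
        return new_state, new_state

    return cell


# Layer 1 maps 3 -> 5 features, layer 2 maps 5 -> 2 features.
cells = [make_toy_cell(3, 5, seed=0), make_toy_cell(5, 2, seed=1)]
states = (np.zeros((4, 5)), np.zeros((4, 2)))
output, new_states = multi_cell_step(cells, np.ones((4, 3)), states)
```

The final output has the width of the last cell, and `new_states` holds one entry per cell, in the same order as `cells`.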

`tf.contrib.rnn.MultiRNNCell.output_size` {#MultiRNNCell.output_size}

`tf.contrib.rnn.MultiRNNCell.state_size` {#MultiRNNCell.state_size}

`tf.contrib.rnn.MultiRNNCell.zero_state(batch_size, dtype)` {#MultiRNNCell.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

`class tf.contrib.rnn.DropoutWrapper` {#DropoutWrapper}

Operator adding dropout to inputs and outputs of the given cell.

`tf.contrib.rnn.DropoutWrapper.__call__(inputs, state, scope=None)` {#DropoutWrapper.call}

Run the cell with the declared dropouts.

`tf.contrib.rnn.DropoutWrapper.__init__(cell, input_keep_prob=1.0, output_keep_prob=1.0, seed=None)` {#DropoutWrapper.init}

Create a cell with added input and/or output dropout.

Dropout is never used on the state.

Args:

cell: an RNNCell; dropout is added to its inputs and/or outputs.
input_keep_prob: unit Tensor or float between 0 and 1, input keep probability; if it is float and 1, no input dropout will be added.
output_keep_prob: unit Tensor or float between 0 and 1, output keep probability; if it is float and 1, no output dropout will be added.
seed: (optional) integer, the randomness seed.

Raises:

TypeError: if cell is not an RNNCell.
ValueError: if input_keep_prob or output_keep_prob is not between 0 and 1.
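The wrapper's behavior can be approximated in NumPy as below. This sketch assumes inverted dropout (kept values are rescaled by `1 / keep_prob`, as `tf.nn.dropout` does); `dropout_wrapped_step` and `identity_cell` are hypothetical helpers.

```python
import numpy as np


def dropout(x, keep_prob, rng):
    """Keep each element with probability keep_prob and rescale by 1 / keep_prob."""
    if keep_prob == 1.0:
        return x  # no dropout is added
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob


def dropout_wrapped_step(cell, inputs, state, input_keep_prob, output_keep_prob, rng):
    """Apply dropout to the inputs and outputs of cell, never to the state."""
    output, new_state = cell(dropout(inputs, input_keep_prob, rng), state)
    return dropout(output, output_keep_prob, rng), new_state


def identity_cell(inputs, state):
    """Hypothetical cell that passes inputs through and leaves state unchanged."""
    return inputs, state


rng = np.random.default_rng(0)
x = np.ones((2, 1000))
state = np.zeros((2, 4))
out, new_state = dropout_wrapped_step(identity_cell, x, state, 0.5, 1.0, rng)
```

With `input_keep_prob=0.5`, each output element is either 0 or 2, so the expected value of the input is preserved, and the state passes through untouched.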

`tf.contrib.rnn.DropoutWrapper.output_size` {#DropoutWrapper.output_size}

`tf.contrib.rnn.DropoutWrapper.state_size` {#DropoutWrapper.state_size}

`tf.contrib.rnn.DropoutWrapper.zero_state(batch_size, dtype)` {#DropoutWrapper.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

`class tf.contrib.rnn.EmbeddingWrapper` {#EmbeddingWrapper}

Operator adding input embedding to the given cell.

Note: in many cases it may be more efficient to not use this wrapper, but instead concatenate the whole sequence of your inputs in time, do the embedding on this batch-concatenated sequence, then split it and feed into your RNN.
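The alternative described in this note can be sketched in NumPy: one lookup over the whole flattened sequence, followed by a split into per-step inputs. The embedding table, the symbol ids, and the sizes are made up for illustration.

```python
import numpy as np

embedding_classes, embedding_size = 10, 4
rng = np.random.default_rng(0)
embedding = rng.standard_normal((embedding_classes, embedding_size))

# ids: [time_len, batch_size] integer symbols for the whole sequence.
ids = np.array([[1, 2], [3, 4], [5, 6]])
time_len, batch_size = ids.shape

# One lookup over the flattened sequence instead of one lookup per time step.
embedded = embedding[ids.reshape(-1)]            # [time_len * batch_size, embedding_size]
per_step = np.split(embedded, time_len, axis=0)  # time_len arrays of [batch_size, embedding_size]
```

Each element of `per_step` can then be fed to the unwrapped cell as the input for one time step.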

`tf.contrib.rnn.EmbeddingWrapper.__call__(inputs, state, scope=None)` {#EmbeddingWrapper.call}

Run the cell on embedded inputs.

`tf.contrib.rnn.EmbeddingWrapper.__init__(cell, embedding_classes, embedding_size, initializer=None)` {#EmbeddingWrapper.init}

Create a cell with an added input embedding.

Args:

cell: an RNNCell, an embedding will be put before its inputs.
embedding_classes: integer, how many symbols will be embedded.
embedding_size: integer, the size of the vectors we embed into.
initializer: an initializer to use when creating the embedding; if None, the initializer from variable scope or a default one is used.

Raises:

TypeError: if cell is not an RNNCell.
ValueError: if embedding_classes is not positive.

`tf.contrib.rnn.EmbeddingWrapper.output_size` {#EmbeddingWrapper.output_size}

`tf.contrib.rnn.EmbeddingWrapper.state_size` {#EmbeddingWrapper.state_size}

`tf.contrib.rnn.EmbeddingWrapper.zero_state(batch_size, dtype)` {#EmbeddingWrapper.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

`class tf.contrib.rnn.InputProjectionWrapper` {#InputProjectionWrapper}

Operator adding an input projection to the given cell.

Note: in many cases it may be more efficient to not use this wrapper, but instead concatenate the whole sequence of your inputs in time, do the projection on this batch-concatenated sequence, then split it.

`tf.contrib.rnn.InputProjectionWrapper.__call__(inputs, state, scope=None)` {#InputProjectionWrapper.call}

Run the input projection and then the cell.

`tf.contrib.rnn.InputProjectionWrapper.__init__(cell, num_proj, input_size=None)` {#InputProjectionWrapper.init}

Create a cell with input projection.

Args:

cell: an RNNCell, a projection of inputs is added before it.
num_proj: Python integer. The dimension to project to.
input_size: Deprecated and unused.

Raises:

TypeError: if cell is not an RNNCell.

`tf.contrib.rnn.InputProjectionWrapper.output_size` {#InputProjectionWrapper.output_size}

`tf.contrib.rnn.InputProjectionWrapper.state_size` {#InputProjectionWrapper.state_size}

`tf.contrib.rnn.InputProjectionWrapper.zero_state(batch_size, dtype)` {#InputProjectionWrapper.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

`class tf.contrib.rnn.OutputProjectionWrapper` {#OutputProjectionWrapper}

Operator adding an output projection to the given cell.

Note: in many cases it may be more efficient to not use this wrapper, but instead concatenate the whole sequence of your outputs in time, do the projection on this batch-concatenated sequence, then split it if needed or directly feed into a softmax.
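The equivalence behind this note can be checked with a small NumPy sketch: projecting every step inside the time loop gives the same result as one projection over the batch-concatenated outputs. The weights `W`, bias `b`, and all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
time_len, batch_size, num_units, output_size = 3, 2, 4, 5
outputs = rng.standard_normal((time_len, batch_size, num_units))  # stacked cell outputs
W = rng.standard_normal((num_units, output_size))
b = rng.standard_normal(output_size)

# Per-step projection, as a wrapper would apply it inside the time loop.
per_step = np.stack([outputs[t] @ W + b for t in range(time_len)])

# Single projection over the batch-concatenated sequence, then split back into steps.
flat = outputs.reshape(time_len * batch_size, num_units) @ W + b
batched = flat.reshape(time_len, batch_size, output_size)
```

Both paths produce the same values; the batched form replaces `time_len` small matrix products with one larger product.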

`tf.contrib.rnn.OutputProjectionWrapper.__call__(inputs, state, scope=None)` {#OutputProjectionWrapper.call}

Run the cell and output projection on inputs, starting from state.

`tf.contrib.rnn.OutputProjectionWrapper.__init__(cell, output_size)` {#OutputProjectionWrapper.init}

Create a cell with output projection.

Args:

cell: an RNNCell, a projection to output_size is added to it.
output_size: integer, the size of the output after projection.

Raises:

TypeError: if cell is not an RNNCell.
ValueError: if output_size is not positive.

`tf.contrib.rnn.OutputProjectionWrapper.output_size` {#OutputProjectionWrapper.output_size}

`tf.contrib.rnn.OutputProjectionWrapper.state_size` {#OutputProjectionWrapper.state_size}

`tf.contrib.rnn.OutputProjectionWrapper.zero_state(batch_size, dtype)` {#OutputProjectionWrapper.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

Block RNNCells

`class tf.contrib.rnn.LSTMBlockCell` {#LSTMBlockCell}

Basic LSTM recurrent network cell.

The implementation is based on: http://arxiv.org/abs/1409.2329.

We add forget_bias (default: 1) to the biases of the forget gate in order to reduce the scale of forgetting at the beginning of training.

Unlike core_rnn_cell.LSTMCell, this is a monolithic op and should be much faster. The weight and bias matrices should be compatible as long as the variable scope matches.

`tf.contrib.rnn.LSTMBlockCell.__call__(x, states_prev, scope=None)` {#LSTMBlockCell.call}

Long short-term memory cell (LSTM).

`tf.contrib.rnn.LSTMBlockCell.__init__(num_units, forget_bias=1.0, use_peephole=False)` {#LSTMBlockCell.init}

Initialize the basic LSTM cell.

Args:

num_units: int, The number of units in the LSTM cell.
forget_bias: float, The bias added to forget gates (see above).
use_peephole: Whether to use peephole connections or not.

`tf.contrib.rnn.LSTMBlockCell.output_size` {#LSTMBlockCell.output_size}

`tf.contrib.rnn.LSTMBlockCell.state_size` {#LSTMBlockCell.state_size}

`tf.contrib.rnn.LSTMBlockCell.zero_state(batch_size, dtype)` {#LSTMBlockCell.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

`class tf.contrib.rnn.GRUBlockCell` {#GRUBlockCell}

Block GRU cell implementation.

The implementation is based on: http://arxiv.org/abs/1406.1078. Computes the GRU cell forward propagation for one time step.

Biases are initialized with:

b_ru - constant_initializer(1.0)
b_c - constant_initializer(0.0)

This kernel op implements the following equations, where [a, b] denotes concatenation along the column axis and \circ denotes element-wise multiplication:

x_h_prev = [x, h_prev]

[r_bar u_bar] = x_h_prev * w_ru + b_ru

r = sigmoid(r_bar)
u = sigmoid(u_bar)

h_prevr = h_prev \circ r

x_h_prevr = [x, h_prevr]

c_bar = x_h_prevr * w_c + b_c
c = tanh(c_bar)

h = (1-u) \circ c + u \circ h_prev
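The equations above translate almost line for line into NumPy. The sketch below is an illustration, not the kernel itself: the weight shapes are inferred from the concatenations, and the random weight values and sizes are arbitrary.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_block_step(x, h_prev, w_ru, b_ru, w_c, b_c):
    """One GRU step following the equations above (NumPy sketch, not the kernel)."""
    x_h_prev = np.concatenate([x, h_prev], axis=1)
    r_bar, u_bar = np.split(x_h_prev @ w_ru + b_ru, 2, axis=1)
    r, u = sigmoid(r_bar), sigmoid(u_bar)
    x_h_prevr = np.concatenate([x, h_prev * r], axis=1)
    c = np.tanh(x_h_prevr @ w_c + b_c)
    return (1.0 - u) * c + u * h_prev


batch_size, input_size, cell_size = 2, 3, 4
rng = np.random.default_rng(0)
w_ru = rng.standard_normal((input_size + cell_size, 2 * cell_size))
w_c = rng.standard_normal((input_size + cell_size, cell_size))
b_ru = np.ones(2 * cell_size)  # constant_initializer(1.0)
b_c = np.zeros(cell_size)      # constant_initializer(0.0)
x = rng.standard_normal((batch_size, input_size))
h = gru_block_step(x, np.zeros((batch_size, cell_size)), w_ru, b_ru, w_c, b_c)
```

Because `u` lies in (0, 1), the new state `h` is an element-wise interpolation between the candidate `c` and the previous state `h_prev`.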

`tf.contrib.rnn.GRUBlockCell.__call__(x, h_prev, scope=None)` {#GRUBlockCell.call}

GRU cell.

`tf.contrib.rnn.GRUBlockCell.__init__(cell_size)` {#GRUBlockCell.init}

Initialize the Block GRU cell.

Args:

cell_size: int, GRU cell size.

`tf.contrib.rnn.GRUBlockCell.output_size` {#GRUBlockCell.output_size}

`tf.contrib.rnn.GRUBlockCell.state_size` {#GRUBlockCell.state_size}

`tf.contrib.rnn.GRUBlockCell.zero_state(batch_size, dtype)` {#GRUBlockCell.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

Fused RNNCells

`class tf.contrib.rnn.FusedRNNCell` {#FusedRNNCell}

Abstract object representing a fused RNN cell.

A fused RNN cell represents the entire RNN expanded over the time dimension. In effect, this represents an entire recurrent network.

Unlike RNN cells which are subclasses of rnn_cell.RNNCell, a FusedRNNCell operates on the entire time sequence at once, by putting the loop over time inside the cell. This usually leads to much more efficient, but more complex and less flexible implementations.

Every FusedRNNCell must implement `__call__` with the following signature.

`tf.contrib.rnn.FusedRNNCell.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)` {#FusedRNNCell.call}

Run this fused RNN on inputs, starting from the given state.

Args:

inputs: 3-D tensor with shape [time_len x batch_size x input_size] or a list of time_len tensors of shape [batch_size x input_size].
initial_state: either a tensor with shape [batch_size x state_size] or a tuple with shapes [batch_size x s] for s in state_size, if the cell takes tuples. If this is not provided, the cell is expected to create a zero initial state of type dtype.
dtype: The data type for the initial state and expected output. Required if initial_state is not provided or RNN state has a heterogeneous dtype.
sequence_length: Specifies the length of each sequence in inputs. An int32 or int64 vector (tensor) of size [batch_size], with values in [0, time_len]. Defaults to time_len for each element.
scope: VariableScope or string for the created subgraph; defaults to class name.

Returns:

A pair containing:

Output: A 3-D tensor of shape [time_len x batch_size x output_size] or a list of time_len tensors of shape [batch_size x output_size], to match the type of the inputs.
Final state: Either a single 2-D tensor, or a tuple of tensors matching the arity and shapes of initial_state.
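The idea of putting the loop over time inside the call can be sketched in NumPy. This simplified illustration ignores sequence_length, dtype, and scope; `fused_rnn` and `cumulative_cell` are hypothetical names.

```python
import numpy as np


def fused_rnn(cell, inputs, initial_state):
    """Run cell over every time step; inputs is [time_len, batch_size, input_size]."""
    state = initial_state
    outputs = []
    for x_t in inputs:  # the loop over time lives inside the fused call
        out_t, state = cell(x_t, state)
        outputs.append(out_t)
    return np.stack(outputs), state


def cumulative_cell(inputs, state):
    """Hypothetical cell whose state is the running sum of its inputs."""
    new_state = state + inputs
    return new_state, new_state


inputs = np.ones((5, 2, 3))  # time_len=5, batch_size=2, input_size=3
outputs, final_state = fused_rnn(cumulative_cell, inputs, np.zeros((2, 3)))
```

The caller passes the whole sequence once and gets back the stacked outputs together with the final state.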

`class tf.contrib.rnn.FusedRNNCellAdaptor` {#FusedRNNCellAdaptor}

This is an adaptor for RNNCell classes to be used with FusedRNNCell.

`tf.contrib.rnn.FusedRNNCellAdaptor.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)` {#FusedRNNCellAdaptor.call}

`tf.contrib.rnn.FusedRNNCellAdaptor.__init__(cell, use_dynamic_rnn=False)` {#FusedRNNCellAdaptor.init}

Initialize the adaptor.

Args:

cell: an instance of a subclass of a rnn_cell.RNNCell.
use_dynamic_rnn: whether to use dynamic (or static) RNN.

`class tf.contrib.rnn.TimeReversedFusedRNN` {#TimeReversedFusedRNN}

This is an adaptor to time-reverse a FusedRNNCell.

For example,

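import tensorflow as tf  # TensorFlow 1.x, where tf.contrib is available
# Time-major inputs: [time_len, batch_size, input_size]; the sizes here are illustrative.
inputs = tf.placeholder(tf.float32, [20, 32, 8])
cell = tf.contrib.rnn.BasicRNNCell(10)
fw_lstm = tf.contrib.rnn.FusedRNNCellAdaptor(cell, use_dynamic_rnn=True)
bw_lstm = tf.contrib.rnn.TimeReversedFusedRNN(fw_lstm)
# dtype is required because no initial_state is passed.
fw_out, fw_state = fw_lstm(inputs, dtype=tf.float32)
bw_out, bw_state = bw_lstm(inputs, dtype=tf.float32)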
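What time reversal does to a fused RNN can be shown in NumPy. The sketch assumes full-length sequences (no sequence_length handling); `run_forward` and `time_reversed` are hypothetical helpers, not the library API.

```python
import numpy as np


def run_forward(inputs):
    """Hypothetical fused RNN: outputs are running sums over time."""
    outputs = np.cumsum(inputs, axis=0)
    return outputs, outputs[-1]


def time_reversed(fused_fn):
    """Wrap fused_fn so that it reads the sequence from the last step to the first."""
    def wrapped(inputs):
        outputs, state = fused_fn(inputs[::-1])
        return outputs[::-1], state  # re-align outputs with the original time order
    return wrapped


inputs = np.arange(1.0, 5.0).reshape(4, 1, 1)  # time steps hold 1, 2, 3, 4
fw_out, fw_state = run_forward(inputs)
bw_out, bw_state = time_reversed(run_forward)(inputs)
```

In this toy case the forward outputs are the prefix sums 1, 3, 6, 10, while the reversed wrapper yields the suffix sums 10, 9, 7, 4 aligned to the original time axis.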

`tf.contrib.rnn.TimeReversedFusedRNN.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)` {#TimeReversedFusedRNN.call}

`tf.contrib.rnn.TimeReversedFusedRNN.__init__(cell)` {#TimeReversedFusedRNN.init}

`class tf.contrib.rnn.LSTMBlockFusedCell` {#LSTMBlockFusedCell}

FusedRNNCell implementation of LSTM.

This is an extremely efficient LSTM implementation that uses a single TF op for the entire LSTM. It should be both faster and more memory-efficient than LSTMBlockCell defined above.

The implementation is based on: http://arxiv.org/abs/1409.2329.

We add forget_bias (default: 1) to the biases of the forget gate in order to reduce the scale of forgetting at the beginning of training.

The variable naming is consistent with core_rnn_cell.LSTMCell.

`tf.contrib.rnn.LSTMBlockFusedCell.__call__(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)` {#LSTMBlockFusedCell.call}

Run this LSTM on inputs, starting from the given state.

Args:

inputs: 3-D tensor with shape [time_len, batch_size, input_size] or a list of time_len tensors of shape [batch_size, input_size].
initial_state: a tuple (initial_cell_state, initial_output) with tensors of shape [batch_size, self._num_units]. If this is not provided, the cell is expected to create a zero initial state of type dtype.
dtype: The data type for the initial state and expected output. Required if initial_state is not provided or RNN state has a heterogeneous dtype.
sequence_length: Specifies the length of each sequence in inputs. An int32 or int64 vector (tensor) of size [batch_size], with values in [0, time_len]. Defaults to time_len for each element.
scope: VariableScope for the created subgraph; defaults to class name.

Returns:

A pair containing:

Output: A 3-D tensor of shape [time_len, batch_size, output_size] or a list of time_len tensors of shape [batch_size, output_size], to match the type of the inputs.
Final state: a tuple (cell_state, output) matching initial_state.

Raises:

ValueError: in case of shape mismatches

`tf.contrib.rnn.LSTMBlockFusedCell.__init__(num_units, forget_bias=1.0, cell_clip=None, use_peephole=False)` {#LSTMBlockFusedCell.init}

Initialize the LSTM cell.

Args:

num_units: int, The number of units in the LSTM cell.
forget_bias: float, The bias added to forget gates (see above).
cell_clip: clip the cell to this value. Defaults to 3.
use_peephole: Whether to use peephole connections or not.

`tf.contrib.rnn.LSTMBlockFusedCell.num_units` {#LSTMBlockFusedCell.num_units}

Number of units in this cell (output dimension).

LSTM-like cells

`class tf.contrib.rnn.CoupledInputForgetGateLSTMCell` {#CoupledInputForgetGateLSTMCell}

Long short-term memory unit (LSTM) recurrent network cell.

The default non-peephole implementation is based on:

http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf

S. Hochreiter and J. Schmidhuber. "Long Short-Term Memory". Neural Computation, 9(8):1735-1780, 1997.

The peephole implementation is based on:

https://research.google.com/pubs/archive/43905.pdf

Hasim Sak, Andrew Senior, and Francoise Beaufays. "Long short-term memory recurrent neural network architectures for large scale acoustic modeling." INTERSPEECH, 2014.

The coupling of input and forget gate is based on:

http://arxiv.org/pdf/1503.04069.pdf

Greff et al. "LSTM: A Search Space Odyssey"

The class uses optional peep-hole connections, and an optional projection layer.
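Coupling the two gates means the input gate is not learned separately but tied to the forget gate as `i = 1 - f`. The NumPy sketch below shows only that cell-state update; `f_bar` and `j_bar` are hypothetical names for the forget-gate and candidate pre-activations, and peepholes and projection are omitted.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def coupled_cell_update(c_prev, f_bar, j_bar, forget_bias=1.0):
    """New cell state when the input gate is tied to the forget gate as i = 1 - f."""
    f = sigmoid(f_bar + forget_bias)
    return f * c_prev + (1.0 - f) * np.tanh(j_bar)


c_prev = np.array([[1.0, -1.0]])
j_bar = np.ones((1, 2))
c_keep = coupled_cell_update(c_prev, f_bar=np.full((1, 2), 50.0), j_bar=j_bar)
c_write = coupled_cell_update(c_prev, f_bar=np.full((1, 2), -50.0), j_bar=j_bar)
```

When the forget gate saturates at 1 the old state is kept and nothing is written; when it saturates at 0 the state is replaced by the candidate `tanh(j_bar)`.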

`tf.contrib.rnn.CoupledInputForgetGateLSTMCell.__call__(inputs, state, scope=None)` {#CoupledInputForgetGateLSTMCell.call}

Run one step of LSTM.

Args:

inputs: input Tensor, 2-D, batch x input_size.
state: if state_is_tuple is False, this must be a state Tensor, 2-D, batch x state_size. If state_is_tuple is True, this must be a tuple of state Tensors, both 2-D, with column sizes c_state and m_state.
scope: VariableScope for the created subgraph; defaults to "LSTMCell".

Returns:

A tuple containing:

A 2-D, [batch x output_dim], Tensor representing the output of the LSTM after reading inputs when previous state was state. Here output_dim is: num_proj if num_proj was set, num_units otherwise.
Tensor(s) representing the new state of LSTM after reading inputs when the previous state was state. Same type and shape(s) as state.

Raises:

ValueError: If input size cannot be inferred from inputs via static shape inference.

`tf.contrib.rnn.CoupledInputForgetGateLSTMCell.__init__(num_units, use_peepholes=False, initializer=None, num_proj=None, proj_clip=None, num_unit_shards=1, num_proj_shards=1, forget_bias=1.0, state_is_tuple=False, activation=tanh)` {#CoupledInputForgetGateLSTMCell.init}

Initialize the parameters for an LSTM cell.

Args:

num_units: int, The number of units in the LSTM cell
use_peepholes: bool, set True to enable diagonal/peephole connections.
initializer: (optional) The initializer to use for the weight and projection matrices.
num_proj: (optional) int, The output dimensionality for the projection matrices. If None, no projection is performed.
proj_clip: (optional) A float value. If num_proj > 0 and proj_clip is provided, then the projected values are clipped elementwise to within [-proj_clip, proj_clip].
num_unit_shards: How to split the weight matrix. If >1, the weight matrix is stored across num_unit_shards.
num_proj_shards: How to split the projection matrix. If >1, the projection matrix is stored across num_proj_shards.
forget_bias: Biases of the forget gate are initialized by default to 1 in order to reduce the scale of forgetting at the beginning of the training.
state_is_tuple: If True, accepted and returned states are 2-tuples of the c_state and m_state. By default (False), they are concatenated along the column axis. This default behavior will soon be deprecated.
activation: Activation function of the inner states.

`tf.contrib.rnn.CoupledInputForgetGateLSTMCell.output_size` {#CoupledInputForgetGateLSTMCell.output_size}

`tf.contrib.rnn.CoupledInputForgetGateLSTMCell.state_size` {#CoupledInputForgetGateLSTMCell.state_size}

`tf.contrib.rnn.CoupledInputForgetGateLSTMCell.zero_state(batch_size, dtype)` {#CoupledInputForgetGateLSTMCell.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
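
The two return cases above can be sketched in plain Python. This is only an illustration of the shapes involved, assuming a "tensor" is modelled as a list of rows; the function below is a hypothetical stand-in and not the library's zero_state.

```python
def zero_state(state_size, batch_size):
    """Build zero-filled state(s) mirroring the structure of state_size.

    Hypothetical pure-Python stand-in: a "tensor" is a list of rows.
    """
    if isinstance(state_size, int):
        # Single state: one [batch_size x state_size] matrix of zeros.
        return [[0.0] * state_size for _ in range(batch_size)]
    # Nested list or tuple: recurse and keep the same container type.
    return type(state_size)(zero_state(s, batch_size) for s in state_size)


flat = zero_state(4, batch_size=2)
nested = zero_state((3, (2, 5)), batch_size=2)
```

Here `flat` is a single 2 x 4 matrix, while `nested` has the same tuple structure as `(3, (2, 5))` with one zero matrix per entry.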

`class tf.contrib.rnn.TimeFreqLSTMCell` {#TimeFreqLSTMCell}

Time-Frequency Long short-term memory unit (LSTM) recurrent network cell.

This implementation is based on:

Tara N. Sainath and Bo Li "Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks." submitted to INTERSPEECH, 2016.

It uses peep-hole connections and optional cell clipping.

`tf.contrib.rnn.TimeFreqLSTMCell.call(inputs, state, scope=None)` {#TimeFreqLSTMCell.call}

Run one step of LSTM.

Args:

inputs: input Tensor, 2D, batch x num_units.
state: state Tensor, 2D, batch x state_size.
scope: VariableScope for the created subgraph; defaults to "TimeFreqLSTMCell".

Returns:

A tuple containing:

A 2D, batch x output_dim, Tensor representing the output of the LSTM after reading "inputs" when previous state was "state". Here output_dim is num_units.
A 2D, batch x state_size, Tensor representing the new state of LSTM after reading "inputs" when previous state was "state".

Raises:

ValueError: if an input_size was specified and the provided inputs have a different dimension.

`tf.contrib.rnn.TimeFreqLSTMCell.init(num_units, use_peepholes=False, cell_clip=None, initializer=None, num_unit_shards=1, forget_bias=1.0, feature_size=None, frequency_skip=None)` {#TimeFreqLSTMCell.init}

Initialize the parameters for an LSTM cell.

Args:

num_units: int, The number of units in the LSTM cell
use_peepholes: bool, set True to enable diagonal/peephole connections.
cell_clip: (optional) A float value, if provided the cell state is clipped by this value prior to the cell output activation.
initializer: (optional) The initializer to use for the weight and projection matrices.
num_unit_shards: int, How to split the weight matrix. If >1, the weight matrix is stored across num_unit_shards.
forget_bias: float, Biases of the forget gate are initialized by default to 1 in order to reduce the scale of forgetting at the beginning of the training.
feature_size: int, The size of the input feature the LSTM spans over.
frequency_skip: int, The amount the LSTM filter is shifted by in frequency.
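
One way to read feature_size and frequency_skip together is as a sliding window over the frequency axis of each input row. The sketch below shows that reading in plain Python. The window-count formula and the helper name `frequency_windows` are assumptions made for illustration, not a statement of how the cell is implemented.

```python
def frequency_windows(features, feature_size, frequency_skip):
    """Split one input row into overlapping frequency windows (illustrative)."""
    # Assumed window count: floor((input_size - feature_size) / skip) + 1.
    num_windows = (len(features) - feature_size) // frequency_skip + 1
    return [
        features[i * frequency_skip : i * frequency_skip + feature_size]
        for i in range(num_windows)
    ]


row = [0, 1, 2, 3, 4, 5, 6, 7]
windows = frequency_windows(row, feature_size=4, frequency_skip=2)
```

With an 8-wide input, a window of 4 and a skip of 2, this yields three windows that each overlap their neighbour by two features.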

`tf.contrib.rnn.TimeFreqLSTMCell.output_size` {#TimeFreqLSTMCell.output_size}

`tf.contrib.rnn.TimeFreqLSTMCell.state_size` {#TimeFreqLSTMCell.state_size}

`tf.contrib.rnn.TimeFreqLSTMCell.zero_state(batch_size, dtype)` {#TimeFreqLSTMCell.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

`class tf.contrib.rnn.GridLSTMCell` {#GridLSTMCell}

Grid Long short-term memory unit (LSTM) recurrent network cell.

The default is based on: Nal Kalchbrenner, Ivo Danihelka and Alex Graves "Grid Long Short-Term Memory," Proc. ICLR 2016. http://arxiv.org/abs/1507.01526

When peephole connections are used, the implementation is based on: Tara N. Sainath and Bo Li "Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks." submitted to INTERSPEECH, 2016.

The code uses optional peephole connections, shared_weights and cell clipping.

`tf.contrib.rnn.GridLSTMCell.call(inputs, state, scope=None)` {#GridLSTMCell.call}

Run one step of LSTM.

Args:

inputs: input Tensor, 2D, [batch, feature_size].
state: Tensor or tuple of Tensors, 2D, [batch, state_size], depends on the flag self._state_is_tuple.
scope: (optional) VariableScope for the created subgraph; if None, it defaults to "GridLSTMCell".

Returns:

A tuple containing:

A 2D, [batch, output_dim], Tensor representing the output of the LSTM after reading "inputs" when previous state was "state". Here output_dim is num_units.
A 2D, [batch, state_size], Tensor representing the new state of LSTM after reading "inputs" when previous state was "state".

Raises:

ValueError: if an input_size was specified and the provided inputs have a different dimension.

`tf.contrib.rnn.GridLSTMCell.init(num_units, use_peepholes=False, share_time_frequency_weights=False, cell_clip=None, initializer=None, num_unit_shards=1, forget_bias=1.0, feature_size=None, frequency_skip=None, num_frequency_blocks=None, start_freqindex_list=None, end_freqindex_list=None, couple_input_forget_gates=False, state_is_tuple=False)` {#GridLSTMCell.init}

Initialize the parameters for an LSTM cell.

Args:

num_units: int, The number of units in the LSTM cell
use_peepholes: (optional) bool, default False. Set True to enable diagonal/peephole connections.
share_time_frequency_weights: (optional) bool, default False. Set True to enable shared cell weights between time and frequency LSTMs.
cell_clip: (optional) A float value, default None, if provided the cell state is clipped by this value prior to the cell output activation.
initializer: (optional) The initializer to use for the weight and projection matrices, default None.
num_unit_shards: (optional) int, default 1, How to split the weight matrix. If > 1, the weight matrix is stored across num_unit_shards.
forget_bias: (optional) float, default 1.0, The initial bias of the forget gates, used to reduce the scale of forgetting at the beginning of the training.
feature_size: (optional) int, default None, The size of the input feature the LSTM spans over.
frequency_skip: (optional) int, default None, The amount the LSTM filter is shifted by in frequency.
num_frequency_blocks: (required) A list of frequency blocks needed to cover the whole input feature splitting defined by start_freqindex_list and end_freqindex_list.
start_freqindex_list: (optional) list of ints, default None, The starting frequency index for each frequency block.
end_freqindex_list: (optional) list of ints, default None, The ending frequency index for each frequency block.
couple_input_forget_gates: (optional) bool, default False, Whether to couple the input and forget gates, i.e. f_gate = 1.0 - i_gate, to reduce model parameters and computation cost.
state_is_tuple: If True, accepted and returned states are 2-tuples of the c_state and m_state. By default (False), they are concatenated along the column axis. This default behavior will soon be deprecated.
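
The coupling f_gate = 1.0 - i_gate mentioned for couple_input_forget_gates can be illustrated on a single scalar cell state. This is a minimal sketch that assumes a standard LSTM-style update with a tanh candidate and ignores forget_bias, peepholes and clipping; the function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def coupled_cell_update(c_prev, i_logit, candidate):
    """One scalar cell-state update with f_gate = 1.0 - i_gate."""
    i_gate = sigmoid(i_logit)
    f_gate = 1.0 - i_gate  # coupled: no separate forget-gate parameters
    return f_gate * c_prev + i_gate * math.tanh(candidate)


c_new = coupled_cell_update(c_prev=0.5, i_logit=0.0, candidate=0.0)
```

Because the forget gate is derived from the input gate, the cell needs one fewer set of gate weights, which is the parameter and computation saving the argument description refers to.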

Raises:

ValueError: if the num_frequency_blocks list is not specified

`tf.contrib.rnn.GridLSTMCell.output_size` {#GridLSTMCell.output_size}

`tf.contrib.rnn.GridLSTMCell.state_size` {#GridLSTMCell.state_size}

`tf.contrib.rnn.GridLSTMCell.state_tuple_type` {#GridLSTMCell.state_tuple_type}

`tf.contrib.rnn.GridLSTMCell.zero_state(batch_size, dtype)` {#GridLSTMCell.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

RNNCell wrappers

`class tf.contrib.rnn.AttentionCellWrapper` {#AttentionCellWrapper}

Basic attention cell wrapper.

Implementation based on https://arxiv.org/abs/1409.0473.

`tf.contrib.rnn.AttentionCellWrapper.call(inputs, state, scope=None)` {#AttentionCellWrapper.call}

Long short-term memory cell with attention (LSTMA).

`tf.contrib.rnn.AttentionCellWrapper.init(cell, attn_length, attn_size=None, attn_vec_size=None, input_size=None, state_is_tuple=False)` {#AttentionCellWrapper.init}

Create a cell with attention.

Args:

cell: an RNNCell to which attention is added.
attn_length: integer, the size of an attention window.
attn_size: integer, the size of an attention vector. Equal to cell.output_size by default.
attn_vec_size: integer, the number of convolutional features calculated on attention state and the size of the hidden layer built from base cell state. Equal to attn_size by default.
input_size: integer, the size of a hidden linear layer, built from inputs and attention. Derived from the input tensor by default.
state_is_tuple: If True, accepted and returned states are n-tuples, where n = len(cells). By default (False), the states are all concatenated along the column axis.

Raises:

TypeError: if cell is not an RNNCell.
ValueError: if cell returns a state tuple but the flag state_is_tuple is False or if attn_length is zero or less.

`tf.contrib.rnn.AttentionCellWrapper.output_size` {#AttentionCellWrapper.output_size}

`tf.contrib.rnn.AttentionCellWrapper.state_size` {#AttentionCellWrapper.state_size}

`tf.contrib.rnn.AttentionCellWrapper.zero_state(batch_size, dtype)` {#AttentionCellWrapper.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

Recurrent Neural Networks

TensorFlow provides a number of methods for constructing Recurrent Neural Networks.

`tf.contrib.rnn.static_rnn(cell, inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)` {#static_rnn}

Creates a recurrent neural network specified by RNNCell cell.

The simplest form of RNN network generated is:

  state = cell.zero_state(...)
  outputs = []
  for input_ in inputs:
    output, state = cell(input_, state)
    outputs.append(output)
  return (outputs, state)

However, a few other options are available:

An initial state can be provided. If the sequence_length vector is provided, dynamic calculation is performed. This method of calculation does not compute the RNN steps past the maximum sequence length of the minibatch (thus saving computational time), and properly propagates the state at an example's sequence length to the final state output.

The dynamic calculation performed is, at time t for batch row b,

  (output, state)(b, t) =
    (t >= sequence_length(b))
      ? (zeros(cell.output_size), states(b, sequence_length(b) - 1))
      : cell(input(b, t), state(b, t - 1))
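
The loop and the per-row rule above can be combined into a small runnable sketch. It uses plain Python lists instead of tensors and a hypothetical scalar `toy_cell`, so it only illustrates the output and state semantics of sequence_length; it does not model the skipping of steps past the longest sequence in the minibatch.

```python
def toy_cell(x, state):
    """Hypothetical scalar cell: new state is a running sum; output equals it."""
    new_state = state + x
    return new_state, new_state


def static_rnn_sketch(cell, inputs, sequence_length, initial_state=0.0):
    """inputs: length-T list of per-batch-row values (batch as a plain list)."""
    batch_size = len(inputs[0])
    state = [initial_state] * batch_size
    outputs = []
    for t, step_inputs in enumerate(inputs):
        step_outputs = []
        for b in range(batch_size):
            if t >= sequence_length[b]:
                # Past the end of this row: emit zeros, keep the last state.
                step_outputs.append(0.0)
            else:
                out, state[b] = cell(step_inputs[b], state[b])
                step_outputs.append(out)
        outputs.append(step_outputs)
    return outputs, state


inputs = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]  # T=3, batch_size=2
outputs, final_state = static_rnn_sketch(toy_cell, inputs, sequence_length=[3, 1])
```

The second batch row has length 1, so its outputs are zero from the second step onward while its final state stays at the value reached at its last valid step.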

Args:

cell: An instance of RNNCell.
inputs: A length T list of inputs, each a Tensor of shape [batch_size, input_size], or a nested tuple of such elements.
initial_state: (optional) An initial state for the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
dtype: (optional) The data type for the initial state and expected output. Required if initial_state is not provided or RNN state has a heterogeneous dtype.
sequence_length: Specifies the length of each sequence in inputs. An int32 or int64 vector (tensor) size [batch_size], values in [0, T).
scope: VariableScope for the created subgraph; defaults to "rnn".

Returns:

A pair (outputs, state) where:

outputs is a length T list of outputs (one for each input), or a nested tuple of such elements.
state is the final state

Raises:

TypeError: If cell is not an instance of RNNCell.
ValueError: If inputs is None or an empty list, or if the input depth (column size) cannot be inferred from inputs via shape inference.

`tf.contrib.rnn.static_state_saving_rnn(cell, inputs, state_saver, state_name, sequence_length=None, scope=None)` {#static_state_saving_rnn}

RNN that accepts a state saver for time-truncated RNN calculation.

Args:

cell: An instance of RNNCell.
inputs: A length T list of inputs, each a Tensor of shape [batch_size, input_size].
state_saver: A state saver object with methods state and save_state.
state_name: Python string or tuple of strings. The name to use with the state_saver. If the cell returns tuples of states (i.e., cell.state_size is a tuple) then state_name should be a tuple of strings having the same length as cell.state_size. Otherwise it should be a single string.
sequence_length: (optional) An int32/int64 vector size [batch_size]. See the documentation for static_rnn() for more details about sequence_length.
scope: VariableScope for the created subgraph; defaults to "rnn".

Returns:

A pair (outputs, state) where:

outputs is a length T list of outputs (one for each input).
state is the final state.

Raises:

TypeError: If cell is not an instance of RNNCell.
ValueError: If inputs is None or an empty list, or if the arity and type of state_name does not match that of cell.state_size.
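
To make the role of the state saver more concrete, here is a minimal sketch of time-truncated processing in plain Python. The `DictStateSaver` class, the `run_chunk` helper and the scalar cell are all hypothetical; the only thing taken from the description above is that the saver exposes methods named state and save_state, and their exact behavior in the library may differ.

```python
class DictStateSaver(object):
    """Hypothetical state saver exposing the two methods named above."""

    def __init__(self, initial):
        self._states = dict(initial)

    def state(self, name):
        return self._states[name]

    def save_state(self, name, value):
        self._states[name] = value


def run_chunk(cell, chunk, state_saver, state_name):
    """Run one time-truncated chunk, loading and then saving the state."""
    state = state_saver.state(state_name)
    outputs = []
    for x in chunk:
        output, state = cell(x, state)
        outputs.append(output)
    state_saver.save_state(state_name, state)
    return outputs, state


def running_sum_cell(x, state):
    return state + x, state + x


saver = DictStateSaver({"rnn_state": 0.0})
first, _ = run_chunk(running_sum_cell, [1.0, 2.0], saver, "rnn_state")
second, _ = run_chunk(running_sum_cell, [3.0, 4.0], saver, "rnn_state")
```

The second chunk starts from the state saved by the first, so a long sequence can be processed in pieces without losing the recurrent state between them.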

`tf.contrib.rnn.static_bidirectional_rnn(cell_fw, cell_bw, inputs, initial_state_fw=None, initial_state_bw=None, dtype=None, sequence_length=None, scope=None)` {#static_bidirectional_rnn}

Creates a bidirectional recurrent neural network.

Similar to the unidirectional case above (static_rnn), but takes input and builds independent forward and backward RNNs, with the final forward and backward outputs depth-concatenated so that the output has the format [time][batch][cell_fw.output_size + cell_bw.output_size]. The input_size of the forward and backward cells must match. The initial state for both directions is zero by default (but can be set optionally), and no intermediate states are ever returned: the network is fully unrolled for the given (passed in) length(s) of the sequence(s), or completely unrolled if length(s) is not given.

Args:

cell_fw: An instance of RNNCell, to be used for forward direction.
cell_bw: An instance of RNNCell, to be used for backward direction.
inputs: A length T list of inputs, each a tensor of shape [batch_size, input_size], or a nested tuple of such elements.
initial_state_fw: (optional) An initial state for the forward RNN. This must be a tensor of appropriate type and shape [batch_size, cell_fw.state_size]. If cell_fw.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell_fw.state_size.
initial_state_bw: (optional) Same as for initial_state_fw, but using the corresponding properties of cell_bw.
dtype: (optional) The data type for the initial state. Required if either of the initial states are not provided.
sequence_length: (optional) An int32/int64 vector, size [batch_size], containing the actual lengths for each of the sequences.
scope: VariableScope for the created subgraph; defaults to "bidirectional_rnn"

Returns:

A tuple (outputs, output_state_fw, output_state_bw) where:

outputs is a length T list of outputs (one for each input), which are depth-concatenated forward and backward outputs.
output_state_fw is the final state of the forward rnn.
output_state_bw is the final state of the backward rnn.

Raises:

TypeError: If cell_fw or cell_bw is not an instance of RNNCell.
ValueError: If inputs is None or an empty list.
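
The depth-concatenation described for static_bidirectional_rnn can be sketched with scalar inputs and a hypothetical running-sum cell. This is an illustration of the data flow only (forward pass, backward pass over the reversed sequence, realignment, concatenation) and assumes no sequence_length handling.

```python
def run_direction(cell, inputs, initial_state):
    state = initial_state
    outputs = []
    for x in inputs:
        output, state = cell(x, state)
        outputs.append(output)
    return outputs, state


def bidirectional_sketch(cell_fw, cell_bw, inputs, state_fw=0.0, state_bw=0.0):
    """inputs: length-T list of scalars; each output is a [fw, bw] pair."""
    out_fw, final_fw = run_direction(cell_fw, inputs, state_fw)
    # The backward cell reads the sequence reversed; its outputs are
    # reversed again so that index t lines up with the forward outputs.
    out_bw_rev, final_bw = run_direction(cell_bw, inputs[::-1], state_bw)
    out_bw = out_bw_rev[::-1]
    outputs = [[f, b] for f, b in zip(out_fw, out_bw)]
    return outputs, final_fw, final_bw


def sum_cell(x, state):
    """Hypothetical cell: running sum of the inputs seen so far."""
    return state + x, state + x


outputs, final_fw, final_bw = bidirectional_sketch(sum_cell, sum_cell, [1.0, 2.0, 3.0])
```

At each time step the output holds the forward value first and the backward value second, which mirrors the cell_fw.output_size + cell_bw.output_size depth of the real output.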

Other Functions and Classes

`class tf.contrib.rnn.BidirectionalGridLSTMCell` {#BidirectionalGridLSTMCell}

Bidirectional GridLstm cell.

The bidirectional connection is only used in the frequency direction, so it does not affect the time direction's real-time processing that is required for online recognition systems. The current implementation uses different weights for the two directions.

`tf.contrib.rnn.BidirectionalGridLSTMCell.call(inputs, state, scope=None)` {#BidirectionalGridLSTMCell.call}

Run one step of LSTM.

Args:

inputs: input Tensor, 2D, [batch, num_units].
state: tuple of Tensors, 2D, [batch, state_size].
scope: (optional) VariableScope for the created subgraph; if None, it defaults to "BidirectionalGridLSTMCell".

Returns:

A tuple containing:

A 2D, [batch, output_dim], Tensor representing the output of the LSTM after reading "inputs" when previous state was "state". Here output_dim is num_units.
A 2D, [batch, state_size], Tensor representing the new state of LSTM after reading "inputs" when previous state was "state".

Raises:

ValueError: if an input_size was specified and the provided inputs have a different dimension.

`tf.contrib.rnn.BidirectionalGridLSTMCell.init(num_units, use_peepholes=False, share_time_frequency_weights=False, cell_clip=None, initializer=None, num_unit_shards=1, forget_bias=1.0, feature_size=None, frequency_skip=None, num_frequency_blocks=None, start_freqindex_list=None, end_freqindex_list=None, couple_input_forget_gates=False, backward_slice_offset=0)` {#BidirectionalGridLSTMCell.init}

Initialize the parameters for an LSTM cell.

Args:

num_units: int, The number of units in the LSTM cell
use_peepholes: (optional) bool, default False. Set True to enable diagonal/peephole connections.
share_time_frequency_weights: (optional) bool, default False. Set True to enable shared cell weights between time and frequency LSTMs.
cell_clip: (optional) A float value, default None, if provided the cell state is clipped by this value prior to the cell output activation.
initializer: (optional) The initializer to use for the weight and projection matrices, default None.
num_unit_shards: (optional) int, default 1, How to split the weight matrix. If > 1, the weight matrix is stored across num_unit_shards.
forget_bias: (optional) float, default 1.0, The initial bias of the forget gates, used to reduce the scale of forgetting at the beginning of the training.
feature_size: (optional) int, default None, The size of the input feature the LSTM spans over.
frequency_skip: (optional) int, default None, The amount the LSTM filter is shifted by in frequency.
num_frequency_blocks: (required) A list of frequency blocks needed to cover the whole input feature splitting defined by start_freqindex_list and end_freqindex_list.
start_freqindex_list: (optional) list of ints, default None, The starting frequency index for each frequency block.
end_freqindex_list: (optional) list of ints, default None, The ending frequency index for each frequency block.
couple_input_forget_gates: (optional) bool, default False, Whether to couple the input and forget gates, i.e. f_gate = 1.0 - i_gate, to reduce model parameters and computation cost.
backward_slice_offset: (optional) int32, default 0, the starting offset to slice the feature for backward processing.
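
The way start_freqindex_list and end_freqindex_list partition an input feature row can be sketched as plain slicing. The helper `split_frequency_blocks` is hypothetical, and treating each end index as exclusive is an assumption made for this illustration rather than something the argument descriptions state.

```python
def split_frequency_blocks(features, start_freqindex_list, end_freqindex_list):
    """Slice one feature row into frequency blocks (illustrative only)."""
    if len(start_freqindex_list) != len(end_freqindex_list):
        raise ValueError("start and end index lists must have the same length")
    # Assumption: each block covers indices start <= k < end.
    return [
        features[start:end]
        for start, end in zip(start_freqindex_list, end_freqindex_list)
    ]


row = [10, 11, 12, 13, 14, 15]
blocks = split_frequency_blocks(row, [0, 3], [3, 6])
```

Two index pairs give two blocks that together cover the whole row.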

`tf.contrib.rnn.BidirectionalGridLSTMCell.output_size` {#BidirectionalGridLSTMCell.output_size}

`tf.contrib.rnn.BidirectionalGridLSTMCell.state_size` {#BidirectionalGridLSTMCell.state_size}

`tf.contrib.rnn.BidirectionalGridLSTMCell.state_tuple_type` {#BidirectionalGridLSTMCell.state_tuple_type}

`tf.contrib.rnn.BidirectionalGridLSTMCell.zero_state(batch_size, dtype)` {#BidirectionalGridLSTMCell.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

`class tf.contrib.rnn.LSTMBlockWrapper` {#LSTMBlockWrapper}

This is a helper class that provides housekeeping for LSTM cells.

This may be useful for alternative LSTM cells and similar types of cells. Subclasses must implement the _call_cell method and the num_units property.

`tf.contrib.rnn.LSTMBlockWrapper.call(inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)` {#LSTMBlockWrapper.call}

Run this LSTM on inputs, starting from the given state.

Args:

inputs: 3-D tensor with shape [time_len, batch_size, input_size] or a list of time_len tensors of shape [batch_size, input_size].
initial_state: a tuple (initial_cell_state, initial_output) with tensors of shape [batch_size, self._num_units]. If this is not provided, the cell is expected to create a zero initial state of type dtype.
dtype: The data type for the initial state and expected output. Required if initial_state is not provided or RNN state has a heterogeneous dtype.
sequence_length: Specifies the length of each sequence in inputs. An int32 or int64 vector (tensor) size [batch_size], values in [0, time_len). Defaults to time_len for each element.
scope: VariableScope for the created subgraph; defaults to class name.

Returns:

A pair containing:

Output: A 3-D tensor of shape [time_len, batch_size, output_size] or a list of time_len tensors of shape [batch_size, output_size], to match the type of the inputs.
Final state: a tuple (cell_state, output) matching initial_state.

Raises:

ValueError: in case of shape mismatches

`tf.contrib.rnn.LSTMBlockWrapper.num_units` {#LSTMBlockWrapper.num_units}

Number of units in this cell (output dimension).

`class tf.contrib.rnn.LayerNormBasicLSTMCell` {#LayerNormBasicLSTMCell}

LSTM unit with layer normalization and recurrent dropout.

This class adds layer normalization and recurrent dropout to a basic LSTM unit. Layer normalization implementation is based on:

https://arxiv.org/abs/1607.06450.

"Layer Normalization" Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

and is applied before the internal nonlinearities. Recurrent dropout is based on:

https://arxiv.org/abs/1603.05118

"Recurrent Dropout without Memory Loss" Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth.

`tf.contrib.rnn.LayerNormBasicLSTMCell.call(inputs, state, scope=None)` {#LayerNormBasicLSTMCell.call}

LSTM cell with layer normalization and recurrent dropout.
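
As a reminder of what layer normalization does to a single activation vector, here is a minimal pure-Python sketch. It uses scalar gain and shift values for brevity, whereas norm_gain and norm_shift in this class are described as initial values; the epsilon constant and the function name are assumptions for illustration.

```python
import math


def layer_norm(values, gain=1.0, shift=0.0, epsilon=1e-12):
    """Normalize one vector to zero mean and unit variance, then scale and shift."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    # epsilon is an assumed small constant guarding against division by zero.
    scale = gain / math.sqrt(variance + epsilon)
    return [(v - mean) * scale + shift for v in values]


normalized = layer_norm([1.0, 2.0, 3.0, 4.0])
```

With the default gain of 1.0 and shift of 0.0, the result has a mean close to zero and a variance close to one, regardless of the scale of the input.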

`tf.contrib.rnn.LayerNormBasicLSTMCell.init(num_units, forget_bias=1.0, input_size=None, activation=tanh, layer_norm=True, norm_gain=1.0, norm_shift=0.0, dropout_keep_prob=1.0, dropout_prob_seed=None)` {#LayerNormBasicLSTMCell.init}

Initializes the basic LSTM cell.

Args:

num_units: int, The number of units in the LSTM cell.
forget_bias: float, The bias added to forget gates (see above).
input_size: Deprecated and unused.
activation: Activation function of the inner states.
layer_norm: If True, layer normalization will be applied.
norm_gain: float, The layer normalization gain initial value. If layer_norm has been set to False, this argument will be ignored.
norm_shift: float, The layer normalization shift initial value. If layer_norm has been set to False, this argument will be ignored.
dropout_keep_prob: unit Tensor or float between 0 and 1 representing the recurrent dropout keep probability. If float and 1.0, no dropout will be applied.
dropout_prob_seed: (optional) integer, the randomness seed.
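
The meaning of dropout_keep_prob and dropout_prob_seed can be illustrated with a small inverted-dropout sketch. Scaling kept values by 1 / keep_prob mirrors the convention of tf.nn.dropout; where exactly the cell applies dropout is not shown here, and the function below is a hypothetical stand-in that uses Python's random module rather than TensorFlow's random ops.

```python
import random


def dropout(values, keep_prob, seed=None):
    """Inverted dropout: keep each value with probability keep_prob."""
    if keep_prob >= 1.0:
        # A keep probability of 1.0 disables dropout entirely.
        return list(values)
    rng = random.Random(seed)
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


kept = dropout([1.0, 2.0, 3.0], keep_prob=1.0)
dropped = dropout([1.0] * 8, keep_prob=0.5, seed=0)
```

Fixing the seed makes the dropout mask reproducible across calls, which is the purpose of the seed argument.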

`tf.contrib.rnn.LayerNormBasicLSTMCell.output_size` {#LayerNormBasicLSTMCell.output_size}

`tf.contrib.rnn.LayerNormBasicLSTMCell.state_size` {#LayerNormBasicLSTMCell.state_size}

`tf.contrib.rnn.LayerNormBasicLSTMCell.zero_state(batch_size, dtype)` {#LayerNormBasicLSTMCell.zero_state}

Return zero-filled state tensor(s).

Args:

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

Returns:

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.

`tf.contrib.rnn.stack_bidirectional_dynamic_rnn(cells_fw, cells_bw, inputs, initial_states_fw=None, initial_states_bw=None, dtype=None, sequence_length=None, scope=None)` {#stack_bidirectional_dynamic_rnn}

Creates a dynamic bidirectional recurrent neural network.

Stacks several bidirectional rnn layers. The combined forward and backward layer outputs are used as input of the next layer. tf.bidirectional_rnn does not allow sharing forward and backward information between layers. The input_size of the first forward and backward cells must match. The initial state for both directions is zero and no intermediate states are returned.

Args:

cells_fw: List of instances of RNNCell, one per layer, to be used for forward direction.
cells_bw: List of instances of RNNCell, one per layer, to be used for backward direction.
inputs: A length T list of inputs, each a tensor of shape [batch_size, input_size], or a nested tuple of such elements.
initial_states_fw: (optional) A list of the initial states (one per layer) for the forward RNN. Each tensor must have an appropriate type and shape [batch_size, cell_fw.state_size].
initial_states_bw: (optional) Same as for initial_states_fw, but using the corresponding properties of cells_bw.
dtype: (optional) The data type for the initial state. Required if either of the initial states are not provided.
sequence_length: (optional) An int32/int64 vector, size [batch_size], containing the actual lengths for each of the sequences.
scope: VariableScope for the created subgraph; defaults to None.

Returns:

A tuple (outputs, output_state_fw, output_state_bw) where:

outputs: Output Tensor shaped [batch_size, max_time, layers_output], where layers_output are depth-concatenated forward and backward outputs.
output_states_fw is the final states, one tensor per layer, of the forward rnn.
output_states_bw is the final states, one tensor per layer, of the backward rnn.

Raises:

TypeError: If cell_fw or cell_bw is not an instance of RNNCell.
ValueError: If inputs is None, not a list or an empty list.

`tf.contrib.rnn.stack_bidirectional_rnn(cells_fw, cells_bw, inputs, initial_states_fw=None, initial_states_bw=None, dtype=None, sequence_length=None, scope=None)` {#stack_bidirectional_rnn}

Creates a bidirectional recurrent neural network.

Stacks several bidirectional rnn layers. The combined forward and backward layer outputs are used as input of the next layer. tf.bidirectional_rnn does not allow sharing forward and backward information between layers. The input_size of the first forward and backward cells must match. The initial state for both directions is zero and no intermediate states are returned.

As described in https://arxiv.org/abs/1303.5778

Args:

cells_fw: List of instances of RNNCell, one per layer, to be used for forward direction.
cells_bw: List of instances of RNNCell, one per layer, to be used for backward direction.
inputs: A length T list of inputs, each a tensor of shape [batch_size, input_size], or a nested tuple of such elements.
initial_states_fw: (optional) A list of the initial states (one per layer) for the forward RNN. Each tensor must have an appropriate type and shape [batch_size, cell_fw.state_size].
initial_states_bw: (optional) Same as for initial_states_fw, but using the corresponding properties of cells_bw.
dtype: (optional) The data type for the initial state. Required if either of the initial states are not provided.
sequence_length: (optional) An int32/int64 vector, size [batch_size], containing the actual lengths for each of the sequences.
scope: VariableScope for the created subgraph; defaults to None.

Returns:

A tuple (outputs, output_state_fw, output_state_bw) where:

outputs is a length T list of outputs (one for each input), which are depth-concatenated forward and backward outputs.
output_states_fw is the final states, one tensor per layer, of the forward rnn.
output_states_bw is the final states, one tensor per layer, of the backward rnn.

Raises:

TypeError: If cell_fw or cell_bw is not an instance of RNNCell.
ValueError: If inputs is None, not a list or an empty list.
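
The stacking described above, where each layer's combined forward and backward outputs become the next layer's inputs, can be sketched in plain Python. The cell, the zero initial states and the list-based "tensors" are all simplifying assumptions; the sketch only illustrates how information from both directions reaches the next layer.

```python
def run_direction(cell, inputs, state):
    outputs = []
    for x in inputs:
        output, state = cell(x, state)
        outputs.append(output)
    return outputs, state


def stack_bidirectional_sketch(cells_fw, cells_bw, inputs):
    """inputs: length-T list of feature lists; cells map (features, state)."""
    layer_inputs = inputs
    states_fw, states_bw = [], []
    for cell_fw, cell_bw in zip(cells_fw, cells_bw):
        out_fw, s_fw = run_direction(cell_fw, layer_inputs, 0.0)
        out_bw, s_bw = run_direction(cell_bw, layer_inputs[::-1], 0.0)
        out_bw = out_bw[::-1]
        # Depth-concatenate both directions; this feeds the next layer.
        layer_inputs = [[f, b] for f, b in zip(out_fw, out_bw)]
        states_fw.append(s_fw)
        states_bw.append(s_bw)
    return layer_inputs, states_fw, states_bw


def sum_cell(features, state):
    """Hypothetical cell: adds the sum of the input features to the state."""
    new_state = state + sum(features)
    return new_state, new_state


inputs = [[1.0], [2.0]]
outputs, states_fw, states_bw = stack_bidirectional_sketch(
    [sum_cell, sum_cell], [sum_cell, sum_cell], inputs)
```

The second layer sees two features per time step (one from each direction of the first layer), and one final state per layer is returned for each direction.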

FilesExpand file tree

contrib.rnn.md

Latest commit

History

contrib.rnn.md

File metadata and controls

RNN and Cells (contrib)

Base interface for all RNN Cells

class tf.contrib.rnn.RNNCell {#RNNCell}

tf.contrib.rnn.RNNCell.__call__(inputs, state, scope=None) {#RNNCell.call}

Args:

Returns:

tf.contrib.rnn.RNNCell.output_size {#RNNCell.output_size}

tf.contrib.rnn.RNNCell.state_size {#RNNCell.state_size}

tf.contrib.rnn.RNNCell.zero_state(batch_size, dtype) {#RNNCell.zero_state}

Args:

Returns:

RNN Cells for use with TensorFlow's core RNN methods

class tf.contrib.rnn.BasicRNNCell {#BasicRNNCell}

tf.contrib.rnn.BasicRNNCell.__call__(inputs, state, scope=None) {#BasicRNNCell.call}

tf.contrib.rnn.BasicRNNCell.__init__(num_units, input_size=None, activation=tanh) {#BasicRNNCell.init}

tf.contrib.rnn.BasicRNNCell.output_size {#BasicRNNCell.output_size}

tf.contrib.rnn.BasicRNNCell.state_size {#BasicRNNCell.state_size}

tf.contrib.rnn.BasicRNNCell.zero_state(batch_size, dtype) {#BasicRNNCell.zero_state}

Args:

Returns:

class tf.contrib.rnn.BasicLSTMCell {#BasicLSTMCell}

tf.contrib.rnn.BasicLSTMCell.__call__(inputs, state, scope=None) {#BasicLSTMCell.call}

tf.contrib.rnn.BasicLSTMCell.__init__(num_units, forget_bias=1.0, input_size=None, state_is_tuple=True, activation=tanh) {#BasicLSTMCell.init}

Args:

tf.contrib.rnn.BasicLSTMCell.output_size {#BasicLSTMCell.output_size}

tf.contrib.rnn.BasicLSTMCell.state_size {#BasicLSTMCell.state_size}

tf.contrib.rnn.BasicLSTMCell.zero_state(batch_size, dtype) {#BasicLSTMCell.zero_state}

Args:

Returns:

class tf.contrib.rnn.GRUCell {#GRUCell}

tf.contrib.rnn.GRUCell.__call__(inputs, state, scope=None) {#GRUCell.call}

tf.contrib.rnn.GRUCell.__init__(num_units, input_size=None, activation=tanh) {#GRUCell.init}

tf.contrib.rnn.GRUCell.output_size {#GRUCell.output_size}

tf.contrib.rnn.GRUCell.state_size {#GRUCell.state_size}

tf.contrib.rnn.GRUCell.zero_state(batch_size, dtype) {#GRUCell.zero_state}

Args:

Returns:

class tf.contrib.rnn.LSTMCell {#LSTMCell}

tf.contrib.rnn.LSTMCell.__call__(inputs, state, scope=None) {#LSTMCell.call}

Args:

Returns:

Raises:

tf.contrib.rnn.LSTMCell.__init__(num_units, input_size=None, use_peepholes=False, cell_clip=None, initializer=None, num_proj=None, proj_clip=None, num_unit_shards=None, num_proj_shards=None, forget_bias=1.0, state_is_tuple=True, activation=tanh) {#LSTMCell.init}

Args:

tf.contrib.rnn.LSTMCell.output_size {#LSTMCell.output_size}

tf.contrib.rnn.LSTMCell.state_size {#LSTMCell.state_size}

tf.contrib.rnn.LSTMCell.zero_state(batch_size, dtype) {#LSTMCell.zero_state}

Args:

Returns:

Classes storing split RNNCell state

class tf.contrib.rnn.LSTMStateTuple {#LSTMStateTuple}

tf.contrib.rnn.LSTMStateTuple.__getnewargs__() {#LSTMStateTuple.getnewargs}

tf.contrib.rnn.LSTMStateTuple.__getstate__() {#LSTMStateTuple.getstate}

tf.contrib.rnn.LSTMStateTuple.__new__(_cls, c, h) {#LSTMStateTuple.new}

tf.contrib.rnn.LSTMStateTuple.__repr__() {#LSTMStateTuple.repr}

tf.contrib.rnn.LSTMStateTuple.c {#LSTMStateTuple.c}

tf.contrib.rnn.LSTMStateTuple.dtype {#LSTMStateTuple.dtype}

tf.contrib.rnn.LSTMStateTuple.h {#LSTMStateTuple.h}

RNN Cell wrappers (RNNCells that wrap other RNNCells)

class tf.contrib.rnn.MultiRNNCell {#MultiRNNCell}

tf.contrib.rnn.MultiRNNCell.__call__(inputs, state, scope=None) {#MultiRNNCell.call}

tf.contrib.rnn.MultiRNNCell.__init__(cells, state_is_tuple=True) {#MultiRNNCell.init}

Args:

Raises:

`tf.contrib.rnn.MultiRNNCell.output_size` {#MultiRNNCell.output_size}

`tf.contrib.rnn.MultiRNNCell.state_size` {#MultiRNNCell.state_size}

`tf.contrib.rnn.MultiRNNCell.zero_state(batch_size, dtype)` {#MultiRNNCell.zero_state}

Args:

Returns:

`class tf.contrib.rnn.DropoutWrapper` {#DropoutWrapper}

`tf.contrib.rnn.DropoutWrapper.__call__(inputs, state, scope=None)` {#DropoutWrapper.call}

`tf.contrib.rnn.DropoutWrapper.__init__(cell, input_keep_prob=1.0, output_keep_prob=1.0, seed=None)` {#DropoutWrapper.init}

Args:

Raises:

`class tf.contrib.rnn.RNNCell` {#RNNCell}

`tf.contrib.rnn.RNNCell.__call__(inputs, state, scope=None)` {#RNNCell.call}

`tf.contrib.rnn.RNNCell.output_size` {#RNNCell.output_size}

`tf.contrib.rnn.RNNCell.state_size` {#RNNCell.state_size}

`tf.contrib.rnn.RNNCell.zero_state(batch_size, dtype)` {#RNNCell.zero_state}
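The relationship between `state_size` and the result of `zero_state` can be illustrated with a minimal pure-Python sketch. It follows the state shapes described for `RNNCell` (one `[batch_size x state_size]` matrix for an integer `state_size`, a tuple of such matrices for a tuple). The standalone function below and its argument order are hypothetical; the real method takes `(batch_size, dtype)` and returns tensors rather than nested lists.

```python
def zero_state(state_size, batch_size):
    """Hypothetical helper: zero-filled state(s) as nested Python lists."""
    if isinstance(state_size, int):
        # One matrix with batch_size rows and state_size columns.
        return [[0.0] * state_size for _ in range(batch_size)]
    # A tuple of sizes yields a tuple of matrices, one per entry.
    return tuple(zero_state(s, batch_size) for s in state_size)


single = zero_state(3, batch_size=2)       # 2 x 3 matrix
split = zero_state((4, 4), batch_size=2)   # tuple of two 2 x 4 matrices
```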

`class tf.contrib.rnn.BasicRNNCell` {#BasicRNNCell}

`tf.contrib.rnn.BasicRNNCell.__call__(inputs, state, scope=None)` {#BasicRNNCell.call}

`tf.contrib.rnn.BasicRNNCell.__init__(num_units, input_size=None, activation=tanh)` {#BasicRNNCell.init}

`tf.contrib.rnn.BasicRNNCell.output_size` {#BasicRNNCell.output_size}

`tf.contrib.rnn.BasicRNNCell.state_size` {#BasicRNNCell.state_size}

`tf.contrib.rnn.BasicRNNCell.zero_state(batch_size, dtype)` {#BasicRNNCell.zero_state}

`class tf.contrib.rnn.BasicLSTMCell` {#BasicLSTMCell}

`tf.contrib.rnn.BasicLSTMCell.__call__(inputs, state, scope=None)` {#BasicLSTMCell.call}

`tf.contrib.rnn.BasicLSTMCell.__init__(num_units, forget_bias=1.0, input_size=None, state_is_tuple=True, activation=tanh)` {#BasicLSTMCell.init}

`tf.contrib.rnn.BasicLSTMCell.output_size` {#BasicLSTMCell.output_size}

`tf.contrib.rnn.BasicLSTMCell.state_size` {#BasicLSTMCell.state_size}

`tf.contrib.rnn.BasicLSTMCell.zero_state(batch_size, dtype)` {#BasicLSTMCell.zero_state}

`class tf.contrib.rnn.GRUCell` {#GRUCell}

`tf.contrib.rnn.GRUCell.__call__(inputs, state, scope=None)` {#GRUCell.call}

`tf.contrib.rnn.GRUCell.__init__(num_units, input_size=None, activation=tanh)` {#GRUCell.init}

`tf.contrib.rnn.GRUCell.output_size` {#GRUCell.output_size}

`tf.contrib.rnn.GRUCell.state_size` {#GRUCell.state_size}

`tf.contrib.rnn.GRUCell.zero_state(batch_size, dtype)` {#GRUCell.zero_state}

`class tf.contrib.rnn.LSTMCell` {#LSTMCell}

`tf.contrib.rnn.LSTMCell.__call__(inputs, state, scope=None)` {#LSTMCell.call}

`tf.contrib.rnn.LSTMCell.__init__(num_units, input_size=None, use_peepholes=False, cell_clip=None, initializer=None, num_proj=None, proj_clip=None, num_unit_shards=None, num_proj_shards=None, forget_bias=1.0, state_is_tuple=True, activation=tanh)` {#LSTMCell.init}

`tf.contrib.rnn.LSTMCell.output_size` {#LSTMCell.output_size}

`tf.contrib.rnn.LSTMCell.state_size` {#LSTMCell.state_size}

`tf.contrib.rnn.LSTMCell.zero_state(batch_size, dtype)` {#LSTMCell.zero_state}

Classes storing split `RNNCell` state

`class tf.contrib.rnn.LSTMStateTuple` {#LSTMStateTuple}

`tf.contrib.rnn.LSTMStateTuple.__getnewargs__()` {#LSTMStateTuple.getnewargs}

`tf.contrib.rnn.LSTMStateTuple.__getstate__()` {#LSTMStateTuple.getstate}

`tf.contrib.rnn.LSTMStateTuple.__new__(_cls, c, h)` {#LSTMStateTuple.new}

`tf.contrib.rnn.LSTMStateTuple.__repr__()` {#LSTMStateTuple.repr}

`tf.contrib.rnn.LSTMStateTuple.c` {#LSTMStateTuple.c}

`tf.contrib.rnn.LSTMStateTuple.dtype` {#LSTMStateTuple.dtype}

`tf.contrib.rnn.LSTMStateTuple.h` {#LSTMStateTuple.h}
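The members listed for `LSTMStateTuple` (a constructor taking `c` and `h`, plus fields `c` and `h`) resemble what `collections.namedtuple` generates, so a plain namedtuple is a reasonable stand-in for seeing how a split state is built and unpacked. The sketch below is an assumption-laden illustration, not the library class: it omits the `dtype` property, and the nested lists stand in for state tensors.

```python
import collections

# Hypothetical stand-in for tf.contrib.rnn.LSTMStateTuple.
LSTMStateTuple = collections.namedtuple("LSTMStateTuple", ("c", "h"))

# Placeholder values: a batch of one, with two units per state part.
state = LSTMStateTuple(c=[[0.0, 0.0]], h=[[1.0, 1.0]])

# Fields can be read by name or unpacked positionally, in (c, h) order.
cell_part = state.c
c, h = state
```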

`class tf.contrib.rnn.MultiRNNCell` {#MultiRNNCell}

`tf.contrib.rnn.MultiRNNCell.__call__(inputs, state, scope=None)` {#MultiRNNCell.call}

`tf.contrib.rnn.MultiRNNCell.__init__(cells, state_is_tuple=True)` {#MultiRNNCell.init}

`tf.contrib.rnn.MultiRNNCell.output_size` {#MultiRNNCell.output_size}

`tf.contrib.rnn.MultiRNNCell.state_size` {#MultiRNNCell.state_size}

`tf.contrib.rnn.MultiRNNCell.zero_state(batch_size, dtype)` {#MultiRNNCell.zero_state}
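To make the stacking behind `MultiRNNCell(cells, state_is_tuple=True)` concrete, here is a small pure-Python sketch in which each layer's output becomes the next layer's input and the combined state is a tuple with one entry per cell. `ToyCell` and `ToyMultiCell` are hypothetical names invented for this illustration; they operate on flat lists of floats rather than tensors and take no `scope` argument.

```python
class ToyCell:
    """Hypothetical cell: the state accumulates inputs, the output scales it."""

    def __init__(self, scale):
        self.scale = scale

    def __call__(self, inputs, state):
        new_state = [s + x for s, x in zip(state, inputs)]
        output = [self.scale * s for s in new_state]
        return output, new_state


class ToyMultiCell:
    """Hypothetical stack: runs the cells in order on a tuple of states."""

    def __init__(self, cells):
        self.cells = cells

    def __call__(self, inputs, state):
        current = inputs
        new_states = []
        for cell, cell_state in zip(self.cells, state):
            # The output of one layer is the input of the next.
            current, new_cell_state = cell(current, cell_state)
            new_states.append(new_cell_state)
        return current, tuple(new_states)


stacked = ToyMultiCell([ToyCell(2.0), ToyCell(0.5)])
output, new_state = stacked([1.0, 2.0], ([0.0, 0.0], [0.0, 0.0]))
```

Only the last layer's output is returned, while every layer's new state is kept so it can be passed in on the next step.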

`class tf.contrib.rnn.DropoutWrapper` {#DropoutWrapper}

`tf.contrib.rnn.DropoutWrapper.__call__(inputs, state, scope=None)` {#DropoutWrapper.call}

`tf.contrib.rnn.DropoutWrapper.__init__(cell, input_keep_prob=1.0, output_keep_prob=1.0, seed=None)` {#DropoutWrapper.init}

`tf.contrib.rnn.DropoutWrapper.output_size` {#DropoutWrapper.output_size}

`tf.contrib.rnn.DropoutWrapper.state_size` {#DropoutWrapper.state_size}

`tf.contrib.rnn.DropoutWrapper.zero_state(batch_size, dtype)` {#DropoutWrapper.zero_state}
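The `input_keep_prob` and `output_keep_prob` arguments can be pictured with the following sketch, which assumes dropout is applied to the wrapped cell's inputs and outputs while the state passes through unchanged, and that kept values are rescaled by `1 / keep_prob`. `ToyDropoutWrapper`, `dropout`, and `identity_cell` are hypothetical names for this illustration only; it uses Python's `random` module on flat lists instead of tensor ops.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and rescale the survivors."""
    if keep_prob >= 1.0:
        return list(values)
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


class ToyDropoutWrapper:
    """Hypothetical wrapper: dropout on the inputs and outputs of a cell."""

    def __init__(self, cell, input_keep_prob=1.0, output_keep_prob=1.0, seed=None):
        self.cell = cell
        self.input_keep_prob = input_keep_prob
        self.output_keep_prob = output_keep_prob
        self.rng = random.Random(seed)

    def __call__(self, inputs, state):
        inputs = dropout(inputs, self.input_keep_prob, self.rng)
        output, new_state = self.cell(inputs, state)
        output = dropout(output, self.output_keep_prob, self.rng)
        return output, new_state


def identity_cell(inputs, state):
    """Hypothetical cell that echoes its inputs and leaves the state alone."""
    return list(inputs), state


wrapped = ToyDropoutWrapper(identity_cell, output_keep_prob=0.5, seed=0)
output, state = wrapped([1.0, 1.0, 1.0, 1.0], state=None)

# With both keep probabilities at 1.0 the wrapper is a pass-through.
plain_output, _ = ToyDropoutWrapper(identity_cell)([1.0, 2.0], None)
```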

`class tf.contrib.rnn.EmbeddingWrapper` {#EmbeddingWrapper}

`tf.contrib.rnn.EmbeddingWrapper.__call__(inputs, state, scope=None)` {#EmbeddingWrapper.call}

`tf.contrib.rnn.EmbeddingWrapper.__init__(cell, embedding_classes, embedding_size, initializer=None)` {#EmbeddingWrapper.init}

`tf.contrib.rnn.EmbeddingWrapper.output_size` {#EmbeddingWrapper.output_size}

`tf.contrib.rnn.EmbeddingWrapper.state_size` {#EmbeddingWrapper.state_size}

`tf.contrib.rnn.EmbeddingWrapper.zero_state(batch_size, dtype)` {#EmbeddingWrapper.zero_state}

`class tf.contrib.rnn.InputProjectionWrapper` {#InputProjectionWrapper}

`tf.contrib.rnn.InputProjectionWrapper.__call__(inputs, state, scope=None)` {#InputProjectionWrapper.call}

`tf.contrib.rnn.InputProjectionWrapper.__init__(cell, num_proj, input_size=None)` {#InputProjectionWrapper.init}

`tf.contrib.rnn.InputProjectionWrapper.output_size` {#InputProjectionWrapper.output_size}

`tf.contrib.rnn.InputProjectionWrapper.state_size` {#InputProjectionWrapper.state_size}

`tf.contrib.rnn.InputProjectionWrapper.zero_state(batch_size, dtype)` {#InputProjectionWrapper.zero_state}

`class tf.contrib.rnn.OutputProjectionWrapper` {#OutputProjectionWrapper}

`tf.contrib.rnn.OutputProjectionWrapper.__call__(inputs, state, scope=None)` {#OutputProjectionWrapper.call}

`tf.contrib.rnn.OutputProjectionWrapper.__init__(cell, output_size)` {#OutputProjectionWrapper.init}

`tf.contrib.rnn.OutputProjectionWrapper.output_size` {#OutputProjectionWrapper.output_size}

`tf.contrib.rnn.OutputProjectionWrapper.state_size` {#OutputProjectionWrapper.state_size}

`tf.contrib.rnn.OutputProjectionWrapper.zero_state(batch_size, dtype)` {#OutputProjectionWrapper.zero_state}

`class tf.contrib.rnn.LSTMBlockCell` {#LSTMBlockCell}

`tf.contrib.rnn.LSTMBlockCell.__call__(x, states_prev, scope=None)` {#LSTMBlockCell.call}

`tf.contrib.rnn.LSTMBlockCell.__init__(num_units, forget_bias=1.0, use_peephole=False)` {#LSTMBlockCell.init}

`tf.contrib.rnn.LSTMBlockCell.output_size` {#LSTMBlockCell.output_size}

`tf.contrib.rnn.LSTMBlockCell.state_size` {#LSTMBlockCell.state_size}

`tf.contrib.rnn.LSTMBlockCell.zero_state(batch_size, dtype)` {#LSTMBlockCell.zero_state}

`class tf.contrib.rnn.GRUBlockCell` {#GRUBlockCell}

`tf.contrib.rnn.GRUBlockCell.__call__(x, h_prev, scope=None)` {#GRUBlockCell.call}

`tf.contrib.rnn.GRUBlockCell.__init__(cell_size)` {#GRUBlockCell.init}

`tf.contrib.rnn.GRUBlockCell.output_size` {#GRUBlockCell.output_size}

`tf.contrib.rnn.GRUBlockCell.state_size` {#GRUBlockCell.state_size}

`tf.contrib.rnn.GRUBlockCell.zero_state(batch_size, dtype)` {#GRUBlockCell.zero_state}