## Tensorflow RNN

[[LSTM]]s can be unidirectional or bidirectional. Bidirectional LSTMs are generally better for text processing because they can pull in context from later in the sentence as well as earlier.

```python
tf.keras.layers.LSTM(
    units,
    activation='tanh',
    recurrent_activation='sigmoid',
    use_bias=True,
    kernel_initializer='glorot_uniform',
    recurrent_initializer='orthogonal',
    bias_initializer='zeros',
    unit_forget_bias=True,
    kernel_regularizer=None,
    recurrent_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    recurrent_constraint=None,
    bias_constraint=None,
    dropout=0.0,
    recurrent_dropout=0.0,
    return_sequences=False,
    return_state=False,
    go_backwards=False,
    stateful=False,
    time_major=False,
    unroll=False,
    **kwargs,
)
```

`units` is the dimensionality of the LSTM's hidden state (and therefore of its output). LSTM layers can be stacked with Dense layers and other LSTM layers, as shown below. When an LSTM feeds into another LSTM, the earlier layer must set `return_sequences=True` so the next layer receives the full sequence rather than only the last output.

```python
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(tokenizer.vocab_size, 64),
    # return_sequences=True: pass the full sequence to the next LSTM
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
```

![[Pasted image 20210117225510.png]]

In the picture above the dimensions all match: $\dim(C_t) = \dim(C_{t-1}) = \dim(h_t) = \dim(h_{t-1}) = \dim(\text{output}_t) = \text{units}$.

If you set `return_sequences=True`, the layer returns a tensor of shape `(batch_size, timesteps, units)`. If `False`, it returns only the last output, of shape `(batch_size, units)`. As for the input, you must provide a value for every timestep: the input shape is `(batch_size, timesteps, input_dim)`. (See the shape-check sketches at the end of this note.)

![[Pasted image 20210117230905.png]]

When `return_sequences` is `False` (the default), the layer is **many to one**, as shown in the picture: it returns the last hidden state, with shape `(batch_size, units)`. When `return_sequences` is `True`, it is **many to many**: the return shape is `(batch_size, timesteps, units)`.

In Keras the number of input timesteps (pink boxes) always equals the number of output timesteps (green boxes); just pad the input with zeros if one-to-many behavior is desired.
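To make the shape behavior concrete, here is a minimal sanity-check sketch. The sizes (`4, 10, 8` and `16` units) are arbitrary illustrative choices, not from the example above:

```python
import numpy as np
import tensorflow as tf

batch_size, timesteps, input_dim = 4, 10, 8   # arbitrary illustrative sizes
x = np.random.rand(batch_size, timesteps, input_dim).astype("float32")

# many to many: hidden state at every timestep
seq_out = tf.keras.layers.LSTM(16, return_sequences=True)(x)
print(seq_out.shape)   # (4, 10, 16) -> (batch_size, timesteps, units)

# many to one: only the last hidden state
last_out = tf.keras.layers.LSTM(16)(x)
print(last_out.shape)  # (4, 16) -> (batch_size, units)

# Bidirectional concatenates the forward and backward outputs by default,
# so the last dimension doubles to 2 * units
bi_out = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(16))(x)
print(bi_out.shape)    # (4, 32)
```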
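The claim that $C_t$ and $h_t$ share the same dimension can also be checked directly: with `return_state=True` the layer returns the output plus the final hidden and cell states, all of size `units`. This sketch reuses `x` from the block above:

```python
# return_state=True yields (output, final hidden state h_t, final cell state C_t);
# with return_sequences=False the output *is* h_t, and all three have size `units`
out, h, c = tf.keras.layers.LSTM(16, return_state=True)(x)
print(out.shape, h.shape, c.shape)  # (4, 16) (4, 16) (4, 16)
```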
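Finally, a sketch of the zero-padding trick for one-to-many: place the single real input at the first timestep and pad the remaining timesteps with zeros, then read out the whole output sequence. This is one way to interpret the padding advice above, again reusing the sizes from the first sketch:

```python
# one real input at t=0, zeros for the remaining timesteps
single = np.random.rand(batch_size, 1, input_dim).astype("float32")
padded = np.concatenate(
    [single, np.zeros((batch_size, timesteps - 1, input_dim), dtype="float32")],
    axis=1,
)
one_to_many = tf.keras.layers.LSTM(16, return_sequences=True)(padded)
print(one_to_many.shape)  # (4, 10, 16): one real input, ten outputs
```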