Trianam's notes
a blog about machine learning and artificial intelligence

Recurrent Neural Networks (RNN)

March 24, 2018

Stefano Martina



A Recurrent Neural Network (RNN) is a neural network that exhibits an internal state. RNNs are well suited to processing sequential data, e.g. text.

Fig 1: An RNN.

Fig 1 shows a compact representation of an RNN. The computation is performed iteratively: at each step an input $\vec{x}$ is presented to the network together with the state from the previous iteration. The black box represents a delay of one iteration, which feeds the previous state back into the network.

In this formalization¹, $R$ and $O$ are two functions that share the parameters $\theta$. They control the network: $R$ computes the internal state and $O$ computes the output.

In detail, $\vec{x}$, $\vec{y}$ and $\vec{s}$ are sequences of vectors. We use the notation $\vec{x}_i$ to denote the $i$-th vector of the sequence, and $\vec{x}_{i:j}$ to denote the subsequence from $i$ to $j$. If the sequence is composed of $n$ vectors, we can write in this notation: $\vec{x} = \vec{x}_{1:n}$.
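As a small illustration of the sequence notation, the sketch below maps the $i$-th vector $\vec{x}_i$ and the subsequence $\vec{x}_{i:j}$ onto Python list indexing. It assumes 1-based indices and a subsequence that includes both ends; the helper names `item` and `sub` and the numeric values are hypothetical, chosen only for this example.

```python
# A sequence of n = 4 vectors, each of dimension 2 (values are arbitrary).
x = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]


def item(seq, i):
    """x_i: the i-th vector, counting from 1 as in the text."""
    return seq[i - 1]


def sub(seq, i, j):
    """x_{i:j}: the subsequence from i to j, both ends included (assumption)."""
    return seq[i - 1 : j]


n = len(x)
x_2 = item(x, 2)      # second vector of the sequence
x_2_3 = sub(x, 2, 3)  # vectors 2 and 3
whole = sub(x, 1, n)  # x_{1:n} is the whole sequence
```

With this convention the subsequence from $1$ to $n$ is the sequence itself.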

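The RNN model is expressed as:

$$\vec{s}_i = R(\vec{s}_{i-1}, \vec{x}_i; \theta)$$

$$\vec{y}_i = O(\vec{s}_i; \theta)$$

where $\vec{s}_0$ is the initial state and $\theta$ are the trainable parameters. $R$ is a recursive function that updates the internal state based on the previous state $\vec{s}_{i-1}$ and the current input $\vec{x}_i$, and $O$ is the output function that computes the output $\vec{y}_i$ from the current state $\vec{s}_i$.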
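The iteration described above can be sketched in a few lines of plain Python. The text does not fix a concrete form for $R$ and $O$, so this sketch assumes a tanh state update and a linear readout; the dictionary keys `W_s`, `W_x`, `b`, `V` and all numeric values are hypothetical and serve only to make the example runnable.

```python
import math


def R(s_prev, x, theta):
    """State update. Assumed form: tanh(W_s s_prev + W_x x + b)."""
    W_s, W_x, b = theta["W_s"], theta["W_x"], theta["b"]
    return [
        math.tanh(
            sum(w * s for w, s in zip(W_s[k], s_prev))
            + sum(w * xi for w, xi in zip(W_x[k], x))
            + b[k]
        )
        for k in range(len(b))
    ]


def O(s, theta):
    """Output function. Assumed form: linear readout V s."""
    return [sum(v * si for v, si in zip(row, s)) for row in theta["V"]]


def rnn(s0, xs, theta):
    """Apply R and O step by step; return the states and the outputs."""
    states, outputs = [], []
    s = s0
    for x in xs:
        s = R(s, x, theta)           # new state from previous state and input
        states.append(s)
        outputs.append(O(s, theta))  # output from the current state only
    return states, outputs


# Hypothetical parameters: state of size 2, input of size 1, output of size 1.
theta = {
    "W_s": [[0.5, 0.0], [0.0, 0.5]],
    "W_x": [[1.0], [-1.0]],
    "b": [0.0, 0.0],
    "V": [[1.0, -1.0]],
}
xs = [[1.0], [0.0], [-1.0]]  # a sequence of three input vectors
s0 = [0.0, 0.0]              # initial state
states, outputs = rnn(s0, xs, theta)
```

Both functions read from the same `theta`, which mirrors the shared parameters in the text, and the only information carried from one step to the next is the state `s`.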

Fig 2: An unfolded RNN.

To better understand the concepts, it can be useful to visualize the model unfolded over the iterations, as in Fig 2: every step applies the same functions $R$ and $O$ with the same parameters $\theta$.
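The equivalence between the compact and the unfolded view can be checked on a toy example. The sketch below assumes scalar states and inputs and a hypothetical choice of $R$ and $O$ (a tanh update and a scaled readout); the parameter names `w`, `u`, `v` are not from the text.

```python
import math


def R(s_prev, x, theta):
    # Hypothetical scalar state update, chosen only for illustration.
    return math.tanh(theta["w"] * s_prev + theta["u"] * x)


def O(s, theta):
    # Hypothetical scalar output function.
    return theta["v"] * s


theta = {"w": 0.5, "u": 1.0, "v": 2.0}
s0 = 0.0
x1, x2, x3 = 1.0, -0.5, 0.25

# Compact form: one loop, the state is fed back at each iteration.
s = s0
looped = []
for x in (x1, x2, x3):
    s = R(s, x, theta)
    looped.append(O(s, theta))

# Unfolded form: one copy of R and O per step, all sharing the same theta.
s1 = R(s0, x1, theta)
s2 = R(s1, x2, theta)
s3 = R(s2, x3, theta)
unfolded = [O(s1, theta), O(s2, theta), O(s3, theta)]
```

Both forms perform the same operations in the same order, so `looped` and `unfolded` hold identical values. The unfolded form only makes the chain of dependencies between steps explicit.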



References

[Goldberg2017] Goldberg, Yoav (2017). Neural Network Methods for Natural Language Processing. Morgan & Claypool Publishers.



Footnotes

  1. Note that different notations can be found in the literature.