Recurrent Neural Networks (RNN)

March 24, 2018

Stefano Martina

A Recurrent Neural Network (RNN) is a neural network that maintains an internal state. RNNs are well suited to processing sequential data, e.g. text.

Fig 1 shows a compact representation of an RNN. The computation is performed iteratively: at each step, an input $\vec{x}$ is presented to the network together with the state from the previous iteration. The black box represents a delay of one iteration, which feeds the previous state back into the network.

In this formalization¹, $R$ and $O$ are two functions that share the parameters $\theta$. These functions drive the network: $R$ computes the internal state and $O$ computes the output.

In detail, $\vec{x}$, $\vec{y}$ and $\vec{s}$ are sequences of vectors. We use the notation $\vec{x}_i$ to denote the $i$-th vector of the sequence, and $\vec{x}_{i:j}$ to denote the subsequence from $i$ to $j$. If the sequence consists of $n$ vectors, we can write $\vec{x} = \vec{x}_{1:n}$.

The RNN model is expressed as:

$\begin{eqnarray*} \vec{y}_{1:n} &=& RNN^*(\vec{x}_{1:n};\vec{s}_0,\theta)\\ \vec{y}_i &=& RNN(\vec{x}_{1:i};\vec{s}_0,\theta)\\ \vec{y}_i &=& O(\vec{s}_i;\theta) \\ \vec{s}_i &=& R(\vec{s}_{i-1},\vec{x}_i;\theta) \end{eqnarray*}$

where $\vec{s}_0$ is the initial state and $\theta$ are the trainable parameters. $R$ is a recursive function that updates the internal state from the previous state $\vec{s}_{i-1}$ and the current input $\vec{x}_i$, and $O$ is the output function that computes the output $\vec{y}_i$ from the current state $\vec{s}_i$.
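The recursion above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the post does not fix a concrete form for $R$ and $O$, so the `tanh` update, the linear output, the helper `matvec`, and the weight values `W_s`, `W_x` and `W_o` (standing in for $\theta$) are all hypothetical choices. Only the loop in `rnn_star` follows the equations directly.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def rnn_star(
    xs: Sequence[Vector],
    s0: Vector,
    R: Callable[[Vector, Vector], Vector],
    O: Callable[[Vector], Vector],
) -> Tuple[List[Vector], List[Vector]]:
    """Apply s_i = R(s_{i-1}, x_i) and y_i = O(s_i) for i = 1..n."""
    states, outputs = [], []
    s = s0
    for x in xs:
        s = R(s, x)           # update the internal state with the current input
        states.append(s)
        outputs.append(O(s))  # the output depends only on the current state
    return outputs, states


def matvec(W: List[Vector], v: Vector) -> Vector:
    """Matrix-vector product on nested lists (hypothetical helper)."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


# Hypothetical parameters theta = (W_s, W_x, W_o); the values are arbitrary.
W_s = [[0.5, -0.1], [0.2, 0.3]]            # state -> state (2x2)
W_x = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.5]]  # input -> state (2x3)
W_o = [[1.0, -1.0]]                        # state -> output (1x2)


def R(s: Vector, x: Vector) -> Vector:
    # Assumed update rule, not taken from the post: tanh(W_s s + W_x x)
    return [math.tanh(a + b) for a, b in zip(matvec(W_s, s), matvec(W_x, x))]


def O(s: Vector) -> Vector:
    # Assumed linear output: W_o s
    return matvec(W_o, s)


xs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # x_{1:3}
s0 = [0.0, 0.0]                                           # initial state
ys, states = rnn_star(xs, s0, R, O)
```

Here `ys` plays the role of $\vec{y}_{1:n}$ and `states` of $\vec{s}_{1:n}$. Running `rnn_star` on the prefix `xs[:2]` yields the first two outputs unchanged, which mirrors the fact that $\vec{y}_i$ depends only on $\vec{x}_{1:i}$.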

It can be useful to visualize the model unfolded, as in Fig 2, to better understand these concepts.
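As a small worked example of the unfolding, take a sequence of $n = 3$ vectors. Substituting the recursive definition of $\vec{s}_i$ step by step gives:

$\begin{eqnarray*} \vec{s}_1 &=& R(\vec{s}_0,\vec{x}_1;\theta)\\ \vec{s}_2 &=& R(\vec{s}_1,\vec{x}_2;\theta) = R(R(\vec{s}_0,\vec{x}_1;\theta),\vec{x}_2;\theta)\\ \vec{s}_3 &=& R(\vec{s}_2,\vec{x}_3;\theta)\\ \vec{y}_3 &=& O(\vec{s}_3;\theta) = O(R(R(R(\vec{s}_0,\vec{x}_1;\theta),\vec{x}_2;\theta),\vec{x}_3;\theta);\theta) \end{eqnarray*}$

The same parameters $\theta$ appear at every step, and $\vec{y}_3$ depends on the whole prefix $\vec{x}_{1:3}$ through the chain of states.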

References

[Goldberg2017] Goldberg, Yoav (2017). Neural Network Methods for Natural Language Processing. Morgan & Claypool Publishers.

Footnotes

¹ Note that different notations can be found in the literature.

Trianam's notes a blog about machine learning and artificial intelligence
