Recurrent Neural Networks (RNN)
March 24, 2018
A Recurrent Neural Network (RNN) is a neural network that exhibits an internal state. RNNs are suitable for processing sequential data, e.g. text.
Fig. 1 shows a compact representation of an RNN. Processing is performed iteratively: at each step an input $\vec{x}$ is presented to the network together with the state from the previous iteration. The previous state is represented by the black box, which denotes a delay of one iteration.
In this formalization¹ $R$ and $O$ represent two functions with shared parameters $\theta$. These functions control the network: $R$ updates the internal state and $O$ computes the output.
In detail, $\vec{x}$, $\vec{y}$ and $\vec{s}$ are sequences of vectors. We use the notation $\vec{x}_i$ to denote the $i$-th vector of the sequence, and $\vec{x}_{i:j}$ to denote the subsequence from $i$ to $j$. If the sequence is composed of $n$ vectors, in this notation we can write: $\vec{x}_{1:n} = \vec{x}_1, \dots, \vec{x}_n$.
The RNN model is expressed as:

$$\vec{s}_i = R(\vec{s}_{i-1}, \vec{x}_i; \theta)$$
$$\vec{y}_i = O(\vec{s}_i; \theta)$$

where $\vec{s}_0$ is the initialization state and $\theta$ are the parameters learned during training. $R$ is a recursive function that updates the internal state based on the previous state and the current input $\vec{x}_i$, and $O$ is the output function that computes the output based on the current state.
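As a concrete illustration, a single RNN step can be sketched in NumPy. The text leaves $R$ and $O$ abstract, so the specific choices below (a tanh cell for $R$, a linear read-out for $O$, and the weight names `W_x`, `W_s`, `W_y`) are assumptions, not part of the formalization:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_state, d_out = 3, 4, 2

# Shared parameters theta (these particular matrices are illustrative choices)
W_x = rng.normal(size=(d_state, d_in))
W_s = rng.normal(size=(d_state, d_state))
W_y = rng.normal(size=(d_out, d_state))

def R(s_prev, x):
    """State update s_i = R(s_{i-1}, x_i; theta); here a simple tanh cell."""
    return np.tanh(W_s @ s_prev + W_x @ x)

def O(s):
    """Output function y_i = O(s_i; theta); here a linear read-out."""
    return W_y @ s

s0 = np.zeros(d_state)        # initialization state s_0
x1 = rng.normal(size=d_in)    # first input vector x_1
s1 = R(s0, x1)                # new internal state s_1
y1 = O(s1)                    # output y_1
print(s1.shape, y1.shape)
```

Note that both functions read the same parameter set, matching the shared-$\theta$ formulation above.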
It can be useful to visualize the model unfolded, as in Fig. 2, in order to better understand these concepts.
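Unfolding the recurrence over a sequence $\vec{x}_{1:n}$ is simply a loop that threads the state through each step. A minimal sketch, again assuming a tanh state update (the text does not fix a concrete $R$):

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_state = 3, 4

# Illustrative parameters theta, shared across all time steps
W_x = rng.normal(size=(d_state, d_in))
W_s = rng.normal(size=(d_state, d_state))

def R(s_prev, x):
    """Assumed state update; the same function is applied at every step."""
    return np.tanh(W_s @ s_prev + W_x @ x)

def unfold(x_seq, s0):
    """Apply the recurrence to x_{1:n}, returning the states s_1 .. s_n."""
    states = []
    s = s0
    for x in x_seq:
        s = R(s, x)
        states.append(s)
    return states

x_seq = rng.normal(size=(5, d_in))          # a sequence of n = 5 input vectors
states = unfold(x_seq, np.zeros(d_state))   # one state per input vector
print(len(states))
```

The unfolded view makes explicit that the same parameters are reused at every step, which is what distinguishes an RNN from a deep feed-forward network of the same depth.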
Footnotes
1. Note that in the literature you can find different notations. ↩︎