Encoder-Decoder Architecture
The Encoder-Decoder architecture was designed to tackle a specific task known as sequence-to-sequence learning, which addresses a limitation of traditional RNNs. In general, RNNs work well when the dimensionality of both inputs and outputs is known and fixed. This is a tangible limitation, because many real-world problems are best expressed as sequences whose lengths are not known in advance. For example, machine translation and question-answering models map an input sentence to an output of a different length that is not known upfront. As its name indicates, sequence-to-sequence learning attempts to map an input sequence to an output sequence of a potentially different size.
Encoder-Decoder architectures are the widely accepted solution to sequence-to-sequence learning. To fully understand the model’s underlying logic, we will go over the illustration below:
As shown above, an encoder-decoder model consists of three fundamental components (a minimal code sketch follows this list):
Encoder: An RNN that reads the input sequence and compresses it into a fixed-length hidden vector.
Encoder Vector (Hidden Vector): The final hidden state of the encoder, which serves as a context vector, a semantic summary of the input sentence that is passed to the decoder.
Decoder: An RNN that takes the hidden vector as its initial state and generates the output sequence one element at a time.
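To make these three components concrete, here is a minimal sketch in PyTorch, assuming a GRU-based encoder and decoder. The module names, vocabulary sizes, and hyperparameters are illustrative assumptions, not part of the original text.

```python
# Minimal encoder-decoder sketch (illustrative; names and sizes are assumptions).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, input_vocab_size, embed_dim, hidden_dim):
        super().__init__()
        self.embedding = nn.Embedding(input_vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) token ids
        embedded = self.embedding(src)
        _, hidden = self.rnn(embedded)   # hidden: (1, batch, hidden_dim)
        return hidden                    # fixed-length encoder (context) vector

class Decoder(nn.Module):
    def __init__(self, output_vocab_size, embed_dim, hidden_dim):
        super().__init__()
        self.embedding = nn.Embedding(output_vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, output_vocab_size)

    def forward(self, token, hidden):
        # token: (batch, 1) previously generated token id; hidden: current state
        embedded = self.embedding(token)
        output, hidden = self.rnn(embedded, hidden)
        return self.out(output), hidden  # logits over the output vocabulary

# Usage: encode the whole source sequence once, then decode step by step,
# starting from an assumed <sos> token until <eos> or a length limit is reached.
encoder = Encoder(input_vocab_size=1000, embed_dim=64, hidden_dim=128)
decoder = Decoder(output_vocab_size=1200, embed_dim=64, hidden_dim=128)

src = torch.randint(0, 1000, (2, 7))          # batch of 2 source sentences, length 7
context = encoder(src)                        # the shared hidden vector
token = torch.zeros(2, 1, dtype=torch.long)   # assumed <sos> id = 0
logits, hidden = decoder(token, context)      # first decoding step
```

Note how the decoder receives the encoder's final hidden state as its initial state, which is exactly the role of the encoder vector described above: the output length is decided during decoding, not fixed in advance.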