Bidirectional RNN
Standard RNNs process sequences only forward. For some tasks (e.g., part‑of‑speech tagging, named entity recognition), future context is also important. Bidirectional RNNs (BiRNNs) process the sequence both forward and backward, then combine the hidden states.
How It Works
Two separate RNNs (or LSTMs) run in opposite directions:
- Forward RNN reads the sequence from t = 1 to T.
- Backward RNN reads the sequence from t = T to 1.
- Output at each step is the concatenation (or sum) of forward and backward hidden states.
h_t_forward = RNN_forward(x_t, h_{t-1}_forward)
h_t_backward = RNN_backward(x_t, h_{t+1}_backward)
h_t = [h_t_forward ; h_t_backward]

When to Use BiRNNs
- When the entire sequence is available (not real‑time streaming).
- Tasks where context from both sides is useful: text classification, named entity recognition, speech recognition, machine translation (encoder part).
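The three update rules above can be sketched with a minimal NumPy RNN. This is a toy illustration, not the PyTorch implementation: the weight names (Wx, Wh, b) and the tanh cell are illustrative assumptions, and the two cells share nothing except the input sequence.

```python
import numpy as np

rng = np.random.default_rng(0)
T, input_size, hidden_size = 5, 3, 4

# Hypothetical toy parameters for one vanilla RNN cell (names are illustrative).
def make_cell():
    return (rng.standard_normal((hidden_size, input_size)) * 0.1,
            rng.standard_normal((hidden_size, hidden_size)) * 0.1,
            np.zeros(hidden_size))

fwd_cell = make_cell()  # parameters for RNN_forward
bwd_cell = make_cell()  # parameters for RNN_backward

x = rng.standard_normal((T, input_size))

def run(cell, steps):
    """Run one direction; `steps` is 0..T-1 (forward) or T-1..0 (backward)."""
    Wx, Wh, b = cell
    h = np.zeros(hidden_size)
    hs = {}
    for t in steps:
        h = np.tanh(Wx @ x[t] + Wh @ h + b)  # h_t = f(x_t, previous h)
        hs[t] = h
    return hs

h_fwd = run(fwd_cell, range(T))            # h_t_forward, t = 1..T
h_bwd = run(bwd_cell, reversed(range(T)))  # h_t_backward, t = T..1

# h_t = [h_t_forward ; h_t_backward]: concatenate per time step.
h = np.stack([np.concatenate([h_fwd[t], h_bwd[t]]) for t in range(T)])
print(h.shape)  # (5, 8): seq_len x (2 * hidden_size)
```

Note that h_bwd[0] has already seen the whole sequence (it was computed last), which is exactly the "future context" a forward-only RNN lacks at t = 0.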
Example in PyTorch
self.lstm = nn.LSTM(input_size, hidden_size, bidirectional=True)
output, (h_n, c_n) = self.lstm(x)

Output shape: (seq_len, batch, hidden_size * 2), since the forward and backward hidden states are concatenated at each time step.

Caution
Bidirectional RNNs cannot be used for online (real‑time) prediction because they require future inputs. Use standard RNNs for streaming applications.
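A self-contained version of the snippet above, showing the doubled output dimension and how to recover the per-direction final states (the sizes here are arbitrary placeholders):

```python
import torch
import torch.nn as nn

seq_len, batch, input_size, hidden_size = 7, 2, 10, 16

# By default nn.LSTM expects input of shape (seq_len, batch, input_size).
lstm = nn.LSTM(input_size, hidden_size, bidirectional=True)
x = torch.randn(seq_len, batch, input_size)

output, (h_n, c_n) = lstm(x)

# Per-step outputs: forward and backward states concatenated.
print(output.shape)  # torch.Size([7, 2, 32]) -- hidden_size * 2

# Final states: one slot per (layer, direction) pair.
print(h_n.shape)     # torch.Size([2, 2, 16])

# For this single-layer model, index 0 is the forward direction, 1 the backward.
h_forward, h_backward = h_n[0], h_n[1]  # each (batch, hidden_size)
```

With batch_first=True, the input and output instead use (batch, seq_len, features); the final-state tensors keep the (num_layers * 2, batch, hidden_size) layout either way.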
Two Minute Drill
- BiRNNs process forward and backward.
- Output combines both directions, capturing full context.
- Use when entire sequence is known (offline).
- Concatenation doubles the output feature size (hidden_size * 2).
Need more clarification?
Drop us an email at career@quipoinfotech.com
