Transformer Neural Network: Interactive Visualizer

Interactive Visualization • Live Training • GPT vs T5

Loss: 0.000 • Epoch: 0 • Mode: GPT • Heads: 4
Diagram stages: Input • Encoder • Decoder • Output
Click on a token to inspect multi-head attention and gradients.
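
As a rough illustration of what inspecting a token's multi-head attention might show, here is a minimal sketch in plain Python. It is not the visualizer's own code. The helper names (`attention_weights`, `inspect_token`), the example tokens, and the head dimension are hypothetical, and random vectors stand in for the learned query/key projections. With `causal=True` the clicked token can only attend to itself and earlier positions, as in GPT-style decoding; with `causal=False` it attends to every position, as in a T5-style encoder.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys, token_index, causal):
    """Attention weights from one token to all tokens, for a single head.

    queries, keys: one vector (list of floats) per token for this head.
    causal=True masks positions after token_index (GPT-style).
    """
    q = queries[token_index]
    d = len(q)
    scores = []
    for j, k in enumerate(keys):
        if causal and j > token_index:
            scores.append(float("-inf"))  # masked: weight becomes exactly 0
        else:
            dot = sum(a * b for a, b in zip(q, k))
            scores.append(dot / math.sqrt(d))  # scaled dot product
    return softmax(scores)

def inspect_token(num_tokens, num_heads, head_dim, token_index, causal, seed=0):
    """Return one row of attention weights per head for the chosen token.

    Assumption: random vectors replace the learned Q/K projections.
    """
    rng = random.Random(seed)
    per_head = []
    for _ in range(num_heads):
        queries = [[rng.gauss(0, 1) for _ in range(head_dim)]
                   for _ in range(num_tokens)]
        keys = [[rng.gauss(0, 1) for _ in range(head_dim)]
                for _ in range(num_tokens)]
        per_head.append(attention_weights(queries, keys, token_index, causal))
    return per_head

tokens = ["the", "cat", "sat", "down"]  # hypothetical input
gpt_view = inspect_token(len(tokens), num_heads=4, head_dim=8,
                         token_index=1, causal=True)
enc_view = inspect_token(len(tokens), num_heads=4, head_dim=8,
                         token_index=1, causal=False)

for h, row in enumerate(gpt_view):
    print(f"head {h}:", [round(w, 3) for w in row])
```

Each head yields its own distribution over the tokens, which is why a per-head view is useful: different heads can concentrate on different positions for the same clicked token.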