Attention Visualizer
Self-attention mechanism explorer
Input sequence:
Tokenize
Load Example
Tokens (click a token to select the attention source)
Attention Matrix (Head 1)
Attention Heads
Attention Flow
Input → Q = XWq
Input → K = XWk
Input → V = XWv
A = softmax(QK^T / √d)
Output = AV
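A minimal NumPy sketch of this flow, assuming a toy embedding matrix X and random Wq/Wk/Wv standing in for learned projection weights (all sizes here are illustrative, not the tool's actual defaults):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 5, 16, 8          # illustrative toy sizes

X = rng.normal(size=(seq_len, d_model))      # token embeddings (one row per token)
Wq = rng.normal(size=(d_model, d_head))      # stand-ins for learned weights
Wk = rng.normal(size=(d_model, d_head))
Wv = rng.normal(size=(d_model, d_head))

Q = X @ Wq                                   # Input → Q = XWq
K = X @ Wk                                   # Input → K = XWk
V = X @ Wv                                   # Input → V = XWv

A = softmax(Q @ K.T / np.sqrt(d_head))      # A = softmax(QK^T / √d)
output = A @ V                               # Output = AV

print(A.shape)       # (5, 5): the attention matrix the visualizer draws
print(output.shape)  # (5, 8)
```

Each row of A sums to 1 and gives the weights with which that token mixes the value vectors of every token in the sequence.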
Settings
Number of heads: 4 / 8 / 12
Head dimension: 32 / 64 / 128
Temperature: 1.0
Causal masking: off (0) / on (1)
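A sketch of how these four settings could enter the computation, under the common convention (an assumption here, not confirmed by the tool) that Wq/Wk/Wv pack all heads side by side. Temperature divides the score logits, and causal masking sets future positions to -inf before the softmax:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multihead_attention(X, Wq, Wk, Wv, n_heads=4, temperature=1.0, causal=False):
    # Wq/Wk/Wv are assumed to have shape (d_model, n_heads * d_head).
    seq_len = X.shape[0]
    split = lambda W: (X @ W).reshape(seq_len, n_heads, -1).transpose(1, 0, 2)
    Q, K, V = split(Wq), split(Wk), split(Wv)   # each: (n_heads, seq_len, d_head)
    d_head = Q.shape[-1]
    # Temperature rescales the logits: >1 flattens the weights, <1 sharpens them.
    scores = Q @ K.transpose(0, 2, 1) / (np.sqrt(d_head) * temperature)
    if causal:
        # Hide future positions: strict upper triangle becomes -inf.
        future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    A = softmax(scores)                          # one seq_len × seq_len matrix per head
    out = (A @ V).transpose(1, 0, 2).reshape(seq_len, -1)
    return out, A
```

With n_heads=4, A has shape (4, seq_len, seq_len): one attention matrix per head, matching the per-head panels above.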
Q/K/V Projections
Query (selected token)
Keys (all tokens)
Values (all tokens)
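What these three panels would display, continuing the single-head sketch above with a hypothetical clicked-token index `selected` (a name introduced here for illustration):

```python
selected = 2                 # hypothetical index of the token clicked in the UI
query_panel = Q[selected]    # Query (selected token): a single d_head vector
keys_panel = K               # Keys (all tokens): seq_len × d_head
values_panel = V             # Values (all tokens): seq_len × d_head
# Row `selected` of A weighs the values: output[selected] == A[selected] @ V
```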