A Brief Look at the Transformer Architecture
Tags: deep learning, transformer, attention
imoldpan · July 20, 2023, 01:54
This is a simplified tutorial; for the detailed version, see this post.
self-attention
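The core operation throughout is scaled dot-product attention. In the standard formulation from "Attention Is All You Need", with queries $Q$, keys $K$, values $V$, and key dimension $d_k$:

$$
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
$$

The softmax turns each row of query-key similarity scores into weights that sum to 1, so each output vector is a weighted average of the value vectors. In self-attention, $Q$, $K$, and $V$ are all projections of the same input sequence.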
A single attention head
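A minimal sketch of one attention head operating on a sequence `x`, assuming learned projection matrices `w_q`, `w_k`, `w_v` (illustrative code added here, not from the original post; names and shapes are assumptions):

```python
# Single-head self-attention sketch in PyTorch (illustrative addition).
import math
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """x: (batch, seq_len, d_model); w_*: (d_model, d_k) projections."""
    q = x @ w_q                    # queries (batch, seq_len, d_k)
    k = x @ w_k                    # keys    (batch, seq_len, d_k)
    v = x @ w_v                    # values  (batch, seq_len, d_k)
    d_k = q.size(-1)
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, seq, seq)
    weights = F.softmax(scores, dim=-1)                # rows sum to 1
    return weights @ v             # weighted sum of the value vectors

x = torch.randn(2, 5, 64)          # batch of 2, sequence length 5
w = [torch.randn(64, 64) for _ in range(3)]
out = self_attention(x, *w)        # (2, 5, 64)
```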
MHA (multi-head attention)
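A sketch of how MHA is commonly implemented: instead of one attention over the full `d_model`, the projections are split into `n_heads` independent heads, each attending in a `d_model / n_heads`-dimensional subspace, and the head outputs are concatenated and projected back. This is illustrative code under assumed names and sizes, not the original post's implementation:

```python
# Multi-head attention sketch (illustrative addition, assumed names).
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model=64, n_heads=8):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.w_qkv = nn.Linear(d_model, 3 * d_model)  # fused Q/K/V projection
        self.w_out = nn.Linear(d_model, d_model)      # final output projection

    def forward(self, x):
        b, t, d = x.shape
        q, k, v = self.w_qkv(x).chunk(3, dim=-1)
        # (b, t, d) -> (b, n_heads, t, d_head): each head attends on its own.
        split = lambda z: z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        q, k, v = split(q), split(k), split(v)
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        out = F.softmax(scores, dim=-1) @ v           # (b, n_heads, t, d_head)
        out = out.transpose(1, 2).reshape(b, t, d)    # concatenate the heads
        return self.w_out(out)

mha = MultiHeadAttention()
y = mha(torch.randn(2, 5, 64))                        # (2, 5, 64)
```

The fused Q/K/V projection is a common efficiency choice; three separate `nn.Linear` layers would be equivalent.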
decoder
[Figure: Decoder stack output]
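A defining detail of the decoder is that its self-attention is causally masked: position i may only attend to positions ≤ i, so generation cannot peek at future tokens. A minimal sketch of that mask (an illustrative addition; variable names are assumptions, and `scores` plays the same role as in the single-head example above):

```python
# Causal (look-ahead) mask used in decoder self-attention (illustrative).
import torch
import torch.nn.functional as F

t = 5
scores = torch.randn(1, t, t)                 # raw attention logits
# True above the diagonal marks the "future" positions to block.
mask = torch.triu(torch.ones(t, t, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(mask, float("-inf"))
weights = F.softmax(scores, dim=-1)           # lower-triangular weights
```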
Cross-Attention
Cross-attention is used in Stable Diffusion to condition image generation on a text prompt: the queries come from the U-Net's image features, while the keys and values come from the text encoder's output.
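A sketch of cross-attention under the same assumptions as the single-head example, except that the queries are computed from one sequence and the keys/values from another (the example shapes, such as 77 text tokens, are illustrative assumptions):

```python
# Cross-attention sketch: Q from sequence A, K/V from sequence B
# (illustrative addition; names and shapes are assumptions).
import math
import torch
import torch.nn.functional as F

def cross_attention(x_q, x_kv, w_q, w_k, w_v):
    q = x_q @ w_q                  # (batch, len_q, d_k) from sequence A
    k = x_kv @ w_k                 # (batch, len_kv, d_k) from sequence B
    v = x_kv @ w_v                 # (batch, len_kv, d_k) from sequence B
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    return F.softmax(scores, dim=-1) @ v      # (batch, len_q, d_k)

img = torch.randn(1, 64, 320)      # e.g. flattened image patches
txt = torch.randn(1, 77, 320)      # e.g. 77 text-token embeddings
w = [torch.randn(320, 320) for _ in range(3)]
out = cross_attention(img, txt, *w)  # (1, 64, 320)
```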
References
- https://sebastianraschka.com/blog/2023/self-attention-from-scratch.html
- https://towardsdatascience.com/illustrated-self-attention-2d627e33b20a
- The Illustrated Transformer – Jay Alammar – Visualizing machine learning one concept at a time.
- This post is all you need(上卷)——层层剥开Transformer - 知乎