Lao Pan's AI Community
SparK: the first successful BERT/MAE-style pretraining on any convolutional networks
AI Large Models
large models, pretraining
imoldpan
July 6, 2023, 03:20
Different masking strategies
Sparse masked modeling with hierarchy
Using sparse convolution to address the “mask pattern vanishing” issue.
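The "mask pattern vanishing" effect can be illustrated with a small NumPy sketch. This is a toy illustration only, not code from the SparK repository: the single-channel 8×8 feature map, the 2×2 patch mask, the averaging kernel, and the helper names `conv3x3_same` and `run_layers` are all assumptions made for the example. It stacks a few dense 3×3 convolutions on a masked input and compares that against a variant that re-applies the mask after every layer, which is one simple way to emulate a convolution that only computes on unmasked positions.

```python
import numpy as np


def conv3x3_same(x, kernel):
    """Dense 3x3 convolution with zero padding on a single-channel map."""
    h, w = x.shape
    padded = np.pad(x, 1)  # zero padding keeps the output the same size
    out = np.zeros_like(x, dtype=float)
    for di in range(3):
        for dj in range(3):
            out += kernel[di, dj] * padded[di:di + h, dj:dj + w]
    return out


def run_layers(x, mask, kernel, num_layers, reapply_mask):
    """Stack `num_layers` convolutions on a masked input.

    With reapply_mask=True the mask is multiplied back in after every layer,
    so masked positions stay exactly zero (an emulation of sparse convolution).
    """
    x = x * mask
    for _ in range(num_layers):
        x = conv3x3_same(x, kernel)
        if reapply_mask:
            x = x * mask
    return x


# Hypothetical patch-wise mask: 4x4 grid of 2x2 patches, 3 of 16 patches kept.
patch_keep = np.array([[1, 0, 0, 0],
                       [0, 0, 1, 0],
                       [0, 0, 0, 0],
                       [0, 1, 0, 0]], dtype=float)
mask = np.kron(patch_keep, np.ones((2, 2)))  # expand to an 8x8 pixel mask

rng = np.random.default_rng(0)
feature = rng.random((8, 8)) + 0.5        # strictly positive toy features
kernel = np.full((3, 3), 1.0 / 9.0)       # averaging kernel, for illustration

dense_out = run_layers(feature, mask, kernel, num_layers=3, reapply_mask=False)
sparse_out = run_layers(feature, mask, kernel, num_layers=3, reapply_mask=True)

masked = mask == 0
print("nonzero values at masked positions (dense):    ",
      np.count_nonzero(dense_out[masked]))
print("nonzero values at masked positions (re-masked):",
      np.count_nonzero(sparse_out[masked]))
```

In the dense variant, values from the kept patches leak into the masked region a little further with every layer, so the mask pattern gradually disappears from the feature map. In the re-masked variant the masked positions remain zero at every depth, so the pattern is preserved.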
Reference
https://github.com/keyu-tian/spark