老潘的AI社区 (Lao Pan's AI Community)
SparK: the first successful BERT/MAE-style pretraining on any convolutional network
AI Large Models
large models, pretraining
imoldpan
July 6, 2023, 03:20
Different masking strategies
Sparse masked modeling with hierarchy
Using sparse convolution to address “mask pattern vanishing” issue.
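
To make the "mask pattern vanishing" issue concrete, here is a minimal pure-Python sketch. It is not the SparK implementation: the all-ones 3×3 kernel, the 6×6 toy feature map, and the helper names (`dense_conv3x3`, `apply_mask`, `masked_ratio`) are all assumptions made for illustration. A dense convolution lets visible values leak into masked positions, so the masked region shrinks with every layer; re-applying the mask after each layer emulates the effect of a sparse convolution that only produces outputs at visible positions.

```python
def dense_conv3x3(x):
    """3x3 all-ones convolution with zero padding (stand-in for a learned kernel)."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        total += x[ii][jj]
            out[i][j] = total
    return out


def apply_mask(x, mask):
    """Zero out every position where mask is 0 (i.e. masked)."""
    return [[v if m else 0.0 for v, m in zip(x_row, m_row)]
            for x_row, m_row in zip(x, mask)]


def masked_ratio(x):
    """Fraction of positions whose value is exactly zero."""
    flat = [v for row in x for v in row]
    return sum(1 for v in flat if v == 0.0) / len(flat)


# Toy 6x6 feature map of ones; only the top-left 2x2 patch stays visible.
size = 6
mask = [[1 if (i < 2 and j < 2) else 0 for j in range(size)] for i in range(size)]
x = apply_mask([[1.0] * size for _ in range(size)], mask)

dense = x
sparse_like = x
for _ in range(2):
    # Dense conv: visible values spread into neighbouring masked positions.
    dense = dense_conv3x3(dense)
    # Sparse-style conv (emulated): outputs kept only at visible positions.
    sparse_like = apply_mask(dense_conv3x3(sparse_like), mask)

dense_ratio = masked_ratio(dense)
sparse_ratio = masked_ratio(sparse_like)
print(f"input masked ratio:        {masked_ratio(x):.3f}")
print(f"after 2 dense convs:       {dense_ratio:.3f}")
print(f"after 2 sparse-style convs: {sparse_ratio:.3f}")
```

In this toy setting the dense path loses part of the mask after just two layers, while the sparse-style path keeps the original mask pattern unchanged, which is the property the figure caption refers to.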
References
GitHub - keyu-tian/SparK: [ICLR'23 Spotlight] The first successful BERT/MAE-style pretraining on any convolutional network; Pytorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"