老潘的AI社区

SparK: the first successful BERT/MAE-style pretraining on any convolutional network

Large models, pretraining

imoldpan, July 6, 2023, 03:20

Different masking strategies


Sparse masked modeling with hierarchy


Using sparse convolution to address the "mask pattern vanishing" issue.

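The "mask pattern vanishing" issue named in the caption above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the code from the SparK repository: it uses a 1D signal in pure Python, and it emulates sparse convolution by re-applying the mask after each dense convolution. The helper names (`conv1d_same`, `apply_mask`, `active_ratio`, `run_layers`) are hypothetical and exist only for this illustration.

```python
def conv1d_same(x, kernel):
    """Dense 1D convolution with zero padding, output length equals input length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def apply_mask(x, mask):
    """Zero out every position where mask is False (the masked positions)."""
    return [v if keep else 0.0 for v, keep in zip(x, mask)]


def active_ratio(x):
    """Fraction of positions that hold a non-zero value."""
    return sum(1 for v in x if v != 0.0) / len(x)


def run_layers(x, mask, kernel, num_layers, sparse):
    """Stack num_layers convolutions on a masked input.

    sparse=False: plain dense convolution, values leak into masked positions.
    sparse=True:  the mask is re-applied after every layer, an assumed
                  emulation of sparse convolution for this toy example.
    """
    x = apply_mask(x, mask)
    for _ in range(num_layers):
        x = conv1d_same(x, kernel)
        if sparse:
            x = apply_mask(x, mask)
    return x


signal = [1.0] * 16
mask = [i % 4 == 0 for i in range(16)]  # keep 25% of positions visible
kernel = [1.0, 1.0, 1.0]

dense_out = run_layers(signal, mask, kernel, num_layers=3, sparse=False)
sparse_out = run_layers(signal, mask, kernel, num_layers=3, sparse=True)

print("visible ratio at input:      ", active_ratio(apply_mask(signal, mask)))
print("active ratio, dense conv:    ", active_ratio(dense_out))
print("active ratio, masked conv:   ", active_ratio(sparse_out))
```

With dense convolution, every layer spreads non-zero values into neighbouring masked positions, so after a few layers the network can no longer tell which positions were masked. Re-applying the mask after each layer keeps the active set fixed at the originally visible positions.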

References

https://github.com/keyu-tian/spark
