Sparse Distillation: Speeding Up Text Classification by Using Bigger Models

Publication
arXiv preprint arXiv:2110.08536