A Fistful of Words: Learning Transferable Visual Models from Bag-of-Words Supervision

Publication
arXiv preprint arXiv:2112.13884