1. Flamingo: a visual language model for few-shot learning;Alayrac;Advances in Neural Information Processing Systems,2022
2. data2vec: A general framework for self-supervised learning in speech, vision and language;Baevski,2022
3. Multimodal machine learning: A survey and taxonomy;Baltrušaitis;IEEE Transactions on Pattern Analysis and Machine Intelligence,2018
4. Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition;Bridle,1990
5. Language models are few-shot learners;Brown,2020