1. Alemi, A.A., Fischer, I., Dillon, J.V., Murphy, K.: Deep variational information bottleneck. arXiv: Learning (2016)
2. Asano, Y.M., Rupprecht, C., Vedaldi, A.: Self-labelling via simultaneous clustering and representation learning. arXiv preprint. arXiv:1911.05371 (2019)
3. Baevski, A., Hsu, W.N., Xu, Q., Babu, A., Gu, J., Auli, M.: Data2vec: a general framework for self-supervised learning in speech, vision and language. arXiv preprint. arXiv:2202.03555 (2022)
4. Bao, H., Dong, L., Wei, F.: Beit: bert pre-training of image transformers. arXiv: Computer Vision and Pattern Recognition (2021)
5. Lecture Notes in Computer Science;M Caron,2018