1. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
2. Dong, X., Yang, Y.: NAS-Bench-201: extending the scope of reproducible neural architecture search. arXiv preprint arXiv:2001.00326 (2020)
3. Dudziak, L., Chau, T., Abdelfattah, M., Lee, R., Kim, H., Lane, N.: BRP-NAS: prediction-based NAS using GCNs. Adv. Neural. Inf. Process. Syst. 33, 10480–10490 (2020)
4. He, K., Chen, X., Xie, S., Li, Y., Dollár, P., Girshick, R.: Masked autoencoders are scalable vision learners. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16000–16009 (2022)
5. Hou, Z., et al.: GraphMAE2: a decoding-enhanced masked self-supervised graph learner. In: Proceedings of the ACM Web Conference 2023, pp. 737–746 (2023)