1. Bochkovskiy, A., et al. (2020). YOLOv4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934.
2. Chen, X., et al. (2015). Microsoft COCO captions: Data collection and evaluation server. arXiv preprint arXiv:1504.00325.
3. Chen, J., Kao, S.-h., He, H., Zhuo, W., Wen, S., Lee, C.-H., et al. (2023). Run, Don’t Walk: Chasing Higher FLOPS for Faster Neural Networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 12021–12031).
4. Cubuk, E. D., Zoph, B., Shlens, J., & Le, Q. V. (2020). Randaugment: Practical automated data augmentation with a reduced search space. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (pp. 702–703).
5. Dosovitskiy, A., et al. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.