Affiliation:
1. School of Automation, Beijing Information Science & Technology University, Beijing 100192, China
Abstract
Multi-scale object detection is critically important in complex driving environments within the field of autonomous driving. To enhance the detection accuracy of both small-scale and large-scale targets in complex autonomous driving environments, this paper proposes an improved YOLOv5-AFAM algorithm. Firstly, the Adaptive Fusion Attention Module (AFAM) and Down-sampling Module (DownC) are introduced to increase the detection precision of small targets. Secondly, the Efficient Multi-scale Attention Module (EMA) is incorporated, enabling the model to simultaneously recognize small-scale and large-scale targets. Finally, a Minimum Point Distance IoU-based Loss Function (MPDIou-LOSS) is introduced to improve the accuracy and efficiency of object detection. Experimental validation on the KITTI dataset shows that, compared to the baseline model, the improved algorithm increased precision by 2.4%, recall by 2.6%, mAP50 by 1.5%, and mAP50-90 by an impressive 4.8%.
Funder
National Natural Science Foundation of China
Reference31 articles.
1. Viola, P., and Jones, M. (2001, January 8–14). Rapid object detection using a boosted cascade of simple features. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), Kauai, HI, USA.
2. Robust real-time face detection;Viola;Int. J. Comput. Vis.,2004
3. Felzenszwalb, P., McAllester, D., and Ramanan, D. (2008, January 23–28). A discriminatively trained, multiscale, deformable part model. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), Anchorage, AK, USA.
4. Felzenszwalb, P.F., Girshick, R.B., and McAllester, D. (2010, January 13–18). Cascade object detection with deformable part models. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2010), San Francisco, CA, USA.
5. Object detection with discriminatively trained part-based models;Felzenszwalb;IEEE Trans. Pattern Anal. Mach. Intell.,2010