CCTSDB dataset enhancement based on a cross-augmentation method for image datasets-Reference-Cited by-同舟云学术

CCTSDB dataset enhancement based on a cross-augmentation method for image datasets

Published:2024-01-18 Issue: Volume: Page:1-19
ISSN:1088-467X
Container-title:Intelligent Data Analysis
language:
Short-container-title:IDA

Author:

Lin Xinrui¹²,Wang Wei³,Zhu Xiaohui¹,Yue Yong¹

Affiliation:

1. Xi’an Jiaotong-liverpool University, Suzhou, Jiangsu, China

2. The University of Edinburgh, Edinburgh, UK

3. Hebei Normal University, Shijiazhuang, Hebei, China

Abstract

In the digital era, the rapid advancement of artificial intelligence has put a spotlight on target detection, especially in traffic settings. This area of study is pivotal for crucial projects like autonomous vehicles, road monitoring, and traffic sign recognition. However, existing Chinese traffic datasets lack comprehensive benchmarks for traffic signs and signals, and foreign datasets do not match Chinese traffic conditions. Manually annotating a large-scale dataset tailored for Chinese traffic conditions presents a significant challenge. This study addresses this gap by proposing a cross-augmentation method for image datasets. We utilized YOLOX for target detection and trained models on the BDD100K dataset, achieving an impressive mAP of 60.25%, surpassing most algorithms. Leveraging transfer learning, we enhanced the CCTSDB dataset, creating the ACCTSDB dataset, which includes annotations for common traffic objects and Chinese traffic signs. Using YOLOX, we trained a traffic detector tailored for Chinese traffic scenarios, achieving an mAP of 75.79%. To further validate our approach, we conducted experiments on the TT100K dataset and successfully introduced the ATT100K dataset. Our methodology is poised to alleviate the limitations of manually annotating image datasets. The proposed ACCTSDB dataset and ATT100K dataset are expected to compensate for the lack of large-scale, multi-class traffic datasets in China.

Publisher

IOS Press

Reference23 articles.

1. R. Girshick, J. Donahue, T. Darrell and J. Malik, Rich feature hierarchies for accurate object detection and semantic segmentation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587.

2. R. Girshick, Fast r-cnn, in: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448.

3. S. Ren, K. He, R. Girshick and J. Sun, Faster r-cnn: Towards real-time object detection with region proposal networks, Advances in Neural Information Processing Systems 28 (2015).

4. K. He, G. Gkioxari, P. Dollár and R. Girshick, Mask r-cnn, in: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969.

5. J. Redmon, S. Divvala, R. Girshick and A. Farhadi, You only look once: Unified, real-time object detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788.