A Novel Multi-Data-Augmentation and Multi-Deep-Learning Framework for Counting Small Vehicles and Crowds

Published: 2024-02 Issue: 02 Volume: 38
ISSN: 0218-0014
Container-title: International Journal of Pattern Recognition and Artificial Intelligence
language: en
Short-container-title: Int. J. Patt. Recogn. Artif. Intell.

Author:

Tsai Chun-Ming¹, Shih Frank Y.²,³

Affiliation:

1. Department of Computer Science, University of Taipei, Taipei 100, Taiwan

2. Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, USA

3. Department of Computer Science and Information Engineering, Asia University, Taichung 413, Taiwan

Abstract

Counting vehicles and crowds that occupy only a few pixels in unmanned aerial vehicle (UAV) images is crucial across diverse fields, including geographic information collection, traffic monitoring, item delivery, communication network relay stations, and target segmentation, detection, and tracking. The task is challenging because of varying view angles, non-fixed drone cameras, small object sizes, changing illumination, object occlusion, and image jitter. In this paper, we introduce a novel multi-data-augmentation and multi-deep-learning framework for counting small vehicles and crowds in UAV images. The framework combines the strengths of specific deep-learning detection models with the convolutional block attention module and data augmentation techniques. We also present a new method for detecting cars, motorcycles, and persons that appear at small pixel sizes. We evaluate the proposed method on test dataset v2 of the 2022 AI Cup competition, where we secured first place on the private leaderboard by achieving the highest harmonic mean. Subsequent experimental results demonstrate that our framework outperforms the existing YOLOv7-E6E model. We also conducted comparative experiments on the publicly available VisDrone datasets, and the results show that our model outperforms the other models, achieving the highest AP50 score of 52%.
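
The two metrics named in the abstract can be illustrated with a minimal sketch. AP50 counts a detection as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5. The abstract does not say which quantities the competition's harmonic mean combines; the sketch assumes precision and recall, and the function names and the greedy matching rule are illustrative choices, not the authors' evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def harmonic_mean(a, b):
    """Harmonic mean of two non-negative scores."""
    return 2 * a * b / (a + b) if (a + b) > 0 else 0.0


def count_matches(preds, truths, thr=0.5):
    """Greedy one-to-one matching at an IoU threshold (0.5, as in AP50).

    Assumes preds are already sorted by descending confidence.
    Returns (true positives, false positives, false negatives).
    """
    matched = set()
    tp = 0
    for p in preds:
        best_j, best_iou = -1, thr
        for j, t in enumerate(truths):
            if j in matched:
                continue
            v = iou(p, t)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    return tp, len(preds) - tp, len(truths) - tp


# Hypothetical boxes, for illustration only.
preds = [(0, 0, 10, 10), (100, 100, 110, 110)]
truths = [(1, 1, 10, 10), (50, 50, 60, 60)]
tp, fp, fn = count_matches(preds, truths)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
score = harmonic_mean(precision, recall)  # assumed precision/recall pairing
```

A full AP50 computation would additionally sweep the confidence threshold and integrate the resulting precision-recall curve; the sketch shows only the IoU matching criterion at a single operating point.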

Funder

Ministry of Science and Technology, Taiwan

Publisher

World Scientific Pub Co Pte Ltd

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0218001424520013
