A Benchmark for UAV-View Natural Language-Guided Tracking
-
Published:2024-04-28
Issue:9
Volume:13
Page:1706
-
ISSN:2079-9292
-
Container-title:Electronics
-
language:en
-
Short-container-title:Electronics
Author:
Li Hengyou 1, Liu Xinyan 1, Li Guorong 1
Affiliation:
1. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
Abstract
We propose a new benchmark, UAVNLT (Unmanned Aerial Vehicle Natural Language Tracking), for the UAV-view natural language-guided tracking task. UAVNLT consists of videos of vehicles on city roads, captured by UAV cameras in four cities. For each video, the vehicles’ bounding boxes, trajectories, and natural language descriptions are carefully annotated. Existing data sets are annotated only with bounding boxes. The natural language sentences in our data set make it better suited to the many applications in which humans take part in the system, because language is not only friendlier for human–computer interaction but can also compensate for the low uniqueness of appearance features in tracking. We tested several existing methods on our new benchmark and found that their performance was not satisfactory. To pave the way for future work, we propose a baseline method suited to this task, which achieves state-of-the-art performance. We believe our new data set and proposed baseline method will be helpful in many fields, such as smart cities, smart transportation, and vehicle management.
Funder
Key Deployment Program of the Chinese Academy of Sciences; Fundamental Research Funds for Central Universities