MoNA Bench: A Benchmark for Monocular Depth Estimation in Navigation of Autonomous Unmanned Aircraft System
Author:
Pan Yongzhou (1, 2), Liu Binhong (1), Liu Zhen (1), Shen Hao (1), Xu Jianyu (1), Fu Wenxing (1), Yang Tao (1)
Affiliation:
1. Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an 710072, China
2. School of Aeronautics, Northwestern Polytechnical University, Xi’an 710072, China
Abstract
Efficient trajectory and path planning (TPP) is essential for the autonomy of unmanned aircraft systems (UASs) in challenging environments. Despite the scale ambiguity inherent in monocular vision, characteristics such as compact size make a monocular camera well suited to micro-aerial vehicle (MAV)-based UASs. This work introduces a real-time MAV system that uses monocular depth estimation (MDE) with a novel scale recovery module for autonomous navigation. We present MoNA Bench, a benchmark for Monocular depth estimation in Navigation of the Autonomous unmanned Aircraft system (MoNA), emphasizing its obstacle avoidance and safe target tracking capabilities. We highlight three key attributes for efficient TPP through MDE: estimation efficiency, depth map accuracy, and scale consistency.
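To make the attributes named in the abstract more concrete, the following is a minimal sketch of how depth map accuracy and scale consistency are commonly quantified in the MDE literature. It is not the scale recovery module or the evaluation protocol of MoNA Bench, neither of which is described here. Median scaling against ground truth, the absolute relative error, the δ < 1.25 accuracy, and the use of a coefficient of variation for scale consistency are assumptions chosen for illustration, and all function names and sample values are hypothetical.

```python
from statistics import mean, median, pstdev


def median_scale(pred, gt):
    """Scale factor that aligns a relative depth map to metric ground truth
    (median scaling, a common evaluation convention; assumed here)."""
    return median(gt) / median(pred)


def abs_rel(pred, gt):
    """Mean absolute relative error between predicted and true depths."""
    return mean(abs(p - g) / g for p, g in zip(pred, gt))


def delta_accuracy(pred, gt, threshold=1.25):
    """Fraction of pixels whose depth ratio max(p/g, g/p) is below threshold."""
    return mean(1.0 if max(p / g, g / p) < threshold else 0.0
                for p, g in zip(pred, gt))


def scale_consistency(scales):
    """Coefficient of variation of per-frame scale factors.
    0.0 means the same scale applies to every frame."""
    return pstdev(scales) / mean(scales)


# Hypothetical flattened depth maps for two frames:
# ground truth in metres, predictions in arbitrary relative units.
gt_frames = [[2.0, 4.0, 6.0], [1.0, 3.0, 5.0]]
pred_frames = [[1.0, 2.0, 3.0], [0.5, 1.5, 2.5]]

# Recover one scale per frame, then rescale each prediction to metres.
scales = [median_scale(p, g) for p, g in zip(pred_frames, gt_frames)]
aligned = [[s * d for d in p] for s, p in zip(scales, pred_frames)]

for frame_pred, frame_gt in zip(aligned, gt_frames):
    print(abs_rel(frame_pred, frame_gt), delta_accuracy(frame_pred, frame_gt))
print(scale_consistency(scales))
```

A per-frame scale that drifts over time would show up as a larger coefficient of variation, which matters for planning because obstacle distances would then change between frames even in a static scene.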
Funder
National Natural Science Foundation of China
Fundamental Research Funds for the Central Universities, NPU