Affiliation:
1. Beijing Information Science and Technology University, 100192 Beijing, People’s Republic of China
Abstract
Estimating the position and three-dimensional (3-D) geometry of noncooperative spacecraft through 3-D reconstruction techniques is of great significance. Depth information acquisition is a key component of 3-D reconstruction, and monocular cameras are cheaper and more widely deployed than depth sensors. The relative position between a noncooperative spacecraft and the observing spacecraft can be calculated from the depth maps produced by monocular depth estimation together with the camera parameters, providing data support for subsequent tracking and capture missions. This paper proposes a monocular depth estimation network that combines a convolutional neural network (CNN) and a vision transformer (ViT) to improve prediction accuracy on few-shot samples. Detail features and global features are extracted by the CNN and ViT encoders, respectively, and deep and shallow features are then fused by a skip-connected upsampling decoder. Compared with representative depth estimation algorithms of recent years on the NYU-Depth V2 dataset, the proposed network combines the advantages of the CNN and ViT and estimates the global depth of a scene more accurately while preserving details. To address the scarcity of spacecraft image data, a new dataset is built from 3-D simulation models. Experiments on this self-made dataset demonstrate the feasibility of the method in aerospace engineering.
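The abstract's step of recovering relative position from a predicted depth map and the camera parameters amounts to back-projecting each pixel into 3-D camera coordinates under the pinhole model. A minimal sketch follows; the intrinsic values and the constant depth map are illustrative assumptions, not values from the paper:

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Back-project a depth map of shape (H, W) into 3-D camera coordinates.

    Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth.
    Returns an (H, W, 3) array of [X, Y, Z] points in the camera frame.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinate grids
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

# Assumed intrinsics and a toy depth prediction (2 m everywhere)
depth = np.full((4, 4), 2.0)
pts = backproject(depth, fx=500.0, fy=500.0, cx=2.0, cy=2.0)
# The pixel at the principal point (cx, cy) lies on the optical axis.
```

The centroid of the back-projected points belonging to the target spacecraft then gives its relative position in the camera frame.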
Funder
National Natural Science Foundation of China
R&D Program of Beijing Municipal Education Commission
Key Research Training Project of Beijing Information Science and Technology University
Publisher
American Institute of Aeronautics and Astronautics (AIAA)
Subject
Electrical and Electronic Engineering, Computer Science Applications, Aerospace Engineering