Visual Robot Relocalization Based on Multi-Task CNN and Image-Similarity Strategy

Published:2020-12-04 Issue:23 Volume:20 Page:6943
ISSN:1424-8220
Container-title:Sensors
language:en
Short-container-title:Sensors

Author:

Xie Tao, Wang Ke, Li Ruifeng, Tang Xinyue

Abstract

Traditional CNNs for 6D robot relocalization output a pose estimate but give no indication of whether the model is making a sensible prediction or guessing at random. We found that convnet representations trained on classification problems generalize well to other tasks. We therefore propose a multi-task CNN for robot relocalization that performs pose regression and scene recognition simultaneously. Scene recognition determines whether the input image belongs to the scene in which the robot is currently located, which both reduces the relocalization error and indicates how much confidence to place in the prediction. We also found that pose precision drops when testing images differ greatly in appearance from training images. Based on this, we present the dual-level image-similarity strategy (DLISS), which consists of two levels: an initial level and an iteration level. The initial level clusters feature vectors in the training set and extracts feature vectors from testing images. The iteration level, a PSO-based image-block selection algorithm, builds on the initial level to select the testing images most similar to the training images, yielding higher pose accuracy on the testing set. Our method considers both the accuracy and the robustness of relocalization, and it operates indoors and outdoors in real time, taking at most 27 ms per frame. Finally, we evaluated our method on the Microsoft 7Scenes dataset and the Cambridge Landmarks dataset. It achieves approximately 0.33 m and 7.51° accuracy on 7Scenes and approximately 1.44 m and 4.83° on Cambridge Landmarks. Compared with PoseNet, our CNN reduces the average positional error by 25% and the average angular error by 27.79% on 7Scenes, and reduces the average positional error by 40% and the average angular error by 28.55% on Cambridge Landmarks.
We show that our multi-task CNN can localize from high-level features and is robust to images that are not in the current scene. Furthermore, we show that it achieves higher relocalization accuracy when using testing images selected by DLISS.
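
The idea of pairing pose regression with scene recognition can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the loss or the network heads, so the PoseNet-style position-plus-quaternion error, the binary cross-entropy term, the weights `beta` and `gamma`, the `threshold`, and the function names `multitask_loss` and `gated_pose` are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def normalize(q):
    """Scale a quaternion (4 floats) to unit length."""
    norm = math.sqrt(sum(c * c for c in q))
    return [c / norm for c in q]


def multitask_loss(pred_xyz, pred_quat, scene_logit,
                   true_xyz, true_quat, in_scene,
                   beta=500.0, gamma=1.0):
    """Hypothetical joint loss: pose regression plus scene classification.

    in_scene is 1 if the image belongs to the current scene, else 0.
    beta and gamma are illustrative weights, not values from the paper.
    """
    p = sigmoid(scene_logit)
    eps = 1e-12  # guards against log(0)
    bce = -(in_scene * math.log(p + eps)
            + (1 - in_scene) * math.log(1.0 - p + eps))
    if not in_scene:
        # No meaningful pose label exists for out-of-scene images.
        return gamma * bce
    pos_err = math.dist(pred_xyz, true_xyz)
    # Simplification: the q / -q sign ambiguity is ignored here.
    rot_err = math.dist(normalize(pred_quat), normalize(true_quat))
    return pos_err + beta * rot_err + gamma * bce


def gated_pose(pred_xyz, pred_quat, scene_logit, threshold=0.5):
    """Return (pose, confidence); pose is None when the scene head
    judges the image to be outside the current scene."""
    confidence = sigmoid(scene_logit)
    if confidence < threshold:
        return None, confidence
    return (list(pred_xyz), normalize(pred_quat)), confidence
```

In this sketch, a caller would discard the regressed pose whenever `gated_pose` returns `None`, which mirrors the abstract's point that scene recognition tells us how far to trust a prediction.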

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Biochemistry, Instrumentation, Atomic and Molecular Physics, and Optics, Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/20/23/6943/pdf

References: 33 articles.

1. Relocalization With Submaps: Multi-Session Mapping for Planetary Rovers Equipped With Stereo Cameras

2. A theory of the evolution of technology: Technological parasitism and the implications for innovation magement

4. 6D Relocalisation for RGBD Cameras Using Synthetic View Regression

Cited by 4 articles.

1. OFVL-MS: Once for Visual Localization across Multiple Indoor Scenes;2023 IEEE/CVF International Conference on Computer Vision (ICCV);2023-10-01

2. UFVL-Net: A Unified Framework for Visual Localization Across Multiple Indoor Scenes;IEEE Transactions on Instrumentation and Measurement;2023

3. A Deep Feature Aggregation Network for Accurate Indoor Camera Localization;IEEE Robotics and Automation Letters;2022-04

4. Impact of High-Tech Image Formats Based on Full-Frame Sensors on Visual Experience and Film-Television Production;Wireless Communications and Mobile Computing;2021-08-27