Affiliation:
1. Department of Computer Science and Computer Engineering, University of Texas at Arlington, Arlington, TX 76019, USA
Abstract
This article presents a method for extracting high-level semantic information through landmark detection in 2D RGB images. In particular, the focus is placed on the presence of particular labels (open path, humans, staircases, doorways, obstacles) in the encountered scene, which can be a fundamental source of information for enhancing scene understanding and paving the way toward safe navigation of the mobile unit. Experiments are conducted using a manual wheelchair to gather image instances, covering multiple labels, from four indoor academic environments. A pretrained vision transformer (ViT) is then fine-tuned, and its performance is evaluated through an ablation study against well-established state-of-the-art deep architectures for image classification, such as ResNet. Results show that the fine-tuned ViT outperforms all other deep convolutional architectures while achieving satisfactory levels of generalization.
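The fine-tuning step described in the abstract can be illustrated with a short sketch. This is not the authors' released code: the torchvision ViT-B/16 backbone, the label names, the multi-label binary-cross-entropy setup, and the hyperparameters are all illustrative assumptions.

```python
# Minimal sketch: fine-tuning an ImageNet-pretrained ViT for the scene labels
# named in the abstract. Backbone choice, labels, loss, and learning rate are
# assumptions, not details taken from the paper.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

LABELS = ["open path", "humans", "staircase", "doorways", "obstacles"]

# Load the pretrained ViT-B/16 backbone and swap the classification head
# for one output per scene label.
weights = ViT_B_16_Weights.IMAGENET1K_V1
model = vit_b_16(weights=weights)
model.heads.head = nn.Linear(model.heads.head.in_features, len(LABELS))

preprocess = weights.transforms()        # resize / normalize as the backbone expects
criterion = nn.BCEWithLogitsLoss()       # independent sigmoid per label (multi-label)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def train_step(images: torch.Tensor, targets: torch.Tensor) -> float:
    """One fine-tuning step on a batch of RGB images and 0/1 label vectors."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)               # shape: (batch, len(LABELS))
    loss = criterion(logits, targets.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```

In a setup like this, per-label sigmoid outputs let a single image be tagged with several landmarks at once (e.g. a doorway with a human in the open path), which matches the multi-label scenes described above.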
Cited by: 1 article.