VIDVIP: Dataset for Object Detection During Sidewalk Travel

Published: 2021-10-20 Issue: 5 Volume: 33 Page: 1135-1143
ISSN: 1883-8049
Container-title: Journal of Robotics and Mechatronics
Language: en
Short-container-title: J. Robot. Mechatron.

Author:

Baba Tetsuaki

Abstract

In this paper, we report on the “VIsual Dataset for Visually Impaired Persons” (VIDVIP), a dataset for obstacle detection during sidewalk travel. In recent years, many assistive technologies based on deep learning and computer vision have been reported; nevertheless, developers cannot implement such applications without suitable datasets. Although research institutes and companies have released a number of open-source datasets, large-scale datasets remain scarce in the field of disability support, owing to their high development costs. Therefore, we began developing a dataset for outdoor mobility support for the visually impaired in April 2018. As of May 1, 2021, we have annotated 538,747 instances in 32,036 images across 39 label classes. We have implemented and tested navigation systems and other applications that use our dataset. In this study, we first compare our dataset with other general-purpose datasets and show that it has properties similar to those of datasets for automated driving. Our discussion of the dataset characteristics then shows that the annotation ratio tends to be affected by the nature of the image shooting location rather than by regional characteristics. Accordingly, it is possible to characterize the type of location from the nature of the shooting location, and to infer the maintenance status of traffic facilities (such as Braille blocks) from the annotation ratio.

Funder

Japan Society for the Promotion of Science

Publisher

Fuji Technology Press Ltd.

Subject

Electrical and Electronic Engineering, General Computer Science

Reference: 23 articles.

1. A. Kuznetsova, H. Rom, N. Alldrin, J. Uijlings, I. Krasin, J. Pont-Tuset, S. Kamali, S. Popov, M. Malloci, T. Duerig, and V. Ferrari, “The open images dataset v4: Unified image classification, object detection, and visual relationship detection at scale,” Int. J. of Computer Vision, Vol.128, pp. 1956-1981, 2020.

2. T.-Y. Lin, M. Maire, S. J. Belongie, L. D. Bourdev, R. B. Girshick, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft COCO: common objects in context,” CoRR, abs/1405.0312, 2014.

3. N. Thakurdesai, A. Tripathi, D. Butani, and S. Sankhe, “Vision: A deep learning approach to provide walking assistance to the visually impaired,” CoRR, abs/1911.08739, 2019.

4. A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The KITTI dataset,” Int. J. of Robotics Research (IJRR), 2013.

5. T. Baba, H. Watanave, and T. Kamae, “Design and prototyping for an outdoor activity support system for the visually impaired using deep learning for object detection,” SIG Technical Reports, IPSJ, Vol.32018-AAC-7, No.8, Aug. 2018 (in Japanese).