Imitation Learning through Image Augmentation Using Enhanced Swin Transformer Model in Remote Sensing
Published: 2023-08-24
Container-title: Remote Sensing
Volume: 15
Issue: 17
Page: 4147
ISSN: 2072-4292
Language: en
Author:
Park Yoojin (1), Sung Yunsick (2) ORCID
Affiliation:
1. Department of Autonomous Things Intelligence, Graduate School, Dongguk University-Seoul, Seoul 04620, Republic of Korea
2. Division of AI Software Convergence, Dongguk University-Seoul, Seoul 04620, Republic of Korea
Abstract
In unmanned systems, remote sensing collects and analyzes data such as visual images, infrared thermal images, and LiDAR sensor data from a distance, using a system that operates without human intervention. Recent advances in deep learning enable input images in remote sensing to be mapped directly to desired outputs, so unmanned systems can learn through imitation learning by collecting and analyzing those images. In autonomous cars, for example, raw high-dimensional data are collected by sensors and mapped to steering and throttle values through a deep learning network trained by imitation learning. Through imitation learning, unmanned systems can thus observe expert demonstrations and learn expert policies, even in complex environments. However, collecting and analyzing a large number of images from the game environment incurs time and cost, and training with a limited dataset leads to a poor understanding of the environment. Existing augmentation approaches are limited in how far they can expand the dataset because they consider only the locations of objects that were visited and estimated; the diverse locations of objects not visited must also be considered to overcome this limitation. This paper proposes an enhanced model for augmenting the number of training images, comprising a Preprocessor, an enhanced Swin Transformer model, and an Action model. Because the original network structure of the Swin Transformer model is difficult to use for image augmentation in imitation learning, its internal structure is enhanced and combined with the Preprocessor and Action model to augment the training images. The proposed method was verified experimentally by learning from expert demonstrations and augmented images, which reduced the total loss from 1.24068 to 0.41616. Compared with the expert demonstrations, the accuracy was approximately 86.4%, and the proposed method scored 920 points and 1200 points more than the comparison model, verifying its generalization.
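The imitation-learning setup described above can be sketched in miniature: each expert demonstration pairs a camera frame with a (steering, throttle) label, a policy predicts actions from frames, and training minimizes the gap to the expert actions; augmentation then enlarges the demonstration set. The linear "policy", the toy frames, and the flip-based augmentation below are illustrative placeholders only, not the paper's Preprocessor/Swin Transformer/Action model pipeline.

```python
def predict(weights, frame):
    """Map a flattened image to (steering, throttle) with a toy linear policy."""
    steering = sum(w * x for w, x in zip(weights[0], frame))
    throttle = sum(w * x for w, x in zip(weights[1], frame))
    return steering, throttle

def imitation_loss(weights, demos):
    """Mean squared error between predicted and expert actions (behavior cloning)."""
    total = 0.0
    for frame, (exp_steer, exp_throttle) in demos:
        steer, throttle = predict(weights, frame)
        total += (steer - exp_steer) ** 2 + (throttle - exp_throttle) ** 2
    return total / len(demos)

def augment_flip(demos):
    """Generic augmentation example: mirror each frame and negate the steering
    label. (A common trick in driving datasets; the paper instead augments
    images with an enhanced Swin Transformer model.)"""
    return [(list(reversed(frame)), (-steer, throttle))
            for frame, (steer, throttle) in demos]

# Toy demonstration set: 3-pixel "frames" with expert action labels.
demos = [
    ([0.2, 0.5, 0.1], (0.3, 0.8)),
    ([0.9, 0.1, 0.4], (-0.2, 0.6)),
]
weights = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

# Training would adjust `weights` to lower this loss on demos + augment_flip(demos).
print(round(imitation_loss(weights, demos), 5))
```

Augmentation matters here for the same reason it does in the paper: the loss is estimated only over observed frames, so adding plausible unobserved variants gives the policy a broader view of the environment than the expert's trajectory alone.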
Funder: Ministry of Culture, Sports and Tourism
Subject: General Earth and Planetary Sciences