Attention-Based Multi-Modal Fusion Network for Semantic Scene Completion

Published:2020-04-03 Issue:07 Volume:34 Page:11402-11409
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
language:
Short-container-title:AAAI

Author:

Li Siqi, Zou Changqing, Li Yipeng, Zhao Xibin, Gao Yue

Abstract

This paper presents an end-to-end 3D convolutional network named attention-based multi-modal fusion network (AMFNet) for the semantic scene completion (SSC) task of inferring the occupancy and semantic labels of a volumetric 3D scene from single-view RGB-D images. Unlike previous methods, which use only the semantic features extracted from RGB-D images, the proposed AMFNet learns to perform effective 3D scene completion and semantic segmentation simultaneously by leveraging the experience of inferring 2D semantic segmentation from RGB-D images as well as the reliable depth cues in the spatial dimension. This is achieved by employing a multi-modal fusion architecture boosted by 2D semantic segmentation and a 3D semantic completion network empowered by residual attention blocks. We validate our method on both the synthetic SUNCG-RGBD dataset and the real NYUv2 dataset; the results show gains of 2.5% and 2.6%, respectively, over the state-of-the-art method.

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 29 articles.

1. MFAP-Net: A Study on Deep Learning Semantic Segmentation Framework for LiDAR Point Clouds;2023 8th International Conference on Control, Robotics and Cybernetics (CRC);2024-12-22

2. Convolutional laplacian gaussian pyramid approach multimodal medical image fusion;Multimedia Tools and Applications;2024-08-05

3. Voxel- and Bird’s-Eye-View-Based Semantic Scene Completion for LiDAR Point Clouds;Remote Sensing;2024-06-21

4. OCC-VO: Dense Mapping via 3D Occupancy-Based Visual Odometry for Autonomous Driving;2024 IEEE International Conference on Robotics and Automation (ICRA);2024-05-13

5. IEFM and IDS: Enhancing 3D environment perception via information encoding in indoor point cloud semantic segmentation;Neurocomputing;2024-01