Advanced Feature Learning on Point Clouds Using Multi-Resolution Features and Learnable Pooling
Published: 2024-05-21
Issue: 11
Volume: 16
Page: 1835
ISSN: 2072-4292
Container-title: Remote Sensing
Language: en
Short-container-title: Remote Sensing
Author:
Wijaya Kevin Tirta 1, Paek Dong-Hee 2, Kong Seung-Hyun 2
Affiliation:
1. Computer Graphics Department, Max Planck Institute for Informatics, 66123 Saarbrücken, Germany
2. CCS Graduate School of Mobility, Korea Advanced Institute of Science and Technology, Daejeon 34051, Republic of Korea
Abstract
Existing point cloud feature learning networks often learn high-semantic point features representing the global context by incorporating sampling, neighborhood grouping, neighborhood-wise feature learning, and feature aggregation. However, this process may result in a substantial loss of granular information due to the sampling operation and the widely used max pooling feature aggregation, which neglects information from non-maximum point features. Consequently, the resulting high-semantic point features may be insufficient to represent the local context, hindering the network’s ability to distinguish fine shapes. To address this problem, we propose PointStack, a novel point cloud feature learning network that utilizes multi-resolution feature learning and learnable pooling (LP). PointStack aggregates point features of various resolutions across multiple layers to capture both high-semantic and high-resolution information. The LP function calculates the weighted sum of multi-resolution point features through an attention mechanism with learnable queries, enabling the extraction of all available information. As a result, PointStack can effectively represent both global and local contexts, allowing the network to comprehend both the global structure and local shape details. PointStack outperforms various existing feature learning networks for shape classification on ScanObjectNN and part segmentation on ShapeNetPart, achieving 87.2% overall accuracy and 87.2% instance mIoU, respectively.
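The learnable pooling described in the abstract can be viewed as single-head dot-product attention in which a small set of learnable query vectors attends over all point features, so every non-maximum feature contributes to the aggregated vector. The following PyTorch snippet is a minimal sketch of that idea under stated assumptions, not the authors' released implementation; the class name LearnablePooling, the number of queries, and all tensor shapes are illustrative choices.

```python
# Minimal sketch of the learnable pooling (LP) idea from the abstract:
# a weighted sum of point features computed by attention with learnable
# queries. Illustrative only -- NOT the authors' released code; the class
# name, number of queries, and shapes are assumptions for this example.
import torch
import torch.nn as nn


class LearnablePooling(nn.Module):
    def __init__(self, feat_dim: int, num_queries: int = 4):
        super().__init__()
        # Learnable query vectors that attend over all point features.
        self.queries = nn.Parameter(torch.randn(num_queries, feat_dim))
        self.scale = feat_dim ** -0.5  # standard dot-product attention scaling

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (B, N, C) -- e.g., the stacked point features gathered
        # from layers of several resolutions.
        scores = torch.matmul(self.queries, points.transpose(1, 2)) * self.scale  # (B, Q, N)
        attn = scores.softmax(dim=-1)
        # Weighted sum over all N points: unlike max pooling, every
        # non-maximum point feature contributes to the aggregate.
        pooled = torch.matmul(attn, points)  # (B, Q, C)
        return pooled.flatten(1)  # (B, Q*C) global feature vector


# Usage: pool 1024 point features of width 256 into one global vector.
lp = LearnablePooling(feat_dim=256, num_queries=4)
feats = torch.randn(8, 1024, 256)  # batch of 8 point clouds
global_feat = lp(feats)  # shape: (8, 1024) == (8, 4 * 256)
```

Because the softmax weights are learned end to end, the pooled vector blends information from all points rather than keeping only per-channel maxima, which is the loss of granular information the abstract attributes to max pooling.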
Funder
National Research Foundation of Korea