CAGNet: A Multi-Scale Convolutional Attention Method for Glass Detection Based on Transformer
Published: 2023-09-26
Issue: 19
Volume: 11
Page: 4084
ISSN: 2227-7390
Container-title: Mathematics
Language: en
Short-container-title: Mathematics
Author:
Hu Xiaohang 1, Gao Rui 1, Yang Seungjun 2, Cho Kyungeun 3
Affiliation:
1. Department of Multimedia Engineering, Dongguk University, 30, Pildongro-1-gil, Jung-gu, Seoul 04620, Republic of Korea
2. Electronics and Telecommunications Research Institute, 218 Gajeong-ro, Yuseong-gu, Daejeon 34129, Republic of Korea
3. Division of AI Software Convergence, Dongguk University, 30, Pildongro-1-gil, Jung-gu, Seoul 04620, Republic of Korea
Abstract
Glass plays a vital role in several fields, making its accurate detection crucial. Proper detection prevents misjudgments, reduces noise from reflections, and ensures optimal performance in other computer vision tasks. However, the prevalence of glass in everyday settings poses unique challenges for computer vision. This study introduces a novel convolutional attention glass segmentation network (CAGNet), built on a transformer architecture tailored to glass detection in images. Building on our prior study, CAGNet reduces the number of training cycles and iterations while improving performance and efficiency. CAGNet integrates two types of convolutional attention mechanisms with a transformer head for comprehensive feature analysis and fusion. To further improve segmentation precision, the network incorporates a custom edge-weighting scheme. Comparative studies and rigorous testing demonstrate that CAGNet outperforms several leading glass detection methods and is robust across a diverse range of conditions. Specifically, the intersection over union (IoU) improves by 0.26% over our previous study and by 0.92% over other state-of-the-art methods.
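The abstract reports its gains in terms of IoU but does not define the metric. For orientation, the following is a minimal sketch of the standard IoU between a predicted and a ground-truth binary mask. The function name `binary_iou`, the nested-list mask representation, and the handling of two empty masks are assumptions made for illustration; this is not the authors' evaluation code.

```python
def binary_iou(pred, target):
    """IoU of two same-shaped binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration only.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            # A pixel is in the intersection if both masks mark it as glass.
            if p and t:
                intersection += 1
            # A pixel is in the union if either mask marks it as glass.
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are marked in at least one mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(binary_iou(pred, target))  # 2 / 4 = 0.5
```

In this toy case the score is 0.5, so an improvement of a fraction of a percentage point corresponds to a small change in the ratio of overlapping to total marked pixels.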
Funder
Electronics and Telecommunications Research Institute
Artificial Intelligence Convergence Innovation Human Resources Development
Subject
General Mathematics,Engineering (miscellaneous),Computer Science (miscellaneous)
Reference: 56 articles.
1. Gao, R., Li, M., Yang, S.-J., and Cho, K. (2022). Reflective Noise Filtering of Large-Scale Point Cloud Using Transformer. Remote Sens., 14.
2. Gao, R., Park, J., Hu, X., Yang, S., and Cho, K. (2021). Reflective noise filtering of large-scale point cloud using multi-position LiDAR sensing data. Remote Sens., 13.
3. Fu, J., Liu, J., Tian, H., Li, Y., Bao, Y., Fang, Z., and Lu, H. (2019, January 15–20). Dual attention network for scene segmentation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.
4. He, K., Gkioxari, G., Dollár, P., and Girshick, R. (2017, January 21–26). Mask R-CNN. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.
5. Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21–26). Pyramid scene parsing network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.
Cited by
1 article.