Using Segmentation to Boost Classification Performance and Explainability in CapsNets

Published:2024-06-28 Issue:3 Volume:6 Page:1439-1465
ISSN:2504-4990
Container-title:Machine Learning and Knowledge Extraction
language:en
Short-container-title:MAKE

Author:

Vranay Dominik¹, Hliboký Maroš¹, Kovács László², Sinčák Peter¹,²

Affiliation:

1. Department of Cybernetics and Artificial Intelligence, Faculty of Electrical Engineering and Informatics, Technical University of Košice, 042 52 Košice, Slovakia

2. Faculty of Mechanical Engineering and Informatics, University of Miskolc, 3515 Miskolc, Hungary

Abstract

In this paper, we present Combined-CapsNet (C-CapsNet), a novel approach aimed at enhancing the performance and explainability of Capsule Neural Networks (CapsNets) in image classification tasks. Our method integrates segmentation masks as reconstruction targets within the CapsNet architecture. This integration improves feature extraction by focusing on significant image parts while reducing the number of parameters required for accurate classification. C-CapsNet combines principles from Efficient-CapsNet and the original CapsNet, introducing several novel improvements such as the use of segmentation masks to reconstruct images and a number of tweaks to the routing algorithm, which enhance both classification accuracy and interpretability. We evaluated C-CapsNet using the Oxford-IIIT Pet and SIIM-ACR Pneumothorax datasets, achieving mean F1 scores of 93% and 67%, respectively. These results demonstrate a significant performance improvement over traditional CapsNet and CNN models. The method’s effectiveness is further highlighted by its ability to produce clear and interpretable segmentation masks, which can be used to validate the network’s focus during classification tasks. Our findings suggest that C-CapsNet not only improves the accuracy of CapsNets but also enhances their explainability, making them more suitable for real-world applications, particularly in medical imaging.
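
The idea of using a segmentation mask as the reconstruction target can be illustrated with a minimal sketch. This is not the authors' implementation: it pairs the margin loss of the original CapsNet (with its usual constants 0.9, 0.1 and 0.5) with a mean squared error between a decoder output and a flattened binary mask. The function names, the flat-list data layout, and the `recon_weight` value are hypothetical and chosen only for illustration.

```python
def margin_loss(lengths, target, m_pos=0.9, m_neg=0.1, lam=0.5):
    """Margin loss of the original CapsNet for a single sample.

    lengths: norm of each class capsule's output vector.
    target:  index of the true class.
    """
    total = 0.0
    for k, norm in enumerate(lengths):
        if k == target:
            # The true class capsule should be long (norm >= m_pos).
            total += max(0.0, m_pos - norm) ** 2
        else:
            # Other class capsules should be short (norm <= m_neg).
            total += lam * max(0.0, norm - m_neg) ** 2
    return total


def mask_reconstruction_loss(predicted, mask):
    """Mean squared error between decoder output and a flattened mask.

    Assumption: the decoder emits one value per pixel, and the target is
    the segmentation mask rather than the input image.
    """
    if len(predicted) != len(mask):
        raise ValueError("decoder output and mask must have the same size")
    return sum((p - m) ** 2 for p, m in zip(predicted, mask)) / len(mask)


def combined_loss(lengths, target, predicted, mask, recon_weight=0.5):
    """Classification loss plus weighted mask reconstruction loss.

    recon_weight is a hypothetical value, not taken from the paper.
    """
    return (margin_loss(lengths, target)
            + recon_weight * mask_reconstruction_loss(predicted, mask))


if __name__ == "__main__":
    lengths = [0.4, 0.6]            # capsule norms for two classes
    decoded = [1.0, 0.0, 0.5, 0.5]  # decoder output for a 2x2 image
    mask = [1, 0, 1, 0]             # ground-truth segmentation mask
    print(combined_loss(lengths, 0, decoded, mask))
```

Because the reconstruction term compares against the mask, the decoder is pushed to encode where the object is rather than how the whole image looks, which is the property the abstract ties to explainability.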

Funder

Slovak National Science Foundation project

European Union’s Horizon 2020 research and innovation programme

Publisher

MDPI AG

Link

https://www.mdpi.com/2504-4990/6/3/68/pdf
