
A improved pooling method for convolutional neural networks

Published:2024-01-18 Issue:1 Volume:14 Page:
ISSN:2045-2322
Container-title:Scientific Reports
language:en
Short-container-title:Sci Rep

Author:

Zhao Lei,Zhang Zhonglin

Abstract

The pooling layer in convolutional neural networks plays a crucial role in reducing spatial dimensions and improving computational efficiency. However, standard pooling operations such as max pooling or average pooling are not suitable for all applications and data types. Therefore, developing custom pooling layers that can adaptively learn and extract relevant features from specific datasets is of great significance. In this paper, we propose a novel approach to designing and implementing customizable pooling layers that enhance feature extraction capabilities in CNNs. The proposed T-Max-Avg pooling layer incorporates a threshold parameter T and selects the K highest interacting pixels as specified, which allows it to control whether the output features are based on the maximum values or on weighted averages. By learning the optimal pooling strategy during training, our custom pooling layer can effectively capture and represent discriminative information in the input data, thereby improving classification performance. Experimental results show that the proposed T-Max-Avg pooling layer achieves good performance on three different datasets. Compared with the LeNet-5 model using average pooling, max pooling, and Avg-TopK methods, the T-Max-Avg pooling method achieves the highest accuracy on the CIFAR-10, CIFAR-100, and MNIST datasets.
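The abstract does not state the exact decision rule, so the following is only an illustrative sketch of one plausible reading, not the authors' implementation. It assumes that each pooling window keeps its K largest values, outputs the maximum when that maximum reaches the threshold T, and otherwise outputs the plain (unweighted) mean of the K values. The function name `t_max_avg_pool2d`, its parameters, and the fixed (non-learned) threshold are all hypothetical choices made for illustration.

```python
from typing import List


def t_max_avg_pool2d(feature_map: List[List[float]],
                     pool: int = 2, k: int = 2, t: float = 0.75) -> List[List[float]]:
    """Hypothetical T-Max-Avg-style pooling over non-overlapping windows.

    Assumed rule (not confirmed by the abstract):
      - take the k largest values in each pool x pool window;
      - if the largest value is >= t, output that maximum;
      - otherwise output the unweighted mean of the k largest values.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - pool + 1, pool):
        pooled_row = []
        for c in range(0, cols - pool + 1, pool):
            # Flatten the current window into a list of values.
            window = [feature_map[r + i][c + j]
                      for i in range(pool) for j in range(pool)]
            # Keep the k highest values, largest first.
            top_k = sorted(window, reverse=True)[:k]
            if top_k[0] >= t:
                pooled_row.append(top_k[0])                 # max-pooling branch
            else:
                pooled_row.append(sum(top_k) / len(top_k))  # top-k average branch
        pooled.append(pooled_row)
    return pooled


# Small 4x4 example with values chosen to be exact in binary floating point.
example = [
    [1.0, 0.25, 0.25, 0.5],
    [0.5, 0.0, 0.0, 0.25],
    [0.25, 0.25, 0.75, 0.5],
    [0.5, 0.0, 0.25, 0.0],
]
result = t_max_avg_pool2d(example, pool=2, k=2, t=0.75)
print(result)  # [[1.0, 0.375], [0.375, 0.75]]
```

Under this assumed rule, a very low threshold reduces the layer to ordinary max pooling, and a very high threshold reduces it to a top-K average, which is one way to read the abstract's claim that T controls which behaviour is used.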

Funder

National Natural Science Foundation of China

Science and Technology Program of Gansu Province

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s41598-024-51258-6.pdf

References: 39 articles.

1. Jordan, M. I. & Mitchell, T. M. Machine learning: Trends, perspectives, and prospects. Science 349, 255–260 (2015).

2. Tayal, A. et al. DL-CNN-based approach with image processing techniques for diagnosis of retinal diseases. Multim. Syst. 28(4), 1417–1438 (2021).

3. Batur Dinler, Ö. & Aydin, N. An optimal feature parameter set based on gated recurrent unit recurrent neural networks for speech segment detection. Appl. Sci. 10, 1273 (2020).

4. Abbas, Q. & Celebi, M. E. DermoDeep-A classification of melanoma-nevus skin lesions using multi-feature fusion of visual features and deep neural network. Multim. Tools Appl. 78, 23559–23580 (2019).