Which pooling method is better: Max, Avg, or Concat (Max, Avg)
Abstract
Pooling is a non-linear operation that aggregates the values in a given region into a single value. It removes extraneous detail from feature maps while retaining their overall information. The resulting reduction in feature-map size lowers computational cost and helps prevent overfitting by discarding irrelevant data. Max pooling and average pooling are the methods most commonly used in CNN models. Max pooling selects the highest value within the pooling region, which helps preserve the essential features of the image; however, it ignores every other value in the region, resulting in a significant loss of information. Average pooling computes the mean of the values within the pooling region, which reduces this loss; however, because it does not emphasize critical pixels, significant features may be lost. To examine the performance of these pooling methods, this study carried out an experimental analysis across multiple models (shallow and deep), datasets (CIFAR-10, CIFAR-100, and SVHN), and pool sizes (e.g. $2 \times 2$, $3 \times 3$, $10 \times 10$). The study also investigated whether combining the two approaches, denoted Concat (Max, Avg), minimizes information loss. The experimental results demonstrate that pooling methods have a considerable impact on model performance, and that this impact varies with the model and the pool size. These findings provide a guideline for selecting pooling methods in the design of CNNs.
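The three pooling variants described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the helper names `pool2d` and `concat_max_avg`, the non-overlapping windows (stride equal to pool size), the cropping of any remainder, and the choice to concatenate along the channel axis are all assumptions made for illustration.

```python
import numpy as np


def pool2d(x, k, mode):
    """Non-overlapping k x k pooling over an (H, W, C) feature map.

    Hypothetical helper: stride equals k, and rows/columns that do not
    fill a whole window are cropped (assumptions of this sketch).
    """
    h, w, c = x.shape
    h_out, w_out = h // k, w // k
    x = x[:h_out * k, :w_out * k, :]
    # Split each spatial axis into (window index, position inside window).
    windows = x.reshape(h_out, k, w_out, k, c)
    if mode == "max":
        return windows.max(axis=(1, 3))   # keep the largest value per window
    if mode == "avg":
        return windows.mean(axis=(1, 3))  # keep the mean value per window
    raise ValueError(f"unknown pooling mode: {mode!r}")


def concat_max_avg(x, k):
    """Concat (Max, Avg): stack both pooled maps along the channel axis
    (assumed axis), so the output has twice as many channels as the input."""
    return np.concatenate([pool2d(x, k, "max"), pool2d(x, k, "avg")], axis=-1)


# A 4 x 4 single-channel feature map with one strong activation (the 8).
fmap = np.array(
    [[1, 2, 0, 0],
     [3, 4, 0, 8],
     [5, 5, 1, 1],
     [5, 5, 1, 1]],
    dtype=float,
)[..., None]

print(pool2d(fmap, 2, "max")[..., 0])   # [[4. 8.] [5. 1.]]
print(pool2d(fmap, 2, "avg")[..., 0])   # [[2.5 2. ] [5.  1. ]]
print(concat_max_avg(fmap, 2).shape)    # (2, 2, 2)
```

In the top-right window, max pooling keeps the isolated activation of 8 and discards the three zeros, while average pooling dilutes it to 2.0. Concatenation keeps both summaries at the cost of doubling the channel count passed to the next layer.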
Publisher
Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering