Affiliation:
1. National Institute of Technical Teachers’ Training and Research (NITTTR), Chandigarh 160019, India
Abstract
Convolutional neural networks (CNNs) are a contemporary technique for computer vision applications, and pooling is an integral part of deep CNNs. Pooling provides the ability to learn invariant features and also acts as a regularizer that further reduces overfitting. Additionally, pooling techniques significantly reduce the computational cost and training time of networks, both of which are important to consider. Here, the performance of pooling strategies on different datasets is analyzed and discussed qualitatively. This study presents a detailed review of conventional and recent strategies, which should help apprise readers of the upsides and downsides of each strategy. We also identify four fundamental factors, namely network architecture, activation function, overlapping, and regularization approaches, which strongly affect the performance of pooling operations. It is believed that this work will help extend the understanding of the significance of CNNs, along with pooling regimes, for solving computer vision problems.