Abstract
Deep learning has significantly advanced image classification in recent years. As models grow more complex, however, their computational demands increase, which has prompted a shift toward lightweight architectures. The MobileNet series, designed for efficiency in resource-constrained environments, is a prominent example. Despite its popularity, the performance of different MobileNet versions on specific tasks such as cat and dog image classification has rarely been compared. This study addresses that gap by evaluating MobileNetV1, MobileNetV2, MobileNetV3Large, and MobileNetV3Small on a Kaggle dataset of more than 1,000 images. The dataset was preprocessed before training and testing, and the models were compared on classification accuracy and convergence speed. The results indicate that MobileNetV2 outperforms the other models, achieving higher accuracy and faster convergence, which makes it the preferred choice for this task. MobileNetV1 also performed stably, whereas the larger MobileNetV3Large showed signs of overfitting. MobileNetV2's strong performance on cat and dog image classification suggests that it is well suited to resource-limited scenarios. The study offers guidance for deploying image classification models on mobile devices and in other constrained environments. Future research should focus on further optimizing these models to improve their performance and generalization.
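The abstract does not define how accuracy and convergence speed were measured, so the following is only a minimal sketch of one possible comparison. It assumes that convergence speed means the first epoch at which validation accuracy reaches a chosen target. The helper names `epochs_to_reach` and `summarize`, the 0.90 target, and the accuracy curves are all hypothetical placeholders and are not results from the study.

```python
from typing import Dict, List, Optional


def epochs_to_reach(val_acc: List[float], target: float) -> Optional[int]:
    """Return the 1-based epoch where validation accuracy first reaches
    `target`, or None if it never does (assumed convergence measure)."""
    for epoch, acc in enumerate(val_acc, start=1):
        if acc >= target:
            return epoch
    return None


def summarize(histories: Dict[str, List[float]], target: float) -> List[dict]:
    """Rank models by best validation accuracy, breaking ties by how
    quickly they reach the target accuracy."""
    rows = []
    for name, val_acc in histories.items():
        rows.append({
            "model": name,
            "best_val_acc": max(val_acc),
            "epochs_to_target": epochs_to_reach(val_acc, target),
        })

    def sort_key(row: dict):
        reached = row["epochs_to_target"]
        # Models that never reach the target sort last among ties.
        return (-row["best_val_acc"],
                reached if reached is not None else float("inf"))

    rows.sort(key=sort_key)
    return rows


# Placeholder per-epoch validation accuracies, NOT data from the study.
placeholder_histories = {
    "model_a": [0.60, 0.75, 0.88, 0.91],
    "model_b": [0.70, 0.90, 0.93, 0.94],
    "model_c": [0.55, 0.65, 0.72, 0.78],
}

summary = summarize(placeholder_histories, target=0.90)
for row in summary:
    print(row)
```

In practice, the per-epoch validation accuracies would come from each model's training log. Other definitions of convergence speed, such as the epoch at which validation loss stops improving, would need a different helper.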