POSS-CNN: An Automatically Generated Convolutional Neural Network with Precision and Operation Separable Structure Aiming at Target Recognition and Detection
Published: 2023-11-07
Issue: 11
Volume: 14
Page: 604
ISSN: 2078-2489
Container-title: Information
Language: en
Short-container-title: Information
Author:
Hou Jia 1, Zhang Jingyu 1, Chen Qi 1, Xiang Siwei 1, Meng Yishuo 1, Wang Jianfei 1, Lu Cimang 2, Yang Chen 1
Affiliation:
1. School of Microelectronics, Xi’an Jiaotong University, Xi’an 710049, China
2. Shenzhen Xinrai Sinovoice Technology Co., Ltd., Shenzhen 518000, China
Abstract
Artificial intelligence is changing and influencing our world. As one of the main algorithms in the field of artificial intelligence, convolutional neural networks (CNNs) have developed rapidly in recent years. Especially since the emergence of NASNet, CNNs have gradually brought the idea of AutoML to public attention, and large numbers of new structures designed by automatic search are appearing. These networks are usually based on reinforcement learning and evolutionary learning algorithms. However, the blocks of these networks are sometimes complex, and they offer no small model for simpler tasks. Therefore, this paper proposes POSS-CNN, aimed at target recognition and detection, which employs a multi-branch CNN structure with PSNC and a method of automatic parallel hyperparameter selection based on the multi-branch CNN structure. Moreover, POSS-CNN can be decomposed: by choosing a single branch or a combination of two branches as the “benchmark”, as well as the overall POSS-CNN, seven models with different precision and operation counts can be obtained. The test accuracy of POSS-CNN on a CIFAR10 recognition task reaches 86.4%, which is comparable to AlexNet and VggNet, while the operations and parameters of the whole model are 45.9% and 45.8% of those of AlexNet, and 29.5% and 29.4% of those of VggNet. The mAP of POSS-CNN on the LSVH detection task is 45.8, lower than the 62.3 of YOLOv3; however, compared with YOLOv3, the operations and parameters of the model are reduced by 57.4% and 15.6%, respectively. After being accelerated by WRA, POSS-CNN achieves 27 fps on the LSVH detection task with an energy efficiency of 0.42 J/frame, which is 5 times and 96.6 times better than a GPU 2080Ti in performance and energy efficiency, respectively.
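To illustrate the separable structure described above, the following is a minimal PyTorch sketch of a three-branch CNN in which any non-empty subset of branches forms a usable sub-model, giving 2^3 − 1 = 7 models. Only the three-branch/seven-model idea comes from the abstract; the branch layer choices, channel widths, and per-subset classifier heads are hypothetical placeholders, not the paper's actual PSNC implementation.

```python
# Illustrative sketch of a precision/operation-separable multi-branch CNN.
# Assumptions (not from the paper): branch internals, widths, and one linear
# head per branch subset. Only the "3 branches -> 7 sub-models" idea is sourced.
import itertools
import torch
import torch.nn as nn


def conv_branch(width: int) -> nn.Sequential:
    # Placeholder branch: a small conv stack ending in global average pooling.
    return nn.Sequential(
        nn.Conv2d(3, width, 3, padding=1), nn.BatchNorm2d(width), nn.ReLU(),
        nn.Conv2d(width, width, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )


class SeparableMultiBranchCNN(nn.Module):
    """Three branches; any non-empty subset (2^3 - 1 = 7) is a runnable model."""

    def __init__(self, num_classes: int = 10, widths=(32, 64, 128)):
        super().__init__()
        self.branches = nn.ModuleList(conv_branch(w) for w in widths)
        # One linear head per branch subset (a simplifying assumption).
        self.heads = nn.ModuleDict({
            self._key(s): nn.Linear(sum(widths[i] for i in s), num_classes)
            for r in (1, 2, 3) for s in itertools.combinations(range(3), r)
        })

    @staticmethod
    def _key(subset) -> str:
        return "".join(str(i) for i in sorted(subset))

    def forward(self, x: torch.Tensor, active=(0, 1, 2)) -> torch.Tensor:
        # Run only the selected branches and classify their concatenated features.
        feats = [self.branches[i](x) for i in sorted(active)]
        return self.heads[self._key(active)](torch.cat(feats, dim=1))


if __name__ == "__main__":
    model = SeparableMultiBranchCNN()
    x = torch.randn(2, 3, 32, 32)         # CIFAR10-sized input (10 classes)
    print(model(x, active=(1,)).shape)    # single-branch sub-model
    print(model(x, active=(0, 2)).shape)  # two-branch sub-model
    print(model(x).shape)                 # full three-branch model
```

Selecting fewer branches trades accuracy for fewer operations and parameters, which is the precision/operation trade-off the seven sub-models are meant to expose.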
Funder
National Natural Science Foundation of China; Shenzhen Park of Hetao Shenzhen–Hong Kong Science and Technology Innovation Cooperation Zone Program
Subject
Information Systems