Author:
Xie Zhiqing, Jiang Xue, Peng Meihua, Yang Ziheng
Abstract
This paper presents a hardware acceleration design for convolutional neural networks (CNNs). Optimization focuses on three key areas: floating-point to fixed-point conversion, pipelined inter-layer parallel acceleration, and design space exploration. The optimized modules can be composed into various convolutional networks according to the specifications of the application scenario, yielding a general-purpose design. Experimental results show that the hardware-resource optimizations improve the speed and performance of the algorithm: the system achieves an accuracy of 95.09% and an inference speed of 0.237 ms per image. These design solutions allow convolutional neural networks to be applied in a wider range of scenarios and to handle larger datasets and stricter real-time requirements.
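The first optimization named in the abstract, floating-point to fixed-point conversion, can be illustrated with a minimal sketch. This assumes a simple Qm.n scheme with round-to-nearest; the paper's actual bit widths and rounding mode are not specified here, so the parameters below are purely illustrative:

```python
# Illustrative float-to-fixed conversion (Qm.n format), not the paper's
# actual quantization scheme: a value is scaled by 2**n_frac and rounded
# to an integer, then dequantized by the inverse scale.

def float_to_fixed(x: float, n_frac: int = 8) -> int:
    """Quantize a float to a fixed-point integer with n_frac fractional bits."""
    return round(x * (1 << n_frac))

def fixed_to_float(q: int, n_frac: int = 8) -> float:
    """Recover an approximate float from its fixed-point representation."""
    return q / (1 << n_frac)

w = 0.715
q = float_to_fixed(w)       # 183
approx = fixed_to_float(q)  # 0.71484375
err = abs(w - approx)       # quantization error bounded by 2**-(n_frac+1)
```

Replacing floating-point arithmetic with integer arithmetic of this kind is what makes the convolution datapath cheaper in hardware, at the cost of the bounded rounding error shown above.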
Subject
Computer Science Applications, History, Education