Affiliation:
1. State Key Laboratory of Precision Space-time Information Sensing Technology
2. Ministry of Education
Abstract
Softmax, a pervasive nonlinear operation, plays a pivotal role in numerous statistical and deep learning (DL) models such as ChatGPT. Computing it is expensive, especially for large-scale models. Several software and hardware acceleration strategies have been proposed, but they still suffer from low efficiency and poor scalability. Here we propose a photonic-computing solution comprising a massive number of programmable neurons that is capable of executing this operation in an accurate, computationally efficient, robust and scalable manner. Experimental results show that our diffraction-based computing system exhibits salient generalization ability across diverse artificial and real-world tasks (mean square error < 10^-5). We further analyze its performance under several realistic limiting factors. Such a flexible system not only contributes to optimizing the Softmax operation mechanism but may also inspire the manufacture of a plug-and-play module for general optoelectronic accelerators.
Funder
Beijing Natural Science Foundation