Generalization of Neural Networks on Second-Order Hypercomplex Numbers
Published: 2023-09-19
Journal: Mathematics
Volume: 11, Issue: 18, Page: 3973
ISSN: 2227-7390
Language: en
Authors:
Stanislav Pavlov (1,2,3), Dmitry Kozlov (1), Mikhail Bakulin (1), Aleksandr Zuev (1), Andrey Latyshev (1), Alexander Beliaev (1)
Affiliations:
1. NN AI Team, Huawei Russian Research Institute, St. Maksima Gorkogo, 117, Nizhny Novgorod 603006, Russia
2. Department of Informatics, Mathematics and Computer Sciences, National Research University Higher School of Economics, St. Bolshaya Pecherskaya, 25/12, Nizhny Novgorod 603155, Russia
3. Department of Informatics and Telecommunications, Volga State University of Water Transport, St. Nesterova, 5, Nizhny Novgorod 603005, Russia
Abstract
The vast majority of existing neural networks operate by rules set within the algebra of real numbers. However, as theoretical understanding of neural network fundamentals deepens and practical applications multiply, new problems arise that require going beyond this algebra. Various tasks come to light in which the original data are naturally complex-valued, encouraging researchers to explore whether neural networks based on complex numbers can provide benefits over those limited to real numbers. Multiple recent works have been dedicated to developing the architecture and building blocks of complex-valued neural networks. In this paper, we generalize such models by considering the other types of second-order hypercomplex numbers: dual and double numbers. We developed basic operators for these algebras, such as convolution, activation functions, and batch normalization, and rebuilt several real-valued networks to use these new algebras. We also developed a general methodology for dual- and double-valued gradient calculation based on Wirtinger derivatives for complex-valued functions. For classical computer vision (CIFAR-10, CIFAR-100, SVHN) and signal processing (G2Net, MusicNet) classification problems, our benchmarks show that the transition to the hypercomplex domain can help reach higher metric values than the original real-valued models achieve.
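For context, the three second-order hypercomplex algebras compared in the paper can be summarized in standard notation (this summary is ours, not quoted from the paper): every element has the form z = a + u*b with real a and b, and the algebras differ only in the square of the unit u. Taking u^2 = -1 gives the complex numbers, u^2 = 0 the dual numbers, and u^2 = +1 the double (split-complex) numbers. In the dual case, multiplication reduces to (a + eps*b)(c + eps*d) = ac + eps*(ad + bc), since eps^2 = 0.

As a minimal illustration of how such arithmetic maps onto real-valued tensor operations, here is a NumPy sketch of a dual-valued matrix-vector product; the function name and layout are our own and do not reflect the paper's actual implementation of its convolution, activation, or batch-normalization operators:

import numpy as np

def dual_matvec(W_re, W_du, x_re, x_du):
    # (W_re + eps*W_du) @ (x_re + eps*x_du)
    #   = W_re @ x_re + eps * (W_re @ x_du + W_du @ x_re),
    # because eps^2 = 0 eliminates the W_du @ x_du term.
    y_re = W_re @ x_re
    y_du = W_re @ x_du + W_du @ x_re
    return y_re, y_du

# Example: a dual-valued 2x2 weight applied to a dual-valued input.
W_re = np.array([[1.0, 2.0], [0.0, 1.0]])
W_du = np.array([[0.5, 0.0], [0.0, 0.5]])
x_re = np.array([1.0, -1.0])
x_du = np.array([0.0, 1.0])
print(dual_matvec(W_re, W_du, x_re, x_du))
# -> (array([-1., -1.]), array([2.5, 0.5]))

The double-valued case follows the same pattern with the cross term added rather than dropped, since j^2 = +1.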
Subjects:
General Mathematics; Engineering (miscellaneous); Computer Science (miscellaneous)