1. Amari, S. (1998). Natural gradient works efficiently in learning. Neural Computation, 10(2), 251–276.
2. Amari, S. (2020). Any target function exists in a neighborhood of any sufficiently wide random network: A geometrical perspective. Neural Computation.
3. Amari, S., Karakida, R., & Oizumi, M. (2019). Fisher information and natural gradient learning in random deep networks. In Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89 (pp. 694–702).
4. Bapat, R. B., & Raghavan, T. E. S. (1997). Nonnegative Matrices and Applications. Cambridge University Press.
5. Barron, A. R., & Cover, T. M. (1991). Minimum complexity density estimation. IEEE Transactions on Information Theory, 37(4), 1034–1054.