Algebraic Analysis for Nonidentifiable Learning Machines

Published:2001-04-01 Issue:4 Volume:13 Page:899-933
ISSN:0899-7667
Container-title:Neural Computation
language:en

Author:

Watanabe Sumio¹

Affiliation:

1. P&I Laboratory, Tokyo Institute of Technology, Yokohama, 226-8503 Japan

Abstract

This article clarifies the relation between the learning curve and the algebraic geometrical structure of a nonidentifiable learning machine such as a multilayer neural network whose true parameter set is an analytic set with singular points. By using a concept in algebraic analysis, we rigorously prove that the Bayesian stochastic complexity or the free energy is asymptotically equal to λ₁ log n − (m₁ − 1) log log n + constant, where n is the number of training samples and λ₁ and m₁ are the rational number and the natural number, which are determined as the birational invariant values of the singularities in the parameter space. Also we show an algorithm to calculate λ₁ and m₁ based on the resolution of singularities in algebraic geometry. In regular statistical models, 2λ₁ is equal to the number of parameters and m₁ = 1, whereas in nonregular models, such as multilayer networks, 2λ₁ is not larger than the number of parameters and m₁ ≥ 1. Since the increase of the stochastic complexity is equal to the learning curve or the generalization error, the nonidentifiable learning machines are better models than the regular ones if Bayesian ensemble learning is applied.
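
The asymptotic form stated in the abstract can be illustrated numerically. The following is a minimal sketch, not code from the article: the function names, the parameter count `d = 10`, and the singular-case values `lam=2.0, m=2` are hypothetical and chosen only for illustration, and the additive constant is set to zero by assumption. It evaluates the leading terms of the free energy and the increase from n to n + 1 samples, which the abstract identifies with the learning curve.

```python
import math


def asymptotic_free_energy(n, lam, m=1, const=0.0):
    """Leading terms lam*log(n) - (m - 1)*log(log(n)) + const.

    lam and m stand for the rational exponent and the natural-number
    multiplicity named in the abstract; const is an assumed placeholder.
    """
    if n <= 1:
        raise ValueError("n must exceed 1 so that log(log(n)) is defined")
    return lam * math.log(n) - (m - 1) * math.log(math.log(n)) + const


def learning_curve_increment(n, lam, m=1):
    """Increase of the asymptotic free energy when n grows by one sample."""
    return asymptotic_free_energy(n + 1, lam, m) - asymptotic_free_energy(n, lam, m)


# Regular model: 2*lam equals the number of parameters d, and m = 1.
d = 10  # hypothetical parameter count
regular = asymptotic_free_energy(1000, lam=d / 2, m=1)

# Nonregular model with the same d: 2*lam <= d and m >= 1.
# The values below are illustrative assumptions, not results from the article.
singular = asymptotic_free_energy(1000, lam=2.0, m=2)

print(f"regular  F(1000) ~ {regular:.3f}")
print(f"singular F(1000) ~ {singular:.3f}")
print(f"regular increment at n=1000 ~ {learning_curve_increment(1000, d / 2):.5f}")
```

For large n the increment behaves like lam / n when m = 1, so under these assumptions a smaller exponent gives a smaller increase per sample, which is the sense in which the abstract calls nonidentifiable machines better under Bayesian ensemble learning.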

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/089976601300014402

References: 28 articles.

1. A new look at the statistical model identification

2. A universal theorem on learning curves

3. Four Types of Learning Curves

4. Statistical Theory of Learning Curves under Entropic Loss Criterion

5. Resolution of Singularities and Division of Distributions

Cited by 150 articles.

1. Consideration on the learning efficiency of multiple-layered neural networks with linear units;Neural Networks;2024-04

2. Upper Bound of Real Log Canonical Threshold of Tensor Decomposition and its Application to Bayesian Inference;Proceedings of the ISCIE International Symposium on Stochastic Systems Theory and its Applications;2024-04-01

3. Advanced Stochastic Sequences for Multidimensional Integrals Used in Neural Networks;Studies in Computational Intelligence;2024

4. EDI-Graphic: A Tool To Study Parameter Discrimination and Confirm Identifiability in Black-Box Models, and to Select Data-Generating Machines;Journal of Computational and Graphical Statistics;2023-06-12

5. Abstract Visual Reasoning: An Algebraic Approach for Solving Raven's Progressive Matrices;2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR);2023-06