Asymptotic Convergence Properties of the EM Algorithm for Mixture of Experts

Authors:

Yan Yang¹, Jinwen Ma¹

Affiliation:

1. Department of Information Science, School of Mathematical Sciences and LMAM, Peking University, Beijing, 100871, China

Abstract

Mixture of experts (ME) is a modular neural network architecture for supervised classification. The double-loop expectation-maximization (EM) algorithm has been developed for learning the parameters of the ME architecture, and the iteratively reweighted least squares (IRLS) algorithm and the Newton-Raphson algorithm are two popular schemes for learning the parameters in the inner loop or gating network. In this letter, we investigate the asymptotic convergence properties of the EM algorithm for ME using either the IRLS or the Newton-Raphson approach. With the help of an overlap measure for the ME model, we obtain an upper bound on the asymptotic convergence rate of the EM algorithm in each case. Moreover, we find that for the Newton approach, as a specific Newton-Raphson approach to learning the parameters in the inner loop, the upper bound on the asymptotic convergence rate of the EM algorithm locally around the true solution Θ* is o(e^{0.5−ϵ}(Θ*)), where ϵ > 0 is an arbitrarily small number, o(x) denotes a higher-order infinitesimal as x → 0, and e(Θ*) is a measure of the average overlap of the ME model. That is, as the average overlap of the true ME model with a large sample tends to zero, the EM algorithm with the Newton approach to learning the parameters in the inner loop tends to be asymptotically superlinear. Finally, we substantiate our theoretical results with simulation experiments.
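For concreteness, below is a minimal sketch of the double-loop EM scheme the abstract describes, under the illustrative assumptions of Gaussian regression experts with one-dimensional outputs and a linear softmax gating network. The inner loop applies Newton-Raphson steps to the gating objective Σ_t Σ_j h_j(t) log g_j(x_t; V); how many inner steps correspond to the paper's IRLS versus Newton schemes is not specified here. The function name em_mixture_of_experts, the ridge term reg, and all defaults are assumptions for this sketch, not taken from the paper.

```python
import numpy as np

def softmax(Z):
    # Row-wise softmax with a max shift for numerical stability.
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def em_mixture_of_experts(X, y, K, n_outer=50, n_inner=1, reg=1e-6, seed=0):
    """Double-loop EM sketch for K Gaussian regression experts with a softmax
    gating network.  The inner loop takes n_inner Newton-Raphson steps on the
    gating parameters V (a single inner step is one common choice)."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    W = rng.normal(scale=0.1, size=(K, d))   # expert regression weights
    s2 = np.ones(K)                          # expert noise variances
    V = np.zeros((K, d))                     # gating-network weights

    for _ in range(n_outer):
        # E-step: posterior responsibilities h[t, j].
        G = softmax(X @ V.T)                                 # gating probs
        M = X @ W.T                                          # expert means
        lik = np.exp(-(y[:, None] - M) ** 2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)
        H = G * lik
        H /= H.sum(axis=1, keepdims=True) + 1e-300

        # M-step (experts): weighted least squares per expert.
        for j in range(K):
            h = H[:, j]
            A = (X * h[:, None]).T @ X + reg * np.eye(d)
            W[j] = np.linalg.solve(A, (X * h[:, None]).T @ y)
            s2[j] = (h * (y - X @ W[j]) ** 2).sum() / (h.sum() + 1e-12)

        # M-step (gating, inner loop): Newton-Raphson on
        # sum_t sum_j h[t, j] * log g_j(x_t; V).
        for _ in range(n_inner):
            G = softmax(X @ V.T)
            grad = (H - G).T @ X                             # (K, d)
            # Negative Hessian, block (j, k) = sum_t (g_j d_jk - g_j g_k) x_t x_t^T.
            Hess = np.zeros((K * d, K * d))
            for t in range(N):
                g = G[t]
                S = np.diag(g) - np.outer(g, g)
                Hess += np.kron(S, np.outer(X[t], X[t]))
            Hess += reg * np.eye(K * d)                      # handles softmax degeneracy
            V += np.linalg.solve(Hess, grad.reshape(-1)).reshape(K, d)

    return W, s2, V
```

As a rough empirical check of the letter's message, one could generate synthetic data from well-separated experts (small average overlap) and from heavily overlapping experts, run em_mixture_of_experts on both, and compare how quickly the parameter estimates stabilize; the analysis predicts markedly faster, near-superlinear convergence in the low-overlap case.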

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Cited by 4 articles.

1. An effective EM algorithm for mixtures of Gaussian processes via the MCMC sampling and approximation;Neurocomputing;2019-02

2. A Two-Layer Mixture Model of Gaussian Process Functional Regressions and Its MCMC EM Algorithm;IEEE Transactions on Neural Networks and Learning Systems;2018-10

3. Mixture of feature specified experts;Information Fusion;2014-11

4. Twenty Years of Mixture of Experts;IEEE Transactions on Neural Networks and Learning Systems;2012-08
