Asymptotic Convergence Properties of the EM Algorithm for Mixture of Experts

Authors:

Yan Yang¹, Jinwen Ma¹

Affiliation:

1. Department of Information Science, School of Mathematical Sciences and LMAM, Peking University, Beijing, 100871, China

Abstract

Mixture of experts (ME) is a modular neural network architecture for supervised classification. The double-loop expectation-maximization (EM) algorithm has been developed for learning the parameters of the ME architecture, and the iteratively reweighted least squares (IRLS) algorithm and the Newton-Raphson algorithm are two popular schemes for learning the parameters in the inner loop or gating network. In this letter, we investigate the asymptotic convergence properties of the EM algorithm for ME using either the IRLS or the Newton-Raphson approach. With the help of an overlap measure for the ME model, we obtain an upper bound on the asymptotic convergence rate of the EM algorithm in each case. Moreover, we find that for the Newton approach, as a specific Newton-Raphson approach to learning the parameters in the inner loop, the upper bound on the asymptotic convergence rate of the EM algorithm locally around the true solution Θ* is o(e^{0.5−ϵ}(Θ*)), where ϵ > 0 is an arbitrarily small number, o(x) denotes a higher-order infinitesimal as x → 0, and e(Θ*) is a measure of the average overlap of the ME model. That is, as the average overlap of the true ME model with a large sample tends to zero, the EM algorithm with the Newton approach to learning the parameters in the inner loop tends to be asymptotically superlinear. Finally, we substantiate our theoretical results with simulation experiments.
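For concreteness, below is a minimal sketch of the double-loop EM scheme the abstract describes, under the illustrative assumptions of Gaussian regression experts with one-dimensional outputs and a linear softmax gating network. The inner loop applies Newton-Raphson steps to the gating objective Σ_t Σ_j h_j(t) log g_j(x_t; V); how many inner steps correspond to the paper's IRLS versus Newton schemes is not specified here. The function name em_mixture_of_experts, the ridge term reg, and all defaults are assumptions for this sketch, not taken from the paper.

```python
import numpy as np

def softmax(Z):
    # Row-wise softmax with a max shift for numerical stability.
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def em_mixture_of_experts(X, y, K, n_outer=50, n_inner=1, reg=1e-6, seed=0):
    """Double-loop EM sketch for K Gaussian regression experts with a softmax
    gating network.  The inner loop takes n_inner Newton-Raphson steps on the
    gating parameters V (a single inner step is one common choice)."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    W = rng.normal(scale=0.1, size=(K, d))   # expert regression weights
    s2 = np.ones(K)                          # expert noise variances
    V = np.zeros((K, d))                     # gating-network weights

    for _ in range(n_outer):
        # E-step: posterior responsibilities h[t, j].
        G = softmax(X @ V.T)                                 # gating probs
        M = X @ W.T                                          # expert means
        lik = np.exp(-(y[:, None] - M) ** 2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)
        H = G * lik
        H /= H.sum(axis=1, keepdims=True) + 1e-300

        # M-step (experts): weighted least squares per expert.
        for j in range(K):
            h = H[:, j]
            A = (X * h[:, None]).T @ X + reg * np.eye(d)
            W[j] = np.linalg.solve(A, (X * h[:, None]).T @ y)
            s2[j] = (h * (y - X @ W[j]) ** 2).sum() / (h.sum() + 1e-12)

        # M-step (gating, inner loop): Newton-Raphson on
        # sum_t sum_j h[t, j] * log g_j(x_t; V).
        for _ in range(n_inner):
            G = softmax(X @ V.T)
            grad = (H - G).T @ X                             # (K, d)
            # Negative Hessian, block (j, k) = sum_t (g_j d_jk - g_j g_k) x_t x_t^T.
            Hess = np.zeros((K * d, K * d))
            for t in range(N):
                g = G[t]
                S = np.diag(g) - np.outer(g, g)
                Hess += np.kron(S, np.outer(X[t], X[t]))
            Hess += reg * np.eye(K * d)                      # handles softmax degeneracy
            V += np.linalg.solve(Hess, grad.reshape(-1)).reshape(K, d)

    return W, s2, V
```

As a rough empirical check of the letter's message, one could generate synthetic data from well-separated experts (small average overlap) and from heavily overlapping experts, run em_mixture_of_experts on both, and compare how quickly the parameter estimates stabilize; the analysis predicts markedly faster, near-superlinear convergence in the low-overlap case.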

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Cited by 4 articles.

1. An effective EM algorithm for mixtures of Gaussian processes via the MCMC sampling and approximation;Neurocomputing;2019-02

2. A Two-Layer Mixture Model of Gaussian Process Functional Regressions and Its MCMC EM Algorithm;IEEE Transactions on Neural Networks and Learning Systems;2018-10

3. Mixture of feature specified experts;Information Fusion;2014-11

4. Twenty Years of Mixture of Experts;IEEE Transactions on Neural Networks and Learning Systems;2012-08
