Study of the Principal Components Method Modifications Resistance to Abnormal Observations

Author:

Goryainov V.B.1, Goryainova E.R.2

Affiliation:

1. Bauman Moscow State Technical University

2. National Research University Higher School of Economics

Abstract

The paper addresses the problem of reducing the dimensionality of multidimensional correlated indicators. One approach to this problem is the principal components method, which compactly describes a vector with correlated coordinates (components) by a vector of principal components with uncorrelated coordinates and a much smaller dimension, while retaining most of the information about the correlation structure of the original vector. Several modifications of the principal components method, differing in how the correlation matrix of the observation vector is estimated, were compared on simulated and real data. The objective of the work is to demonstrate the advantages of robust modifications of the principal components method when the data contain abnormal values. To compare the modifications on simulated data, a metric was introduced that measures the difference between the estimated and true eigenvalues of the correlation matrix of the initial data. The behavior of this metric as a function of the probability distribution of the observations was studied by computer simulation. The distributions used were multivariate distributions with non-diagonal correlation matrices that simulate a contaminated sample. Next, a sample of 13 correlated socioeconomic indicators for 85 countries was considered, in which 46 abnormal values were identified. All the considered modifications selected the same optimal number of principal components, namely three. However, the compression quality on the real data, defined as the share of the total variance of the initial indicators explained by the first three principal components, turned out to be significantly higher for the robust modifications. The results obtained on the real data are in good agreement with the conclusions of the computer simulation.
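The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not name the robust estimators, the exact form of the eigenvalue metric, or the contamination model, so everything specific below is an assumption made for illustration only: Spearman rank correlation stands in for a robust estimate, the metric is taken to be the Euclidean distance between sorted eigenvalue vectors, and the contaminated sample is a normal sample in which a fraction `eps` of rows is replaced by uncorrelated noise with a large standard deviation. All function names are hypothetical.

```python
import numpy as np

def rank_columns(x):
    # Ranks 0..n-1 within each column (continuous data, so ties are not expected).
    return np.argsort(np.argsort(x, axis=0), axis=0).astype(float)

def pearson_corr(x):
    # Classical sample correlation matrix (columns are variables).
    return np.corrcoef(x, rowvar=False)

def spearman_corr(x):
    # Assumed robust alternative: Pearson correlation of the column ranks.
    return np.corrcoef(rank_columns(x), rowvar=False)

def eigenvalue_error(corr_est, corr_true):
    # Assumed metric: Euclidean distance between eigenvalues sorted in descending order.
    ev_est = np.sort(np.linalg.eigvalsh(corr_est))[::-1]
    ev_true = np.sort(np.linalg.eigvalsh(corr_true))[::-1]
    return float(np.linalg.norm(ev_est - ev_true))

def explained_share(corr, k):
    # Share of total variance carried by the k largest eigenvalues.
    ev = np.linalg.eigvalsh(corr)  # returned in ascending order
    return float(ev[-k:].sum() / ev.sum())

def contaminated_sample(rng, n, corr, eps, scale):
    # Assumed contamination model: each row is replaced, with probability eps,
    # by uncorrelated normal noise with standard deviation `scale`.
    p = corr.shape[0]
    clean = rng.multivariate_normal(np.zeros(p), corr, size=n)
    noise = rng.normal(0.0, scale, size=(n, p))
    is_outlier = rng.random(n) < eps
    return np.where(is_outlier[:, None], noise, clean)

# Illustrative setup: 4 indicators with equal pairwise correlation 0.7.
p, rho = 4, 0.7
true_corr = np.full((p, p), rho) + (1.0 - rho) * np.eye(p)

rng = np.random.default_rng(0)
sample = contaminated_sample(rng, n=2000, corr=true_corr, eps=0.1, scale=10.0)

err_classical = eigenvalue_error(pearson_corr(sample), true_corr)
err_robust = eigenvalue_error(spearman_corr(sample), true_corr)
print("classical error:", err_classical)
print("rank-based error:", err_robust)
print("share explained by 1 component (classical):",
      explained_share(pearson_corr(sample), 1))
print("share explained by 1 component (rank-based):",
      explained_share(spearman_corr(sample), 1))
```

In this toy setting the rank-based estimate is expected to stay closer to the true eigenvalues than the classical one, which is the qualitative effect the abstract reports; the numbers themselves say nothing about the estimators or data actually used in the paper.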

Publisher

Bauman Moscow State Technical University

Subject

General Physics and Astronomy, General Engineering, General Mathematics, General Chemistry, General Computer Science
