Affiliation:
1. Department of Statistics, The Chinese University of Hong Kong, Shatin, Hong Kong
Abstract
Linear discriminant analysis has been widely applied in medical studies, where atypical observations in a data set are commonly encountered. While it is well known that estimation in linear discriminant analysis can be carried out by regression with dummy variates, typical regression diagnostic statistics cannot be applied to identify influential observations in discriminant analysis because these statistics are not invariant to the coding of the dummy variates. We propose that regression diagnostic measures developed from the local influence perspective can be used to identify observations in a data set that exert undue influence on the result of a linear discriminant analysis. The measures are functions of the usual regression diagnostic statistics, such as leverage and residual, but are independent of the values chosen for the dummy variate. They are local versions of Cook’s distance-type diagnostic statistics, and their advantage lies in their ability to detect a group of influential observations rather than a single one. The performance of the proposed measures is illustrated by analyses of three medical data sets and is compared with that of other diagnostic measures available in the literature. The results indicate that the proposed measures are simple yet efficient discriminant diagnostic quantities. Empirical evidence also shows that a data point that is a multivariate outlier may not be influential in linear discriminant analysis.
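The link between two-group linear discriminant analysis and dummy-variate regression can be checked numerically. The sketch below is only an illustration on simulated data and does not implement the proposed local influence measures. The sample sizes, group means, the two codings (0/1 and -3/5), and the helper name `fit_dummy_regression` are all assumptions introduced here. It shows that the regression slopes are proportional to Fisher's discriminant direction, that leverages do not depend on the coding, and that raw residuals rescale when the coding changes.

```python
import numpy as np

# Simulated two-group data (illustrative assumption, not from the paper).
rng = np.random.default_rng(0)
n0, n1, p = 40, 60, 3
X0 = rng.normal(size=(n0, p))
X1 = rng.normal(size=(n1, p)) + np.array([1.0, 0.5, -0.5])
X = np.vstack([X0, X1])
group = np.r_[np.zeros(n0), np.ones(n1)]

# Fisher's discriminant direction: pooled covariance inverse times mean difference.
m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
S_pooled = ((X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)) / (n0 + n1 - 2)
w = np.linalg.solve(S_pooled, m1 - m0)


def fit_dummy_regression(y):
    """Least squares of a dummy variate y on X with an intercept.

    Returns the slope vector, the leverages (hat-matrix diagonal),
    and the raw residuals. Hypothetical helper for this sketch.
    """
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    hat = A @ np.linalg.pinv(A)
    return beta[1:], np.diag(hat), y - A @ beta


# Two arbitrary codings of the same group membership.
b01, h01, e01 = fit_dummy_regression(group)
b_alt, h_alt, e_alt = fit_dummy_regression(np.where(group == 1, 5.0, -3.0))

ratio = b01 / w                                   # constant across coordinates
print("slopes proportional to LDA direction:", np.allclose(ratio, ratio[0]))
print("leverages unchanged by coding:", np.allclose(h01, h_alt))
print("raw residuals rescaled by coding:", np.allclose(e_alt, 8.0 * e01))
```

The factor 8 is the gap between the two code values in the second coding (5 minus -3) divided by the gap in the first (1 minus 0). Any diagnostic built directly from raw residuals inherits this dependence on the coding.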
Subject
Health Information Management, Statistics and Probability, Epidemiology
Cited by
11 articles.