Affiliation:
1. Faculty of Mathematics and Mechanics, Lomonosov Moscow State University, Leninskie Gory 1, 119991 Moscow, Russia
Abstract
The suboptimal procedure under consideration, based on the MDR-EFE algorithm, sequentially selects the relevant (in a certain sense) factors affecting a random response that is, in general, non-binary. The model is not assumed to be linear, and the joint distribution of the factor vector and the response is unknown. The set of relevant factors has a specified cardinality. It is proved that, under certain conditions, this forward selection procedure yields a random set of factors that asymptotically (with probability tending to one as the number of observations grows to infinity) coincides with the “oracle” one. That is, the random set obtained with this algorithm approximates the collection of features that would be identified if the joint distribution of the feature vector and the response were known. For this purpose, statistical estimators of the prediction error functional of the response are proposed. They involve a new version of regularization. This makes it possible not only to establish a central limit theorem for the normalized estimators, but also to find the rate of convergence of their first two moments to the corresponding moments of the limiting Gaussian variable.
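The forward selection idea described in the abstract can be illustrated with a minimal sketch. It is not the MDR-EFE estimator of the paper: the regularized prediction error functional is replaced here by a plain K-fold misclassification rate of a majority-vote predictor, the factors are assumed to be discrete, and the names `cv_error` and `forward_select` are hypothetical. The sketch only shows the greedy structure: at each step, add the factor that most reduces the estimated prediction error, until the specified cardinality `r` is reached.

```python
from collections import Counter, defaultdict


def cv_error(xs, ys, subset, k_folds=5):
    """K-fold estimate of the misclassification rate of a plug-in predictor
    (majority vote within each cell) that sees only the columns in `subset`.
    Assumption: a stand-in for the paper's regularized error estimator."""
    n = len(ys)
    errors = 0
    for fold in range(k_folds):
        test_idx = range(fold, n, k_folds)
        test_set = set(test_idx)
        cells = defaultdict(Counter)   # response counts per factor-value cell
        overall = Counter()            # response counts over the training part
        for i in range(n):
            if i in test_set:
                continue
            key = tuple(xs[i][j] for j in subset)
            cells[key][ys[i]] += 1
            overall[ys[i]] += 1
        fallback = overall.most_common(1)[0][0]
        for i in test_idx:
            key = tuple(xs[i][j] for j in subset)
            # Cells unseen in training fall back to the overall majority value.
            pred = cells[key].most_common(1)[0][0] if key in cells else fallback
            errors += pred != ys[i]
    return errors / n


def forward_select(xs, ys, r, k_folds=5):
    """Greedy forward selection of r factor indices (hypothetical helper)."""
    selected = []
    remaining = list(range(len(xs[0])))
    for _ in range(r):
        # Pick the factor whose addition gives the smallest estimated error.
        best = min(remaining,
                   key=lambda j: cv_error(xs, ys, selected + [j], k_folds))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Here `xs` is a list of rows of discrete factor values and `ys` the list of responses, so `forward_select(xs, ys, r)` returns `r` column indices in the order they were chosen. Because each step conditions on the factors already selected, such a procedure is suboptimal in general: it may miss a set of factors that is informative only jointly.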