Abstract
The distance-based linear model (DB-LM) extends classical linear regression to settings with mixed-type predictors, or to settings where the only available information is a distance matrix between regressors (as sometimes happens with big data). The main drawback of these DB methods is their computational cost, due in particular to the eigendecomposition of the Gram matrix. In this context, ensemble regression techniques provide a useful alternative to fitting the model to the whole sample. This work analyzes the performance of three subsampling and aggregation techniques in DB regression on two large, real datasets. We also analyze, via simulations, the performance of bagging and DB logistic regression in the classification problem with mixed-type features and large sample sizes.
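To make the cost argument concrete, the following is a minimal sketch of how a distance-based fit on small subsamples might be aggregated, so that each eigendecomposition is of size n_sub rather than of the full sample size. It is an illustration under stated assumptions, not the paper's procedure: the abstract does not name its three techniques, so plain averaging over random subsamples is used here. The function names (sq_dists, db_fit, db_predict, subsample_aggregate) are hypothetical, Euclidean distance stands in for a mixed-type distance, and new points are placed with Gower's interpolation formula.

```python
import numpy as np

def sq_dists(A, B):
    """Squared Euclidean distances between rows of A and rows of B.
    Assumption: a mixed-type distance would replace this in practice."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sum(diff ** 2, axis=2)

def db_fit(D2, y, k):
    """Fit a distance-based linear model from an n x n squared-distance matrix."""
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    G = -0.5 * J @ D2 @ J                    # doubly centered Gram matrix
    vals, vecs = np.linalg.eigh(G)           # the costly O(n^3) step
    top = np.argsort(vals)[::-1][:k]         # keep the k largest eigenvalues
    lam, V = vals[top], vecs[:, top]
    X = V * np.sqrt(lam)                     # principal coordinates
    # Columns of X are centered and orthogonal, with X'X = diag(lam),
    # so least squares reduces to a componentwise division.
    beta = X.T @ (y - y.mean()) / lam
    return {"X": X, "lam": lam, "g": np.diag(G), "beta": beta, "ybar": y.mean()}

def db_predict(model, D2_new):
    """Predict from an m x n matrix of squared distances to the training points,
    using Gower's interpolation formula for the new coordinates."""
    Z = 0.5 * (model["g"] - D2_new) @ model["X"] / model["lam"]
    return model["ybar"] + Z @ model["beta"]

def subsample_aggregate(train, y, new, n_sub, n_models, k, rng):
    """Average the predictions of DB-LM fits on random subsamples."""
    preds = []
    for _ in range(n_models):
        idx = rng.choice(len(y), size=n_sub, replace=False)
        sub = train[idx]
        model = db_fit(sq_dists(sub, sub), y[idx], k)
        preds.append(db_predict(model, sq_dists(new, sub)))
    return np.mean(preds, axis=0)

# Usage on synthetic, noise-free linear data (illustrative values only).
rng = np.random.default_rng(0)
train = rng.normal(size=(2000, 3))
coef = np.array([1.0, -2.0, 0.5])
y = 4.0 + train @ coef
new = rng.normal(size=(5, 3))
truth = 4.0 + new @ coef
pred = subsample_aggregate(train, y, new, n_sub=100, n_models=10, k=3, rng=rng)
```

With Euclidean distances and k equal to the number of predictors, the distance-based fit coincides with ordinary least squares, so on this noise-free example pred should agree with truth up to rounding error. Each fit decomposes a 100 x 100 matrix instead of a 2000 x 2000 one, which is the saving the ensemble approach is after.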
Funder
Ministerio de Ciencia y Tecnología
Subject
General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)
Cited by
2 articles.