Variable-selection ANOVA Simultaneous Component Analysis (VASCA)

Author:

Camacho José (1), Vitale Raffaele (2), Morales-Jiménez David (1), Gómez-Llorente Carolina (3, 4, 5)

Affiliation:

1. Signal Theory, Networking and Communications Department, University of Granada, Granada 18014, Spain

2. University of Lille, CNRS, LASIRE (UMR 8516), Laboratoire Avancé de Spectroscopie pour les Interactions, la Réactivité et l’Environnement, Lille F-59000, France

3. Department of Biochemistry and Molecular Biology II, School of Pharmacy, Institute of Nutrition and Food Technology “José Mataix”, Biomedical Research Center, University of Granada, Granada 18160, Spain

4. Instituto de Investigación Biosanitaria, ibs.GRANADA, Granada, Spain

5. CIBEROBN (Physiopathology of Obesity and Nutrition CB12/03/30038), Instituto de Salud Carlos III, Madrid 28029, Spain

Abstract

Motivation: ANOVA Simultaneous Component Analysis (ASCA) is a popular method for the analysis of multivariate data yielded by designed experiments. Meaningful associations between factors/interactions of the experimental design and measured variables in the dataset are typically identified via significance testing, with permutation tests being the standard go-to choice. However, in settings with large numbers of variables, like omics (genomics, transcriptomics, proteomics and metabolomics) experiments, the ‘holistic’ testing approach of ASCA (all variables considered) often overlooks statistically significant effects encoded by only a few variables (biomarkers).

Results: We hereby propose Variable-selection ASCA (VASCA), a method that generalizes ASCA through variable selection, augmenting its statistical power without inflating the Type-I error risk. The method is evaluated with simulations and with a real dataset from a multi-omic clinical experiment. We show that VASCA is more powerful than both ASCA and the widely adopted false discovery rate controlling procedure; the latter is used as a benchmark for variable selection based on multiple significance testing. We further illustrate the usefulness of VASCA for exploratory data analysis in comparison to the popular partial least squares discriminant analysis method and its sparse counterpart.

Availability and implementation: The code for VASCA is available in the MEDA Toolbox at https://github.com/josecamachop/MEDA-Toolbox (release v1.3). The simulation results and motivating example can be reproduced using the repository at https://github.com/josecamachop/VASCA/tree/v1.0.0 (DOI 10.5281/zenodo.7410623).

Supplementary information: Supplementary data are available at Bioinformatics online.
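As a rough illustration of the ‘holistic’ permutation test the abstract refers to, the sketch below tests a single two-level factor using the sum of squares of the factor-effect matrix computed over all variables at once. It is not VASCA and not the MEDA Toolbox code: the function names `factor_effect_ss` and `permutation_p_value`, the choice of test statistic, and the simulated data are assumptions made for illustration only.

```python
import numpy as np


def factor_effect_ss(X, labels):
    """Sum of squares of the factor-effect matrix (hypothetical helper).

    The data are mean-centered per variable; each observation is then
    replaced by the mean of its factor level, and the squared entries
    of that effect matrix are summed over all variables.
    """
    Xc = X - X.mean(axis=0)
    ss = 0.0
    for level in np.unique(labels):
        rows = labels == level
        level_mean = Xc[rows].mean(axis=0)
        ss += rows.sum() * np.sum(level_mean ** 2)
    return ss


def permutation_p_value(X, labels, n_perm=999, seed=0):
    """Permutation p-value for one factor, using all variables together."""
    rng = np.random.default_rng(seed)
    observed = factor_effect_ss(X, labels)
    hits = 0
    for _ in range(n_perm):
        # Shuffle the level labels across observations (null: no effect).
        permuted = rng.permutation(labels)
        if factor_effect_ss(X, permuted) >= observed:
            hits += 1
    # The observed labelling counts as one permutation.
    return (hits + 1) / (n_perm + 1)


# Simulated example (assumed, not from the paper): 20 observations,
# 200 variables, with a level effect confined to the first 3 variables.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 10)
X = rng.normal(size=(20, 200))
X[labels == 1, :3] += 1.5
print(permutation_p_value(X, labels))
```

Because the statistic pools every variable, noise in the many unaffected variables can dilute an effect carried by only a few of them, which is the limitation the abstract attributes to the holistic test.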

Funder

Agencia Andaluza del Conocimiento, Regional Government of Andalucía, in Spain

European Regional Development Fund

State Research Agency

Spain and the European Social Fund

AEI

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability
