Compositional Data Analysis using Kernels in mass cytometry data

Author:

Rudra Pratyaydipta (1), Baxter Ryan (2), Hsieh Elena W Y (2, 3), Ghosh Debashis (4)

Affiliation:

1. Department of Statistics, Oklahoma State University, Stillwater, OK 74078, USA

2. Department of Immunology and Microbiology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA

3. Department of Pediatrics, Section of Allergy and Immunology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA

4. Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA

Abstract

Motivation: Cell-type abundance data arising from mass cytometry experiments are compositional in nature. Classical association tests do not apply to compositional data because of their non-Euclidean nature. Existing methods for analyzing cell-type abundance data suffer from several limitations when applied to high-dimensional mass cytometry data, especially when the sample size is small.

Results: We propose a new multivariate statistical learning methodology, Compositional Data Analysis using Kernels (CODAK), based on the kernel distance covariance (KDC) framework, to test the association of cell-type compositions with important predictors (categorical or continuous) such as disease status. CODAK scales well to high-dimensional data and provides satisfactory performance for small sample sizes (n < 25). We conducted simulation studies to compare the performance of the method with existing methods for analyzing cell-type abundance data from mass cytometry studies. The method is also applied to a high-dimensional dataset containing different subgroups of populations, including Systemic Lupus Erythematosus (SLE) patients and healthy control subjects.

Availability and implementation: CODAK is implemented in R. The code and data used in this manuscript are available at http://github.com/GhoshLab/CODAK/.

Contact: prudra@okstate.edu

Supplementary information: Supplementary data are available at Bioinformatics Advances online.
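To make the idea of a distance-covariance association test on compositions concrete, the following is a minimal Python sketch. It is not the CODAK implementation (which is in R at the repository above) and does not reproduce the authors' kernel choices. It assumes a centered log-ratio (CLR) transform with a pseudocount for zero counts, Euclidean distance on CLR values (the Aitchison distance), the plain sample distance covariance, and a permutation p-value. The function names `clr`, `pairwise_dist`, `double_center`, and `dcov_perm_test` and the `pseudocount` default are illustrative assumptions.

```python
import numpy as np


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of a (samples x cell types) count matrix.

    The pseudocount (an assumed value) avoids log(0) for empty cell types.
    """
    logx = np.log(np.asarray(counts, dtype=float) + pseudocount)
    return logx - logx.mean(axis=1, keepdims=True)


def pairwise_dist(z):
    """Euclidean distance matrix between the rows of z (1-D input is one column)."""
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def double_center(d):
    """Subtract row and column means and add back the grand mean."""
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def dcov_perm_test(counts, predictor, n_perm=999, seed=0):
    """Permutation test based on the squared sample distance covariance.

    Returns (statistic, p_value). Distances between compositions are
    Aitchison distances (Euclidean distances between CLR vectors).
    """
    a = double_center(pairwise_dist(clr(counts)))
    b = double_center(pairwise_dist(predictor))
    observed = (a * b).mean()

    rng = np.random.default_rng(seed)
    n = a.shape[0]
    exceed = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Permuting the samples of the predictor permutes rows and columns of b.
        if (a * b[np.ix_(p, p)]).mean() >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

Under these assumptions, `dcov_perm_test(counts, status)` takes a samples-by-cell-types count matrix and a numeric predictor such as a 0/1 disease indicator. A categorical predictor with more than two levels would need an indicator (one-hot) coding before distances are computed.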

Funder

National Institute of Arthritis and Musculoskeletal and Skin Diseases

University of Colorado Cancer Center

Boettcher Foundation Webb-Waring Biomedical research

Publisher

Oxford University Press (OUP)

Subject

Cell Biology, Developmental Biology, Embryology, Anatomy

Cited by 3 articles.
