Abstract
Quality Control (QC) of samples is an essential preliminary step in cytometry data analysis. Notably, identification of potential batch effects and outlying samples is paramount, to avoid mistaking these effects for true biological signal in downstream analyses. However, this task can prove to be delicate and tedious, especially for datasets with many samples.

Here, we present CytoMDS, a Bioconductor package implementing a dedicated method for low dimensional representation of cytometry samples composed of marker expressions for up to millions of single cells. This method allows a global representation of all samples of a study, with one single point per sample, in such a way that projected distances can be visually interpreted. It uses Earth Mover’s Distance for assessing dissimilarities between multi-dimensional distributions of marker expression, and Multi Dimensional Scaling for low dimensional projection of distances. Some additional visualization tools, both for projection quality diagnosis and for user interpretation of the projection coordinates, are also provided in the package.

We demonstrate the strengths and advantages of CytoMDS for QC of cytometry data on three real biological datasets, revealing the presence of low quality samples, batch effects and biological signal between sample groups.
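
The two steps named in the abstract, pairwise Earth Mover's Distance between samples followed by a low dimensional projection of the distance matrix, can be illustrated with a minimal Python sketch. This is not the CytoMDS API (CytoMDS is an R/Bioconductor package). All function names here are hypothetical, and two simplifications are assumed for illustration: the multi-dimensional distance is approximated by summing one-dimensional distances over markers, and the projection uses classical (Torgerson) scaling rather than the stress majorization (SMACOF) approach listed in the references.

```python
import math
import random


def emd_1d(a, b):
    """1-D Earth Mover's Distance: integral of |F_a(x) - F_b(x)| over x,
    where F_a and F_b are the empirical CDFs of the two samples."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    i = j = 0
    total = 0.0
    for left, right in zip(points, points[1:]):
        while i < len(a) and a[i] <= left:
            i += 1
        while j < len(b) and b[j] <= left:
            j += 1
        total += abs(i / len(a) - j / len(b)) * (right - left)
    return total


def sample_distance(s1, s2):
    """Assumed simplification: sum of per-marker 1-D EMDs.
    Each sample is a list of cells; each cell is a list of marker values."""
    n_markers = len(s1[0])
    return sum(
        emd_1d([cell[m] for cell in s1], [cell[m] for cell in s2])
        for m in range(n_markers)
    )


def pairwise_emd(samples):
    """Symmetric matrix of distances between all pairs of samples."""
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = sample_distance(samples[i], samples[j])
    return dist


def top_eigenpairs(matrix, k, iters=500):
    """Leading eigenpairs of a symmetric matrix by power iteration with
    deflation. Adequate for a small sketch, not for production use."""
    n = len(matrix)
    m = [row[:] for row in matrix]
    rng = random.Random(0)
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(n)]
        lam = 0.0
        for _ in range(iters):
            w = [sum(m[r][c] * v[c] for c in range(n)) for r in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                lam = 0.0
                break
            v = [x / norm for x in w]
            lam = sum(
                v[r] * sum(m[r][c] * v[c] for c in range(n)) for r in range(n)
            )
        pairs.append((lam, v))
        for r in range(n):  # deflate: remove the component just found
            for c in range(n):
                m[r][c] -= lam * v[r] * v[c]
    return pairs


def classical_mds(dist, k=2):
    """Classical MDS: double-center the squared distances, then scale the
    leading eigenvectors by the square roots of their eigenvalues."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    b = [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand) for j in range(n)]
        for i in range(n)
    ]
    pairs = top_eigenpairs(b, k)
    return [[math.sqrt(max(lam, 0.0)) * v[i] for lam, v in pairs] for i in range(n)]
```

Calling `classical_mds(pairwise_emd(samples))` returns one coordinate pair per sample, so that each sample can be drawn as a single point. Classical scaling is chosen here only because it fits in a few lines; an iterative stress-minimizing method would generally be preferred when the distances are not exactly Euclidean.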
Publisher
Cold Spring Harbor Laboratory