Smooth Quantile Normalization

Published: 2016-11-02

Authors:

Hicks Stephanie C, Okrah Kwame, Paulson Joseph N, Quackenbush John, Irizarry Rafael A, Corrada Bravo Héctor

Abstract

Between-sample normalization is a critical step in genomic data analysis to remove systematic bias and unwanted technical variation in high-throughput data. Global normalization methods are based on the assumption that observed variability in global properties is due to technical reasons and is unrelated to the biology of interest. For example, some methods correct for differences in sequencing read counts by scaling features to have similar median values across samples, but these fail to reduce other forms of unwanted technical variation. Methods such as quantile normalization transform the statistical distributions across samples to be the same and assume that global differences in the distribution are induced only by technical variation. However, it remains unclear how to proceed with normalization if these assumptions are violated, for example if there are global differences in the statistical distributions between biological conditions or groups, and external information, such as negative or control features, is not available. Here we introduce a generalization of quantile normalization, referred to as smooth quantile normalization (qsmooth), which is based on the assumption that the statistical distribution of each sample should be the same (or have the same distributional shape) within biological groups or conditions, while allowing the distributions to differ between groups. We illustrate the advantages of our method on several high-throughput datasets with global differences in distributions corresponding to different biological conditions. We also perform a Monte Carlo simulation study to illustrate the bias-variance tradeoff of qsmooth compared to other global normalization methods. A software implementation is available from https://github.com/stephaniehicks/qsmooth.
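
To make the two assumptions in the abstract concrete, the following is a minimal sketch in plain Python. It does not implement qsmooth itself. It only contrasts classic quantile normalization (all samples forced to one distribution) with the same procedure applied separately within each biological group (distributions equal within groups, free to differ between groups). The function names, the rows-as-features layout, and the toy data are assumptions made for illustration; ties are broken by sort order rather than averaged.

```python
from typing import Hashable, List, Sequence

Matrix = List[List[float]]  # rows = features, columns = samples (assumed layout)


def quantile_normalize(x: Matrix) -> Matrix:
    """Classic quantile normalization: every column ends up with the same values."""
    n_rows, n_cols = len(x), len(x[0])

    # For each sample, the row indices ordered from smallest to largest value.
    orders = []
    for j in range(n_cols):
        column = [x[i][j] for i in range(n_rows)]
        orders.append(sorted(range(n_rows), key=column.__getitem__))

    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(x[orders[j][k]][j] for j in range(n_cols)) / n_cols
        for k in range(n_rows)
    ]

    # Put the k-th reference value where each sample had its k-th smallest value.
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        for k, i in enumerate(orders[j]):
            out[i][j] = reference[k]
    return out


def groupwise_quantile_normalize(x: Matrix, groups: Sequence[Hashable]) -> Matrix:
    """Quantile-normalize the samples of each group separately.

    Distributions become identical within a group but may differ between groups.
    """
    out = [[0.0] * len(groups) for _ in x]
    for label in dict.fromkeys(groups):  # unique labels, first-seen order
        cols = [j for j, g in enumerate(groups) if g == label]
        sub = [[row[j] for j in cols] for row in x]
        normalized = quantile_normalize(sub)
        for i, row in enumerate(normalized):
            for c, j in enumerate(cols):
                out[i][j] = row[c]
    return out


if __name__ == "__main__":
    # Toy data: 3 features, 4 samples; group "b" has globally larger values.
    x = [
        [1.0, 2.0, 10.0, 12.0],
        [3.0, 6.0, 30.0, 34.0],
        [5.0, 4.0, 20.0, 26.0],
    ]
    groups = ["a", "a", "b", "b"]
    print(quantile_normalize(x))
    print(groupwise_quantile_normalize(x, groups))
```

On the toy data, the global version removes the difference between groups "a" and "b" entirely, while the group-wise version preserves it. The abstract frames the choice between such approaches as a bias-variance tradeoff.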

Publisher

Cold Spring Harbor Laboratory

Cited by 8 articles.

1. Connectivity of variants in eQTL networks dictates reproducibility and functionality; 2019-01-09

2. Analysis and correction of compositional bias in sparse sequencing count data; BMC Genomics; 2018-11-06

3. Exploring regulation in tissues with eQTL networks; Proceedings of the National Academy of Sciences; 2017-08-29

4. Analysis and correction of compositional bias in sparse sequencing count data; 2017-05-26

5. Identifying core biological processes distinguishing human eye tissues with precise systems-level gene expression analyses and weighted correlation networks; 2017-05-11