Differential gene expression analysis based on linear mixed model corrects false positive inflation for studying quantitative traits

Published:2023-10-03 Issue:1 Volume:13 Page:
ISSN:2045-2322
Container-title:Scientific Reports
language:en
Short-container-title:Sci Rep

Author:

Tang Shizhen,Buchman Aron S.,Wang Yanling,Avey Denis,Xu Jishu,Tasaki Shinya,Bennett David A.,Zheng Qi,Yang Jingjing

Abstract

Differential gene expression (DGE) analysis has been widely employed to identify genes expressed differentially with respect to a trait of interest using RNA sequencing (RNA-Seq) data. Recent RNA-Seq data with large samples pose challenges to existing DGE methods, which were mainly developed for dichotomous traits and small sample sizes. Especially, existing DGE methods are likely to result in inflated false positive rates. To address this gap, we employed a linear mixed model (LMM) that has been widely used in genetic association studies for DGE analysis of quantitative traits. We first applied the LMM method to the discovery RNA-Seq data of dorsolateral prefrontal cortex (DLPFC) tissue (n = 632) with four continuous measures of Alzheimer’s Disease (AD) cognitive and neuropathologic traits. The quantile–quantile plots of p-values showed that false positive rates were well calibrated by LMM, whereas other methods not accounting for sample-specific mixed effects led to serious inflation. LMM identified 37 potentially significant genes with differential expression in DLPFC for at least one of the AD traits, 17 of which were replicated in the additional RNA-Seq data of DLPFC, supplemental motor area, spinal cord, and muscle tissues. This application study showed not only well calibrated DGE results by LMM, but also possibly shared gene regulatory mechanisms of AD traits across different relevant tissues.

Funder

National Institutes of Health

Publisher

Springer Science and Business Media LLC

Subject

Multidisciplinary

Link

https://www.nature.com/articles/s41598-023-43686-7.pdf

Reference: 56 articles.

1. Behjati, S. & Tarpey, P. S. What is next generation sequencing? Archiv. Dis. Childhood Educ. Pract. Edn. 98, 236–238. https://doi.org/10.1136/archdischild-2013-304340 (2013).

2. Reuter, J. A., Spacek, D. V. & Snyder, M. P. High-throughput sequencing technologies. Mol Cell 58, 586–597. https://doi.org/10.1016/j.molcel.2015.05.004 (2015).

3. Kukurba, K. R. & Montgomery, S. B. RNA sequencing and analysis. Cold Spring Harbor Protocols 2015, pdb.top084970 (2015).

4. Costa-Silva, J., Domingues, D. & Lopes, F. M. RNA-Seq differential expression analysis: An extended review and a software tool. PloS One 12, e0190152 (2017).

5. Young, M. D. et al. In Bioinformatics for High Throughput Sequencing (eds Rodríguez-Ezpeleta, N. et al.) 169–190 (Springer, 2012).

Cited by 2 articles.

1. TEMINET: A Co-Informative and Trustworthy Multi-Omics Integration Network for Diagnostic Prediction;International Journal of Molecular Sciences;2024-01-29

2. TEMINET: A Co-Informative and Trustworthy Multi-Omics Integration Network for Diagnostic Prediction;2024-01-04