DLRAPom: a hybrid pipeline of Optimized XGBoost-guided integrative multiomics analysis for identifying targetable disease-related lncRNA–miRNA–mRNA regulatory axes

Author:

Shen Chen 1, Li Huiyu 2, Li Miao 3, Niu Yu 4, Liu Jing 5, Zhu Li 2, Gui Hongsheng 6, Han Wei 2, Wang Huiying 2, Zhang Wenpei 2, Wang Xiaochen 2, Luo Xiao 5, Sun Yu 7, Yan Jiangwei 8, Guan Fanglin 1

Affiliation:

1. Shanghai Key Laboratory of Forensic Medicine, Academy of Forensic Science; Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Health Science Center, Xi’an Jiaotong University, Xi’an, China

2. Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Health Science Center, Xi’an Jiaotong University, Xi’an, China

3. Department of Ultrasound, the Second Affiliated Hospital, Xi’an Jiaotong University, Xi’an, China

4. Department of Endocrinology and Metabolism, Ninth Hospital of Xi’an City, Xi’an, China

5. Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Health Science Center, Xi’an Jiaotong University, Xi’an, China

6. Center for Behavior Health and Psychiatry Research, Henry Ford Health System, Detroit, MI, USA

7. Department of Endocrinology and Metabolism, Qilu Hospital of Shandong University, Ji’nan, China

8. Department of Genetics, School of Medicine & Forensics, Shanxi Medical University, Taiyuan, China

Abstract

A reliable, easy-to-operate screening pipeline for disease-related noncoding RNA regulatory axes is still lacking. To address this, we designed a hybrid pipeline, disease-related lncRNA–miRNA–mRNA regulatory axis prediction from multiomics (DLRAPom), which identifies risk biomarkers and disease-related lncRNA–miRNA–mRNA regulatory axes by adding a novel machine learning model to conventional analysis and combining the result with experimental validation. The pipeline consists of four parts: selecting hub biomarkers by conventional bioinformatics analysis, discovering the most essential protein-coding biomarkers with the novel machine learning model, extracting the key lncRNA–miRNA–mRNA axis, and validating it experimentally. Our study is the first to propose a pipeline that predicts interactions among lncRNAs, miRNAs and mRNAs by combining weighted gene co-expression network analysis (WGCNA) and XGBoost. In contrast to previously reported methods, we developed an Optimized XGBoost model to reduce overfitting on multiomics data, thereby improving the generalization ability of the overall model for integrated multiomics analysis. Applied to gestational diabetes mellitus (GDM), the pipeline predicted nine risk protein-coding biomarkers and several potential lncRNA–miRNA–mRNA regulatory axes, all of which correlated with GDM. Among these axes, the MALAT1/hsa-miR-144-3p/IRS1 axis was predicted to be the key axis and was identified as being associated with GDM for the first time. In short, DLRAPom is a flexible pipeline that can contribute to research on the molecular pathogenesis of diseases by effectively predicting potential disease-related noncoding RNA regulatory networks and providing promising candidates for functional studies.
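The third stage of the pipeline, extracting lncRNA–miRNA–mRNA axes, can be pictured as a join on the shared miRNA. The sketch below is only an illustration under assumptions: the abstract does not state how DLRAPom assembles axes, so the function name `extract_axes`, its inputs, and the placeholder identifiers `LNC_A`, `miR_X`, `miR_Y` and `GENE_B` are hypothetical. The toy pairs are not data from the study; only the MALAT1/hsa-miR-144-3p/IRS1 axis is taken from the abstract.

```python
from collections import defaultdict


def extract_axes(lnc_mir_pairs, mir_mrna_pairs, hub_mrnas):
    """Join lncRNA-miRNA and miRNA-mRNA pairs on the shared miRNA.

    Hypothetical helper: keeps only axes whose mRNA is in hub_mrnas
    (for example, biomarkers retained by an earlier ranking step).
    """
    # Index the hub mRNA targets of each miRNA.
    targets_by_mirna = defaultdict(set)
    for mirna, mrna in mir_mrna_pairs:
        if mrna in hub_mrnas:
            targets_by_mirna[mirna].add(mrna)

    # Chain every lncRNA to the hub targets of the miRNA it pairs with.
    axes = set()
    for lnc, mirna in lnc_mir_pairs:
        for mrna in targets_by_mirna.get(mirna, ()):
            axes.add((lnc, mirna, mrna))
    return sorted(axes)


# Toy input (illustrative only; placeholder names are hypothetical).
lnc_mir = [("MALAT1", "hsa-miR-144-3p"), ("LNC_A", "miR_X")]
mir_mrna = [("hsa-miR-144-3p", "IRS1"), ("miR_X", "GENE_B"), ("miR_Y", "IRS1")]
hubs = {"IRS1"}

axes = extract_axes(lnc_mir, mir_mrna, hubs)
for axis in axes:
    print("/".join(axis))  # prints MALAT1/hsa-miR-144-3p/IRS1
```

In this toy input, `GENE_B` is dropped because it is not a hub biomarker, and `miR_Y` contributes nothing because no lncRNA pairs with it. A real analysis would draw the pairs from interaction databases and expression data rather than hand-written lists.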

Funder

National Natural Science Foundation of China

Shaanxi Province Innovative Talent Promotion Plan-Youth Project

Shanghai Key Laboratory of Forensic Medicine

Academy of Forensic Science

Fundamental Research Funds for the Central Universities

Publisher

Oxford University Press (OUP)

Subject

Molecular Biology, Information Systems
