Dataset of miRNA–disease relations extracted from textual data using transformer-based neural networks

Author:

Madan Sumit (1), Kühnel Lisa (2, 3), Fröhlich Holger (1, 4), Hofmann-Apitius Martin (1, 4), Fluck Juliane (2, 3, 5)

Affiliation:

1. Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany

2. Knowledge Management, German National Library of Medicine (ZB MED) – Information Centre for Life Sciences, Friedrich-Hirzebruch-Allee 4, Bonn 53115, Germany

3. Graduate School DILS, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Faculty of Technology, Bielefeld University, Postfach 10 01 31, Bielefeld, Nordrhein-Westfalen 33501, Germany

4. Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, Friedrich-Hirzebruch-Allee 6, Bonn 53113, Germany

5. Information Management, Institute of Geodesy and Geoinformation, University of Bonn, Katzenburgweg 1a, Bonn 53115, Germany

Abstract

MicroRNAs (miRNAs) play important roles in post-transcriptional processes and regulate major cellular functions. Abnormal regulation of miRNA expression has been linked to numerous human diseases, such as respiratory diseases, cancer, and neurodegenerative diseases. The latest miRNA–disease associations are found predominantly in unstructured biomedical literature. Retrieving these associations manually is cumbersome and time-consuming because the number of publications grows continuously. We propose a deep learning-based text mining approach that extracts normalized miRNA–disease associations from biomedical literature. To train the deep learning models, we build a new training corpus that is extended by distant supervision utilizing multiple external databases. A quantitative evaluation shows that the workflow achieves an area under the receiver operating characteristic curve (AUROC) of 98% on a holdout test set for the detection of miRNA–disease associations. We demonstrate the applicability of the approach by extracting new miRNA–disease associations from biomedical literature (PubMed and PubMed Central). Quantitative analysis and evaluation on three different neurodegenerative diseases show that the approach can effectively extract miRNA–disease associations not yet available in public databases.

Database URL: https://zenodo.org/records/10523046
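The evaluation metric named in the abstract, the area under the receiver operating characteristic curve, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `auroc` and the toy labels and scores are assumptions made purely for illustration. The sketch uses the standard rank-sum identity, under which the metric equals the probability that a randomly chosen positive example (here, a sentence that states a miRNA–disease association) receives a higher classifier score than a randomly chosen negative one, with ties counted as one half.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 ground-truth values (1 = association present).
    scores: iterable of classifier scores, higher meaning "more likely positive".
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores the average of their ranks.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative example")

    # U statistic of the positive class, normalised by the number of pos/neg pairs.
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy data (hypothetical): four candidate sentences with gold labels and scores.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
```

In this toy case, three of the four positive/negative pairs are ordered correctly, so `toy_auroc` is 0.75. A value of 98%, as reported for the holdout test set, would mean that almost every such pair is ranked correctly.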

Funder

Bundesministerium für Bildung und Forschung

Publisher

Oxford University Press (OUP)
