Unsupervised and self-supervised deep learning approaches for biomedical text mining

Published:2021-02-11 Issue:2 Volume:22 Page:1592-1603
ISSN:1467-5463
Container-title:Briefings in Bioinformatics
language:en

Author:

Nadif Mohamed¹,Role François¹

Affiliation:

1. Université de Paris, CNRS, Centre Borelli, France

Abstract

Biomedical scientific literature is growing at a very rapid pace, which makes it increasingly difficult for human experts to spot the most relevant results hidden in the papers. Automated information extraction tools based on text mining techniques are therefore needed to assist them in this task. In the last few years, techniques based on deep neural networks have significantly advanced the state of the art in this research area. Although the contribution of supervised methods to this progress is relatively well known, the contribution of other kinds of learning, namely unsupervised and self-supervised learning, is less so. Unsupervised learning does not incur the cost of creating labels, which is very useful in the exploratory stages of a biomedical study, where agile techniques are needed to rapidly explore many paths. In particular, clustering techniques applied to biomedical text mining make it possible to gather large sets of documents into more manageable groups, and deep learning techniques have made it possible to produce new clustering-friendly representations of the data. Self-supervised learning, on the other hand, is a kind of supervised learning in which the labels do not have to be created manually by humans but are derived automatically from relations found in the input texts. In combination with innovative network architectures (e.g. transformer-based architectures), self-supervised techniques have made it possible to design increasingly effective vector-based word representations (word embeddings). We show in this survey how word representations obtained in this way interact successfully with common supervised modules (e.g. classification networks), to whose performance they contribute greatly.

Publisher

Oxford University Press (OUP)

Subject

Molecular Biology,Information Systems

Link

http://academic.oup.com/bib/article-pdf/22/2/1592/36655075/bbab016.pdf

Reference: 91 articles.

1. Ensemble block co-clustering: a unified framework for text data;Affeldt,2020

2. Spectral clustering via ensemble deep autoencoder learning (SC-EDAE);Affeldt;Pattern Recogn,2020

3. Co-clustering document-term matrices by direct maximization of graph modularity;Ailem,2015

4. Graph modularity maximization as an effective method for co-clustering text data;Ailem;Knowl-Based Syst,2016

5. Model-based co-clustering for the effective handling of sparse data;Ailem;Pattern Recogn,2017

Cited by 48 articles.

1. A comprehensive survey for automatic text summarization: Techniques, approaches and perspectives;Neurocomputing;2024-10

2. Differentiation of granulomatous nodules with lobulation and spiculation signs from solid lung adenocarcinomas using a CT deep learning model;BMC Cancer;2024-07-22

3. LLM-Powered Natural Language Text Processing for Ontology Enrichment;Applied Sciences;2024-07-04

4. Data-driven interpretable analysis for polysaccharide yield prediction;Environmental Science and Ecotechnology;2024-05

5. Recent Advancements in Subcellular Proteomics: Growing Impact of Organellar Protein Niches on the Understanding of Cell Biology;Journal of Proteome Research;2024-03-07