sc-REnF: An entropy guided robust feature selection for single-cell RNA-seq data

Author:

Lall Snehalika (1), Ghosh Abhik (2), Ray Sumanta (3, 4), Bandyopadhyay Sanghamitra (1)

Affiliation:

1. Machine Intelligence Unit, Indian Statistical Institute, Kolkata, 700108, West Bengal, India

2. Interdisciplinary Statistical Research Unit, Kolkata, 700108, West Bengal, India

3. Department of Computer Science and Engineering, Aliah University, Kolkata, India

4. Health Analytics Network, PA, USA

Abstract

Annotation of cells in single-cell clustering requires a homogeneous grouping of cell populations. Because single-cell data are susceptible to technical noise, the quality of the genes selected before clustering is crucial in the preliminary steps of downstream analysis. Interest in robust gene selection has therefore grown considerably in recent years. We introduce sc-REnF [robust entropy-based feature (gene) selection method], which aims to leverage the advantages of Rényi and Tsallis entropies in gene selection for single-cell clustering. Experiments demonstrate that, with a tuned parameter ($q$), Rényi and Tsallis entropies select genes that improve the clustering results significantly over the competing methods. sc-REnF captures relevancy and redundancy among the features of noisy data extremely well owing to its robust objective function. Moreover, the selected features/genes can identify unknown cells with high accuracy. Finally, sc-REnF yields good clustering performance on small-sample, large-feature scRNA-seq data.

Availability: sc-REnF is available at https://github.com/Snehalikalall/sc-REnF
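
For readers unfamiliar with the two entropies named in the abstract, the sketch below computes the textbook Rényi and Tsallis entropies of order $q$ for a discrete probability distribution. It is only an illustration of the standard definitions, not the sc-REnF objective function, which the abstract does not specify. The function names `renyi_entropy` and `tsallis_entropy` are hypothetical and are not taken from the sc-REnF repository.

```python
import math


def _shannon(probs):
    # Shannon entropy (natural log); the common q -> 1 limit of both families.
    return -sum(x * math.log(x) for x in probs)


def renyi_entropy(p, q):
    """Renyi entropy of order q: log(sum_i p_i^q) / (1 - q)."""
    probs = [x for x in p if x > 0]  # zero-probability outcomes contribute nothing
    if math.isclose(q, 1.0):
        return _shannon(probs)
    return math.log(sum(x ** q for x in probs)) / (1.0 - q)


def tsallis_entropy(p, q):
    """Tsallis entropy of order q: (1 - sum_i p_i^q) / (q - 1)."""
    probs = [x for x in p if x > 0]
    if math.isclose(q, 1.0):
        return _shannon(probs)
    return (1.0 - sum(x ** q for x in probs)) / (q - 1.0)


# Illustrative input (an assumption, not data from the paper):
# a uniform distribution over four expression bins.
uniform = [0.25, 0.25, 0.25, 0.25]
print(renyi_entropy(uniform, 2.0))
print(tsallis_entropy(uniform, 2.0))
```

Both quantities reduce to the Shannon entropy as $q$ approaches 1, which is why $q$ can be treated as a tunable parameter that moves the measure away from the Shannon case.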

Funder

SyMeC Project

Department of Biotechnology

Publisher

Oxford University Press (OUP)

Subject

Molecular Biology, Information Systems


