High-throughput cryo-ET structural pattern mining by unsupervised deep iterative subtomogram clustering

Author:

Zeng Xiangrui (1), Kahng Anson (2), Xue Liang (3, 4), Mahamid Julia (3), Chang Yi-Wei (5), Xu Min (1)

Affiliation:

1. Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213

2. Computer Science Department, University of Rochester, Rochester, NY 14620

3. Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany

4. Faculty of Biosciences, Collaboration for joint PhD degree between European Molecular Biology Laboratory and Heidelberg University, Heidelberg 69117, Germany

5. Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104

Abstract

Cryoelectron tomography (cryo-ET) directly visualizes heterogeneous macromolecular structures in their native and complex cellular environments. However, existing computer-assisted structure sorting approaches are low throughput or inherently limited due to their dependency on available templates and manual labels. Here, we introduce a high-throughput template-and-label-free deep learning approach, Deep Iterative Subtomogram Clustering Approach (DISCA), that automatically detects subsets of homogeneous structures by learning and modeling 3D structural features and their distributions. Evaluation on five experimental cryo-ET datasets shows that an unsupervised deep learning-based method can detect diverse structures with a wide range of molecular sizes. This unsupervised detection paves the way for systematic unbiased recognition of macromolecular complexes in situ.

Funder

HHS | NIH | National Institute of General Medical Sciences

NSF | BIO | Division of Biological Infrastructure

NSF | CISE | Division of Information and Intelligent Systems

Mark Foundation For Cancer Research

European Molecular Biology Laboratory

David and Lucile Packard Foundation

Advanced Micro Devices

CMU | SCS | Center for Machine Learning and Health, School of Computer Science, Carnegie Mellon University

NSF | BIO | Division of Molecular and Cellular Biosciences

Publisher

Proceedings of the National Academy of Sciences

Subject

Multidisciplinary
