doubleHelix: nucleic acid sequence identification, assignment and validation tool for cryo-EM and crystal structure models

Published: 2023-02-17

Author:

Chojnowski Grzegorz

Abstract

Sequence assignment is a key step of the model-building process in both cryogenic electron microscopy (cryo-EM) and macromolecular crystallography (MX). If the assignment fails, it can result in errors that are difficult to identify and that affect the interpretation of a model. Many model validation strategies help experimentalists in this step of protein model building, but they are virtually non-existent for nucleic acids. Here I present doubleHelix, a comprehensive method for assignment, identification, and validation of nucleic acid sequences in structures determined using cryo-EM and MX. The method combines a neural network classifier of nucleobase identities and a sequence-independent secondary structure assignment approach. I show that the presented method can successfully assist model building at lower resolutions, where visual map interpretation is very difficult. Moreover, I present examples of sequence assignment errors detected using doubleHelix in cryo-EM and MX structures of ribosomes deposited in the Protein Data Bank, which escaped the scrutiny of available model-validation approaches. The doubleHelix program source code is available under the BSD-3 license at https://gitlab.com/gchojnowski/doublehelix.
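
To make the idea of sequence assignment from classifier output concrete, the following is a minimal sketch and not the doubleHelix implementation. It assumes a hypothetical classifier has already produced, for each residue of a built fragment, a probability for each of the four bases, and it places the fragment on a target sequence by picking the offset with the highest summed log-probability. The names `BASES`, `score_alignment`, and `best_offset`, the probability values, and the target sequence are all illustrative assumptions; the secondary structure step mentioned in the abstract is not modelled here.

```python
import math

# Column order of the per-residue probabilities (an assumption of this sketch).
BASES = "ACGU"


def score_alignment(probs, target, offset):
    """Sum the log-probabilities of the target bases when the fragment
    is placed at the given offset in the target sequence."""
    total = 0.0
    for i, residue_probs in enumerate(probs):
        base = target[offset + i]
        # Floor the probability to avoid log(0) for confident misclassifications.
        total += math.log(max(residue_probs[BASES.index(base)], 1e-9))
    return total


def best_offset(probs, target):
    """Return (offset, score) of the ungapped placement with the highest score."""
    n = len(probs)
    if n > len(target):
        raise ValueError("fragment is longer than the target sequence")
    scores = [score_alignment(probs, target, k) for k in range(len(target) - n + 1)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical classifier output for a three-residue fragment.
probs = [
    [0.1, 0.1, 0.7, 0.1],  # most likely G
    [0.6, 0.2, 0.1, 0.1],  # most likely A
    [0.1, 0.1, 0.1, 0.7],  # most likely U
]
target = "ACGAUC"  # hypothetical target sequence

offset, score = best_offset(probs, target)
print(offset, round(score, 3))
```

In this toy example the fragment matches the target best at the position of the substring GAU. A low best score, or several offsets with similar scores, would hint that the placement is ambiguous, which is the kind of signal a validation step could use.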

Publisher

Cold Spring Harbor Laboratory
