Abstract
Motivation
Coiled-coil domains (CCDs) are widespread in all organisms and perform several crucial functions. Given their relevance, the computational detection of coiled-coil domains is very important for protein functional annotation. State-of-the-art prediction methods include the precise identification of coiled-coil domain boundaries, the annotation of the typical heptad repeat pattern along the coiled-coil helices, and the prediction of the oligomerization state.

Results
In this paper we describe CoCoNat, a novel method for predicting coiled-coil helix boundaries, residue-level register annotation, and oligomerization state. Our method encodes sequences with a combination of two state-of-the-art protein language models and implements a three-step deep learning procedure coupled with a Grammatical-Restrained Hidden Conditional Random Field (GRHCRF) for CCD identification and refinement. A final neural network (NN) predicts the oligomerization state. When tested on a routinely adopted blind test set, CoCoNat outperforms the current state of the art in both residue-level and segment-level coiled-coil detection. CoCoNat also significantly outperforms the most recent state-of-the-art method on register annotation and prediction of oligomerization states.

Availability
CoCoNat is available at https://coconat.biocomp.unibo.it.

Contact
pierluigi.martelli@unibo.it
Publisher
Cold Spring Harbor Laboratory