Author:
Tang Li, Huang Wenjie, Hill Matthew C., Ellinor Patrick T., Li Min
Abstract
The organization of the three-dimensional (3D) genome is complex and requires a plethora of proteins to ensure the proper formation and regulation of chromatin loops as well as higher-order structures. Studying protein-mediated loop regulation can help unravel the intricate interplay between these loops and their crucial roles in modulating gene expression across different cellular contexts. However, current targeted chromatin conformation capture experiments face limitations in capturing protein-mediated loops across various cell types, and existing computational methods fail to predict diverse protein-mediated loops. To address these issues, we propose a fusion neural network (FusNet) designed for predicting protein-mediated loops. FusNet leverages genome sequence information, open chromatin, and ChIP-seq data to efficiently represent and analyze the positions of loop anchors. To extract informative features and reduce the complexity of FusNet, we constructed a convolutional neural network, which compresses the dimensionality of the features while preserving the most significant ones. To enhance the accuracy and generalization capacity of FusNet, we built a fusion layer by stacking the predictions of the fundamental models with a meta-model. FusNet demonstrated its effectiveness in predicting protein-mediated loops, exhibiting high consistency with Hi-C data. Moreover, we found that the loops predicted by FusNet are highly associated with regulatory functions. Through association analysis with genetic risk variants, FusNet further revealed its potential for unraveling disease-related mechanisms.
In conclusion, our study offers a novel computational approach for predicting various protein-mediated chromatin loops, which could substantially enhance research on the functional significance of protein-mediated loop structures in diverse cellular contexts.
Significance Statement
The intricate spatial organization of the three-dimensional (3D) genome involves functional proteins that contribute critically to chromatin loop formation and regulation. Understanding these protein-mediated loops is vital for elucidating their influence on 3D genome architecture and gene regulation across different cell types and disease-related contexts. In this study, we propose a Fusion Neural Network (FusNet) for predicting protein-mediated loops. FusNet can concurrently capture and analyze multiple protein-mediated loops in various cell types to advance our understanding of the multitude of protein-mediated loop structures and their functional significance. Importantly, through association analysis with risk variants, FusNet shows potential for revealing disease-related mechanisms.
Publisher
Cold Spring Harbor Laboratory