Abstract
Discovering DNA regulatory sequence motifs and their relative positions is vital to understanding the mechanisms of gene expression regulation. Such complicated motif grammars are difficult to summarize with shallow models. Although Deep Convolutional Neural Networks (DCNNs) have achieved great success in annotating cis-regulatory elements, few combinatorial motif grammars have been accurately interpreted because of the mixed signals in DCNNs. To address this problem, we propose NeuronMotif, a general backward decoupling algorithm, to reveal the homo-/hetero-typic motif combinations and arrangements embedded in convolutional neurons. We applied NeuronMotif to several widely used DCNN models. Many of the uncovered motif grammars of deep convolutional neurons are supported by the literature or by ATAC-seq footprinting. We further diagnosed the sick neurons that are sensitive to adversarial noise, which can guide DCNN architecture optimization for better prediction performance and motif feature extraction. Overall, NeuronMotif enables decoding cis-regulatory codes from deep convolutional neurons and understanding DCNNs from a novel perspective.
Publisher
Cold Spring Harbor Laboratory