Domain-adaptive neural networks improve cross-species prediction of transcription factor binding

Published:2022-01-18
Volume:32
Issue:3
Page:512-523
ISSN:1088-9051
Container-title:Genome Research
language:en
Short-container-title:Genome Res.

Author:

Cochran Kelly, Srivastava Divyanshi, Shrikumar Avanti, Balsubramani Akshay, Hardison Ross C., Kundaje Anshul, Mahony Shaun

Abstract

The intrinsic DNA sequence preferences and cell type–specific cooperative partners of transcription factors (TFs) are typically highly conserved. Hence, despite the rapid evolutionary turnover of individual TF binding sites, predictive sequence models of cell type–specific genomic occupancy of a TF in one species should generalize to closely matched cell types in a related species. To assess the viability of cross-species TF binding prediction, we train neural networks to discriminate ChIP-seq peak locations from genomic background and evaluate their performance within and across species. Cross-species predictive performance is consistently worse than within-species performance, which we show is caused in part by species-specific repeats. To account for this domain shift, we use an augmented network architecture to automatically discourage learning of training species–specific sequence features. This domain adaptation approach corrects for prediction errors on species-specific repeats and improves overall cross-species model performance. Our results show that cross-species TF binding prediction is feasible when models account for domain shifts driven by species-specific repeats.

Funder

National Institutes of Health

National Institute of General Medical Sciences

National Science Foundation

Stanford Graduate Fellowship

National Institute of Diabetes and Digestive and Kidney Diseases

Publisher

Cold Spring Harbor Laboratory

Subject

Genetics (clinical), Genetics