DeepSATA: A Deep Learning-Based Sequence Analyzer Incorporating the Transcription Factor Binding Affinity to Dissect the Effects of Non-Coding Genetic Variants-Reference-Cited by-同舟云学术

DeepSATA: A Deep Learning-Based Sequence Analyzer Incorporating the Transcription Factor Binding Affinity to Dissect the Effects of Non-Coding Genetic Variants

Published:2023-07-27 Issue:15 Volume:24 Page:12023
ISSN:1422-0067
Container-title:International Journal of Molecular Sciences
language:en
Short-container-title:IJMS

Author:

Ma Wenlong¹²,Fu Yang¹²,Bao Yongzhou¹²³,Wang Zhen¹²³,Lei Bowen¹²⁴,Zheng Weigang¹²⁴,Wang Chao¹²⁴,Liu Yuwen¹²⁵

Affiliation:

1. Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China

2. Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China

3. School of Life Sciences, Henan University, Kaifeng 475004, China

4. Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan 430070, China

5. Kunpeng Institute of Modern Agriculture at Foshan, Chinese Academy of Agricultural Sciences, Foshan 528226, China

Abstract

Utilizing large-scale epigenomics data, deep learning tools can predict the regulatory activity of genomic sequences, annotate non-coding genetic variants, and uncover mechanisms behind complex traits. However, these tools primarily rely on human or mouse data for training, limiting their performance when applied to other species. Furthermore, the limited exploration of many species, particularly in the case of livestock, has led to a scarcity of comprehensive and high-quality epigenetic data, posing challenges in developing reliable deep learning models for decoding their non-coding genomes. The cross-species prediction of the regulatory genome can be achieved by leveraging publicly available data from extensively studied organisms and making use of the conserved DNA binding preferences of transcription factors within the same tissue. In this study, we introduced DeepSATA, a novel deep learning-based sequence analyzer that incorporates the transcription factor binding affinity for the cross-species prediction of chromatin accessibility. By applying DeepSATA to analyze the genomes of pigs, chickens, cattle, humans, and mice, we demonstrated its ability to improve the prediction accuracy of chromatin accessibility and achieve reliable cross-species predictions in animals. Additionally, we showcased its effectiveness in analyzing pig genetic variants associated with economic traits and in increasing the accuracy of genomic predictions. Overall, our study presents a valuable tool to explore the epigenomic landscape of various species and pinpoint regulatory deoxyribonucleic acid (DNA) variants associated with complex traits.

Funder

National Key R&D Program of China

China National Key R&D Program during the 14th Five-year Plan Period

National Natural Science Foundation of China

Publisher

MDPI AG

Subject

Inorganic Chemistry,Organic Chemistry,Physical and Theoretical Chemistry,Computer Science Applications,Spectroscopy,Molecular Biology,General Medicine,Catalysis

Link

https://www.mdpi.com/1422-0067/24/15/12023/pdf

Reference46 articles.

1. Genome-wide association studies;Uffelmann;Nat. Rev. Methods Primers,2021

2. Systematic localization of common disease-associated variation in regulatory DNA;Maurano;Science,2012

3. SLE non-coding genetic risk variant determines the epigenetic dysfunction of an immune cell specific enhancer that controls disease-critical microRNA expression;Hou;Nat. Commun.,2021

4. Non-coding variability at the APOE locus contributes to the Alzheimer’s risk;Zhou;Nat. Commun.,2019

5. Integration of multi-omics data reveals cis-regulatory variants that are associated with phenotypic differentiation of eastern from western pigs;Liu;Genet. Sel. Evol.,2022

Cited by 1 articles. 订阅此论文施引文献订阅此论文施引文献，注册后可以免费订阅5篇论文的施引文献，订阅后可以查看论文全部施引文献

1. AI-Assisted Rational Design and Activity Prediction of Biological Elements for Optimizing Transcription-Factor-Based Biosensors;Molecules;2024-07-26