Abstract
A gene, a locatable region of genomic sequence, is the basic functional unit of heredity. Differences in genes lead to the varied congenital physical conditions of people. One major class of these differences is caused by genetic variations called single nucleotide polymorphisms (SNPs). SNPs may affect splice sites, protein structures, and so on, and thereby cause gene abnormalities. Some abnormalities lead to fatal diseases, and people with these diseases are unlikely to have children. The distributions of SNP patterns at these sites will therefore differ from the distributions at other sites. Based on this idea, we present a novel statistical method to detect abnormal distributions of SNP patterns and then locate suspicious lethal genes. We ran the test on HapMap data and found 74 suspicious SNPs. Of these, 10 SNPs map to reviewed genes in the NCBI database. Of those genes, 5 are related to fatal childhood diseases or embryonic development, 1 can cause spermatogenic failure, and the other 4 are also associated with many genetic diseases. The results validate our idea. The method is simple and is backed by a statistical test. It is an inexpensive way to discover suspicious pathogenic genes and their mutation sites. The mined genes deserve further study.
Author summary
Xiaojun Ding received the BS, MS, and PhD degrees in computer science from Central South University. He is now an assistant professor at Yulin Normal University. His research interests include computational biology and machine learning.
Publisher
Cold Spring Harbor Laboratory