Author:
Jareborg Niclas, Birney Ewan, Durbin Richard
Abstract
A data set of 77 genomic mouse/human gene pairs was compiled from the EMBL nucleotide database, and the corresponding features of each pair were determined. This set was used to analyze the degree of conservation of noncoding sequences between mouse and human. A new alignment algorithm was developed to cope with the fact that large parts of noncoding sequences cannot be aligned in a meaningful way because of genetic drift. This new algorithm, DNA Block Aligner (DBA), finds colinear conserved blocks that are flanked by nonconserved sequences of varying lengths. The noncoding regions of the data set were aligned with DBA. The proportion of the noncoding regions covered by blocks >60% identical was 36% for upstream regions, 50% for 5′ UTRs, 23% for introns, and 56% for 3′ UTRs. These blocks of high identity were more or less evenly distributed across the length of the features, except for upstream regions, in which the first 100 bp upstream of the transcription start site was covered in up to 70% of the gene pairs. This data set complements earlier sets based on cDNA sequences and will be useful for further comparative studies. [This paper contains supplementary data that can be found at http://www.genome.com.]
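The coverage figures quoted in the abstract (for example, 23% of intron sequence covered by blocks >60% identical) can be illustrated with a small calculation. The sketch below is not DBA and is not taken from the paper; it only assumes that an aligner has already reported blocks as hypothetical `(start, end, identity)` tuples with half-open coordinates on one sequence, and it computes the fraction of a region covered by blocks above an identity threshold. The function name `covered_fraction` and the block representation are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical block representation (an assumption, not DBA's output format):
# (start, end, identity) with half-open coordinates and identity in [0, 1].
Block = Tuple[int, int, float]


def covered_fraction(region_length: int,
                     blocks: List[Block],
                     min_identity: float = 0.60) -> float:
    """Fraction of a region covered by blocks with identity > min_identity."""
    if region_length <= 0:
        raise ValueError("region_length must be positive")

    # Keep only blocks above the identity threshold, sorted by start position.
    kept = sorted((s, e) for s, e, ident in blocks if ident > min_identity)

    covered = 0
    current_end = 0  # rightmost position already counted
    for start, end in kept:
        # Clip to the region and skip any part that was already counted,
        # so overlapping blocks are not counted twice.
        start = max(start, current_end, 0)
        end = min(end, region_length)
        if end > start:
            covered += end - start
            current_end = end
    return covered / region_length


if __name__ == "__main__":
    example_blocks = [(10, 60, 0.72), (50, 80, 0.65), (120, 150, 0.55)]
    print(covered_fraction(200, example_blocks))
```

In this made-up example, the third block falls below the 60% threshold and the first two overlap, so 70 of 200 positions are counted as covered.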
Publisher
Cold Spring Harbor Laboratory
Subject
Genetics (clinical), Genetics
Cited by
168 articles.