Abstract
Despite the large amount of detailed information available in protein databases, it is not possible to automatically download a list of proteins ordered by the position of their encoding genes. This order becomes crucial when analyzing common features of proteins produced by loci or other specific regions of human chromosomes. In this study, we developed a new procedure that interrogates two human databases (genomic and protein) and produces a novel dataset of proteins ordered according to the map positions of the corresponding genes. We validated the procedure and implemented it as a user-friendly web application. This data-mining procedure was used to evaluate the distribution of critical amino acid content in proteins encoded by a human chromosome. For this purpose, we designed a new methodological approach called chromosome walking, which scans the whole chromosome and identifies the regions producing proteins enriched in a selected amino acid. As an example of biomedical application, we investigated human chromosome 15, which contains the locus DYX1 linked to developmental dyslexia, and we found three additional putative gene clusters whose expression could be driven by the environmental availability of glutamate. The novel data-mining procedure and analysis could be exploited in the study of several human pathologies.
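The chromosome walking scan described above can be illustrated with a minimal sketch. The abstract does not specify the actual algorithm, so everything here is an assumption made for illustration: the function names, the input format (tuples of gene start position, protein name, and sequence), the sliding window measured in number of consecutive genes, and the enrichment threshold are all hypothetical. The sketch only shows the general idea of ordering proteins by gene position and flagging windows whose mean content of one amino acid (for example glutamate, one-letter code E) is high.

```python
from typing import List, Tuple

# Hypothetical record: (gene start position on the chromosome, protein name, sequence)
Protein = Tuple[int, str, str]


def residue_fraction(sequence: str, residue: str) -> float:
    """Return the fraction of positions in `sequence` equal to `residue`."""
    if not sequence:
        return 0.0
    return sequence.upper().count(residue.upper()) / len(sequence)


def walk_chromosome(
    proteins: List[Protein],
    residue: str,
    window: int = 3,          # assumed window size, in consecutive genes
    threshold: float = 0.10,  # assumed cutoff on the mean residue fraction
) -> List[Tuple[int, int, float]]:
    """Slide a window over proteins ordered by gene position.

    Returns (first gene start, last gene start, mean fraction) for every
    window whose mean residue fraction is at least `threshold`.
    """
    # Order proteins by the position of the gene that encodes them.
    ordered = sorted(proteins, key=lambda p: p[0])
    fractions = [residue_fraction(seq, residue) for _, _, seq in ordered]

    hits = []
    for i in range(len(ordered) - window + 1):
        mean = sum(fractions[i:i + window]) / window
        if mean >= threshold:
            hits.append((ordered[i][0], ordered[i + window - 1][0], mean))
    return hits
```

Called as `walk_chromosome(records, "E", window=5, threshold=0.08)`, the function would return the start positions bounding each glutamate-enriched window; overlapping hits could then be merged into candidate clusters. The window size and threshold would need to be tuned to the data and are not values taken from the study.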
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
1 article.