Affiliation:
1. College of Intelligence and Computing, Tianjin University
2. School of Life Sciences, North China University of Science and Technology
Abstract
Abstract
With the development of high-throughput sequencing technology, the genomic sequences increased exponentially over the last decade. In order to decode these new genomic data, machine learning methods were introduced for genome annotation and analysis. Due to the requirement of most machines learning methods, the biological sequences must be represented as fixed-length digital vectors. In this representation procedure, the physicochemical properties of k-tuple nucleotides are important information. However, the values of the physicochemical properties of k-tuple nucleotides are scattered in different resources. To facilitate the studies on genomic sequences, we developed the first comprehensive database, namely KNIndex (https://knindex.pufengdu.org), for depositing and visualizing physicochemical properties of k-tuple nucleotides. Currently, the KNIndex database contains 182 properties including one for mononucleotide (DNA), 169 for dinucleotide (147 for DNA and 22 for RNA) and 12 for trinucleotide (DNA). KNIndex database also provides a user-friendly web-based interface for the users to browse, query, visualize and download the physicochemical properties of k-tuple nucleotides. With the built-in conversion and visualization functions, users are allowed to display DNA/RNA sequences as curves of multiple physicochemical properties. We wish that the KNIndex will facilitate the related studies in computational biology.
Funder
National Natural Science Foundation of China
National Key Research and Development Program of China
Natural Science Foundation for Distinguished Young Scholar of Hebei Province
Institute of Computing Technology, Chinese Academy of Sciences
Publisher
Oxford University Press (OUP)
Subject
Molecular Biology,Information Systems
Cited by
11 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献