Affiliation:
1. Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
Abstract
Motivation
Over the past 30 years, extended tandem repeats (TRs) have been correlated with ∼60 diseases with high odds ratios, and most known TRs consist of a single repeat unit. In the last few years, however, long-read sequencing has revealed mosaic TRs, composed of different units, that are associated with several brain disorders. Mosaic TRs are sequence configurations that are difficult to characterize, and they are usually confirmed by manual inspection. Widely used tools are not designed to handle mosaic TRs and often fail to decompose them properly.
Results
We propose an efficient algorithm that decomposes mosaic TRs in an input string with high sensitivity. Using synthetic benchmark data, we demonstrate that our program, named uTR, outperforms TRF and RepeatMasker in prediction accuracy, especially when mosaic TRs are more complex, and that uTR is faster than TRF and RepeatMasker in most cases.
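To make the decomposition task concrete, the following is a minimal Python sketch of what "decomposing a mosaic TR" means. It is not the uTR algorithm: the function name `decompose_with_known_units` is hypothetical, the candidate units are assumed to be known in advance, and the input is assumed to be error-free. Under those assumptions, a simple greedy scan is enough to split a string into runs of the form (unit, copy number).

```python
def decompose_with_known_units(seq, units):
    """Greedily split seq into (unit, count) runs, scanning left to right.

    Illustration only (not the uTR algorithm): the candidate units are
    assumed to be given, and no sequencing errors are tolerated.
    """
    runs = []
    pos = 0
    while pos < len(seq):
        best_unit, best_count = None, 0
        for unit in units:
            if not unit:
                continue  # an empty unit would match forever, so skip it
            count = 0
            # Count consecutive exact copies of this unit starting at pos.
            while seq.startswith(unit, pos + count * len(unit)):
                count += 1
            # Keep the unit whose run covers the most bases from pos.
            if count * len(unit) > best_count * len(best_unit or ""):
                best_unit, best_count = unit, count
        if best_unit is None:
            # No unit matches here: record a single unmatched base.
            runs.append((seq[pos], 1))
            pos += 1
        else:
            runs.append((best_unit, best_count))
            pos += best_count * len(best_unit)
    return runs


# A synthetic mosaic TR built from three different units.
mosaic = "AAG" * 3 + "AG" * 4 + "AAAG" * 2
print(decompose_with_known_units(mosaic, ["AAG", "AG", "AAAG"]))
# [('AAG', 3), ('AG', 4), ('AAAG', 2)]
```

The practical problem is harder than this sketch suggests, because the units are not known beforehand and real reads contain substitutions and indels; a greedy exact-match scan of this kind would fragment such input into many short runs.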
Availability and implementation
The software program uTR that implements the proposed algorithm is available at https://github.com/morisUtokyo/uTR.
Funder
Japan Agency for Medical Research and Development
Publisher
Oxford University Press (OUP)
Subject
Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability
Cited by
4 articles.