Affiliation:
1. Department of Computer Science, University of Puerto Rico at Arecibo, Arecibo 00612, Puerto Rico
Abstract
Long Interspersed Element-1 (LINE-1 or L1) is an autonomous transposable element that accounts for 17% of the human genome. Numerous studies have documented strong correlations between abnormal L1 expression and disease, particularly cancer. L1PD (LINE-1 Pattern Detection) was previously developed to detect L1s using a fixed, predetermined set of 50-mer probes and a pattern-matching algorithm. Unlike other tools, which employ the well-known seed-and-extend strategy, L1PD uses a novel seed-and-pattern-match strategy. This study presents an improved version of L1PD and shows that increasing the k-mer probe size from 50 to 75 or 100 yields better results: in experiments, the longer probes achieved higher precision and recall than the 50-mers. The probe-generation process was also updated, and the corresponding software is now shared so that users can generate probes for other reference genomes (with certain limitations). Additionally, L1PD was applied to the genomes of non-human species, such as dog, horse, and cow, to further validate the pattern-matching strategy. The improved version of L1PD proves to be an efficient and promising approach for L1 detection.
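To make the seed-and-pattern-match idea concrete, the following is a minimal sketch and not the L1PD implementation. It assumes that "seeding" means finding exact probe occurrences and that "pattern matching" means checking that consecutive hits appear in an expected probe order with roughly expected spacing. The function names, the `tolerance` parameter, and the toy 4-mer probes (used in place of 50- to 100-mers for readability) are all hypothetical.

```python
from typing import Dict, List, Tuple


def find_probe_hits(genome: str, probes: Dict[str, str]) -> List[Tuple[int, str]]:
    """Seed step (assumed): list every exact probe occurrence as (position, probe_id)."""
    hits = []
    for probe_id, seq in probes.items():
        start = genome.find(seq)
        while start != -1:
            hits.append((start, probe_id))
            start = genome.find(seq, start + 1)
    return sorted(hits)


def match_pattern(
    hits: List[Tuple[int, str]],
    expected_order: List[str],
    expected_gaps: List[int],
    tolerance: int,
) -> List[Tuple[int, int]]:
    """Pattern step (assumed): keep runs of consecutive hits whose probe order
    matches expected_order and whose spacing is within tolerance of expected_gaps.
    Returns (start of first probe hit, start of last probe hit) per candidate."""
    candidates = []
    n = len(expected_order)
    for i in range(len(hits) - n + 1):
        window = hits[i:i + n]
        if [probe_id for _, probe_id in window] != expected_order:
            continue
        gaps = [window[j + 1][0] - window[j][0] for j in range(n - 1)]
        if all(abs(g - e) <= tolerance for g, e in zip(gaps, expected_gaps)):
            candidates.append((window[0][0], window[-1][0]))
    return candidates


# Toy data: hypothetical 4-mer probes embedded in a short sequence.
toy_probes = {"p1": "ACGT", "p2": "CCAA", "p3": "TTGA"}
toy_genome = "TT" + "ACGT" + "GG" + "CCAA" + "G" + "TTGA" + "AAAA"

toy_hits = find_probe_hits(toy_genome, toy_probes)
toy_candidates = match_pattern(
    toy_hits,
    expected_order=["p1", "p2", "p3"],
    expected_gaps=[6, 5],
    tolerance=1,
)
print(toy_hits)
print(toy_candidates)
```

This sketch only accepts hits that are strictly consecutive, so a spurious hit between two expected probes would break a candidate. A practical tool would need to handle such cases, as well as reverse-strand matches and truncated elements.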
Funder
University of Puerto Rico at Arecibo