Sparse Dynamic Programming on DAGs with Small Width

Author:

Mäkinen Veli (1), Tomescu Alexandru I. (1), Kuosmanen Anna (1), Paavilainen Topi (2), Gagie Travis (3), Chikhi Rayan (4)

Affiliation:

1. Helsinki Institute for Information Technology, Department of Computer Science, University of Helsinki, Helsinki, Finland

2. Helsinki Institute for Information Technology, Department of Computer Science, University of Helsinki, Finland

3. EIT, Diego Portales University, Santiago, Chile

4. CNRS, CRIStAL, University of Lille, France

Abstract

The minimum path cover problem asks us to find a minimum-cardinality set of paths that cover all the nodes of a directed acyclic graph (DAG). We study the case when the size k of a minimum path cover is small, that is, when the DAG has a small width. This case is motivated by applications in pan-genomics, where the genomic variation of a population is expressed as a DAG. We observe that classical alignment algorithms exploiting sparse dynamic programming can be extended to the sequence-against-DAG case by mimicking the algorithm for sequences on each path of a minimum path cover and handling an evaluation order anomaly with reachability queries. Namely, we introduce a general framework for DAG-extensions of sparse dynamic programming. This framework produces algorithms that are slower than their counterparts on sequences only by a factor k. We illustrate this on two classical problems extended to DAGs: longest increasing subsequence and longest common subsequence. For the former, we obtain an algorithm with running time O(k|E| log |V|). This matches the optimal solution to the classical problem variant when the input sequence is modeled as a path. We obtain an analogous result for the longest common subsequence problem. We then apply this technique to the co-linear chaining problem, which is a generalization of the above two problems. The algorithm for this problem turns out to be more involved, needing further ingredients, such as an FM-index tailored for large alphabets and a two-dimensional range search tree modified to support range maximum queries. We also study a general sequence-to-DAG alignment formulation that allows affine gap costs in the sequence. The main ingredient of the proposed framework is a new algorithm for finding a minimum path cover of a DAG (V, E) in O(k|E| log |V|) time, improving all known time-bounds when k is small and the DAG is not too dense. In addition to boosting the sparse dynamic programming framework, an immediate consequence of this new minimum path cover algorithm is an improved space/time tradeoff for reachability queries in arbitrary directed graphs.
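
For orientation, the classical sequence counterpart of the longest increasing subsequence problem mentioned above can be sketched as follows. This is the textbook O(n log n) sparse dynamic programming method on a plain sequence, which corresponds to the case where the DAG is a single path (k = 1); it is not the DAG algorithm of the paper. The function name `lis_length` and the choice of strictly increasing subsequences are assumptions made for illustration.

```python
from bisect import bisect_left


def lis_length(seq):
    """Length of a longest strictly increasing subsequence of seq.

    Illustrative sketch of the classical sequence algorithm only;
    the name and interface are hypothetical, not taken from the paper.
    """
    # tails[i] holds the smallest possible last value of a strictly
    # increasing subsequence of length i + 1 seen so far.
    tails = []
    for x in seq:
        # Binary search for the first position whose tail is >= x.
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)   # x extends the longest subsequence found so far
        else:
            tails[i] = x      # x gives a smaller tail for length i + 1
    return len(tails)
```

Each element costs one binary search, which gives the O(n log n) bound on sequences; according to the abstract, the DAG extension pays an additional factor k on top of this kind of bound.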

Funder

Academy of Finland

Fondecyt

Publisher

Association for Computing Machinery (ACM)

Subject

Mathematics (miscellaneous)

Cited by 20 articles.

1. Max-Min Diversification with Asymmetric Distances;Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2024-08-24

2. Maximum-scoring path sets on pangenome graphs of constant treewidth;Frontiers in Bioinformatics;2024-07-01

3. Label-guided seed-chain-extend alignment on annotated De Bruijn graphs;Bioinformatics;2024-06-28

4. Co-linear chaining on pangenome graphs;Algorithms for Molecular Biology;2024-01-27

5. Elastic founder graphs improved and enhanced;Theoretical Computer Science;2024-01
