Affiliation:
1. University of Waterloo, Waterloo, Ont., Canada
2. University of Western Ontario, London, Ont., Canada
3. City University of Hong Kong, Kowloon, Hong Kong, China
Abstract
The problem of finding a center string that is "close" to every
given string arises in computational molecular biology and coding
theory. This problem has two versions: the Closest String problem
and the Closest Substring problem. Given a set of strings
S = {s_1, s_2, ..., s_n}, each of length m, the Closest String
problem is to find the smallest d and a string s of length m
which is within Hamming distance d to each s_i ∈ S. This problem
comes from coding theory when we are looking for a code not too far
away from a given set of codes. Closest Substring problem, with an
additional input integer L, asks for the smallest d and a string s,
of length L, which is within Hamming distance d away from a
substring, of length L, of each s_i. This problem is much more
elusive than the Closest String problem. The Closest Substring
problem is formulated from applications in finding conserved
regions, identifying genetic drug targets and generating genetic
probes in molecular biology. Whether there are efficient
approximation algorithms for both problems are major open questions
in this area. We present two polynomial-time approximation
algorithms with approximation ratio 1 + ε for any small ε to settle
both questions.
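To make the two problem definitions concrete, the following is a minimal sketch in Python that solves tiny instances by exhaustive search. It is not the approximation scheme announced in the abstract: it simply tries every candidate string over a given alphabet, so its running time grows exponentially with m (or L). The function names and the explicit `alphabet` parameter are assumptions made for illustration only.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def closest_string_bruteforce(strings, alphabet):
    """Return (d, s): the smallest d and a string s of length m that is
    within Hamming distance d of every input string.

    Illustrative exhaustive search over all |alphabet|^m candidates.
    """
    m = len(strings[0])
    best_d, best_s = m + 1, None
    for cand in product(alphabet, repeat=m):
        s = "".join(cand)
        # The radius of a candidate is its distance to the farthest input.
        d = max(hamming(s, t) for t in strings)
        if d < best_d:
            best_d, best_s = d, s
    return best_d, best_s


def closest_substring_bruteforce(strings, L, alphabet):
    """Return (d, s): the smallest d and a string s of length L that is
    within Hamming distance d of some length-L substring of every input.

    Illustrative exhaustive search over all |alphabet|^L candidates.
    """
    best_d, best_s = L + 1, None
    for cand in product(alphabet, repeat=L):
        s = "".join(cand)
        # For each input, take the best-matching length-L window;
        # the candidate's radius is the worst of those best matches.
        d = max(
            min(hamming(s, t[i:i + L]) for i in range(len(t) - L + 1))
            for t in strings
        )
        if d < best_d:
            best_d, best_s = d, s
    return best_d, best_s
```

For example, `closest_string_bruteforce(["ACGT", "ACGA", "TCGT"], "ACGT")` searches all 256 candidate strings of length 4, and `closest_substring_bruteforce(["GGACGT", "ACGAAA", "TTACGT"], 3, "ACGT")` searches all 64 candidates of length 3. The inner `min` over windows in the second function is the only difference between the two problems, and it is also why the substring version does not require the inputs to share a length.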
Publisher
Association for Computing Machinery (ACM)
Subject
Artificial Intelligence, Hardware and Architecture, Information Systems, Control and Systems Engineering, Software
Cited by
153 articles.