File Download
 
Links for fulltext
(May Require Subscription)
 
Supplementary

Article: Compressed indexes for approximate string matching
  • Basic View
  • Metadata View
  • XML View
TitleCompressed indexes for approximate string matching
 
AuthorsChan, HL1
Lam, TW1
Sung, WK2
Tam, SL1
Wong, SS2
 
KeywordsApproximate String Matching
Compressed Index
 
Issue Date2010
 
PublisherSpringer New York LLC. The Journal's web site is located at http://link.springer.de/link/service/journals/00453/index.htm
 
CitationAlgorithmica (New York), 2010, v. 58 n. 2, p. 263-281 [How to Cite?]
DOI: http://dx.doi.org/10.1007/s00453-008-9263-2
 
AbstractWe revisit the problem of indexing a string S[1..n] to support finding all substrings in S that match a given pattern P[1..m] with at most k errors. Previous solutions either require an index of size exponential in k or need Ω(m k ) time for searching. Motivated by the indexing of DNA, we investigate space efficient indexes that occupy only O(n) space. For k=1, we give an index to support matching in O(m+occ+log∈nlog∈log∈n) time. The previously best solution achieving this time complexity requires an index of O(nlog∈n) space. This new index can also be used to improve existing indexes for k≥2 errors. Among others, it can support 2-error matching in O(mlog∈nlog∈log∈n+occ) time, and k-error matching, for any k>2, in O(m k-1log∈nlog∈log∈n+occ) time. © 2008 Springer Science+Business Media, LLC.
 
ISSN0178-4617
2013 Impact Factor: 0.567
 
DOIhttp://dx.doi.org/10.1007/s00453-008-9263-2
 
ISI Accession Number IDWOS:000279681700003
Funding AgencyGrant Number
Hong Kong RGCHKU 7140/06E
Funding Information:

The research T.-W. Lam was supported by the Hong Kong RGC Grant HKU 7140/06E.

 
ReferencesReferences in Scopus
 
GrantsCompressed indexes for approximate string matching, with applications to biological sequences
 
DC FieldValue
dc.contributor.authorChan, HL
 
dc.contributor.authorLam, TW
 
dc.contributor.authorSung, WK
 
dc.contributor.authorTam, SL
 
dc.contributor.authorWong, SS
 
dc.date.accessioned2012-06-26T06:39:05Z
 
dc.date.available2012-06-26T06:39:05Z
 
dc.date.issued2010
 
dc.description.abstractWe revisit the problem of indexing a string S[1..n] to support finding all substrings in S that match a given pattern P[1..m] with at most k errors. Previous solutions either require an index of size exponential in k or need Ω(m k ) time for searching. Motivated by the indexing of DNA, we investigate space efficient indexes that occupy only O(n) space. For k=1, we give an index to support matching in O(m+occ+log∈nlog∈log∈n) time. The previously best solution achieving this time complexity requires an index of O(nlog∈n) space. This new index can also be used to improve existing indexes for k≥2 errors. Among others, it can support 2-error matching in O(mlog∈nlog∈log∈n+occ) time, and k-error matching, for any k>2, in O(m k-1log∈nlog∈log∈n+occ) time. © 2008 Springer Science+Business Media, LLC.
 
dc.description.naturepostprint
 
dc.identifier.citationAlgorithmica (New York), 2010, v. 58 n. 2, p. 263-281 [How to Cite?]
DOI: http://dx.doi.org/10.1007/s00453-008-9263-2
 
dc.identifier.citeulike3811405
 
dc.identifier.doihttp://dx.doi.org/10.1007/s00453-008-9263-2
 
dc.identifier.epage281
 
dc.identifier.hkuros177358
 
dc.identifier.isiWOS:000279681700003
Funding AgencyGrant Number
Hong Kong RGCHKU 7140/06E
Funding Information:

The research T.-W. Lam was supported by the Hong Kong RGC Grant HKU 7140/06E.

 
dc.identifier.issn0178-4617
2013 Impact Factor: 0.567
 
dc.identifier.issue2
 
dc.identifier.scopuseid_2-s2.0-77955655806
 
dc.identifier.spage263
 
dc.identifier.urihttp://hdl.handle.net/10722/152439
 
dc.identifier.volume58
 
dc.languageeng
 
dc.publisherSpringer New York LLC. The Journal's web site is located at http://link.springer.de/link/service/journals/00453/index.htm
 
dc.publisher.placeUnited States
 
dc.relation.ispartofAlgorithmica (New York)
 
dc.relation.projectCompressed indexes for approximate string matching, with applications to biological sequences
 
dc.relation.referencesReferences in Scopus
 
dc.rightsThe original publication is available at www.springerlink.com
 
dc.rightsCreative Commons: Attribution 3.0 Hong Kong License
 
dc.subjectApproximate String Matching
 
dc.subjectCompressed Index
 
dc.titleCompressed indexes for approximate string matching
 
dc.typeArticle
 
<?xml encoding="utf-8" version="1.0"?>
<item><contributor.author>Chan, HL</contributor.author>
<contributor.author>Lam, TW</contributor.author>
<contributor.author>Sung, WK</contributor.author>
<contributor.author>Tam, SL</contributor.author>
<contributor.author>Wong, SS</contributor.author>
<date.accessioned>2012-06-26T06:39:05Z</date.accessioned>
<date.available>2012-06-26T06:39:05Z</date.available>
<date.issued>2010</date.issued>
<identifier.citation>Algorithmica (New York), 2010, v. 58 n. 2, p. 263-281</identifier.citation>
<identifier.issn>0178-4617</identifier.issn>
<identifier.uri>http://hdl.handle.net/10722/152439</identifier.uri>
<description.abstract>We revisit the problem of indexing a string S[1..n] to support finding all substrings in S that match a given pattern P[1..m] with at most k errors. Previous solutions either require an index of size exponential in k or need &#937;(m k ) time for searching. Motivated by the indexing of DNA, we investigate space efficient indexes that occupy only O(n) space. For k=1, we give an index to support matching in O(m+occ+log&#8712;nlog&#8712;log&#8712;n) time. The previously best solution achieving this time complexity requires an index of O(nlog&#8712;n) space. This new index can also be used to improve existing indexes for k&#8805;2 errors. Among others, it can support 2-error matching in O(mlog&#8712;nlog&#8712;log&#8712;n+occ) time, and k-error matching, for any k&gt;2, in O(m k-1log&#8712;nlog&#8712;log&#8712;n+occ) time. &#169; 2008 Springer Science+Business Media, LLC.</description.abstract>
<language>eng</language>
<publisher>Springer New York LLC. The Journal&apos;s web site is located at http://link.springer.de/link/service/journals/00453/index.htm</publisher>
<relation.ispartof>Algorithmica (New York)</relation.ispartof>
<rights>The original publication is available at www.springerlink.com</rights>
<rights>Creative Commons: Attribution 3.0 Hong Kong License</rights>
<subject>Approximate String Matching</subject>
<subject>Compressed Index</subject>
<title>Compressed indexes for approximate string matching</title>
<type>Article</type>
<description.nature>postprint</description.nature>
<identifier.doi>10.1007/s00453-008-9263-2</identifier.doi>
<identifier.scopus>eid_2-s2.0-77955655806</identifier.scopus>
<identifier.hkuros>177358</identifier.hkuros>
<relation.references>http://www.scopus.com/mlt/select.url?eid=2-s2.0-77955655806&amp;selection=ref&amp;src=s&amp;origin=recordpage</relation.references>
<identifier.volume>58</identifier.volume>
<identifier.issue>2</identifier.issue>
<identifier.spage>263</identifier.spage>
<identifier.epage>281</identifier.epage>
<identifier.isi>WOS:000279681700003</identifier.isi>
<publisher.place>United States</publisher.place>
<relation.project>Compressed indexes for approximate string matching, with applications to biological sequences</relation.project>
<identifier.citeulike>3811405</identifier.citeulike>
<bitstream.url>http://hub.hku.hk/bitstream/10722/152439/1/content.pdf</bitstream.url>
</item>
Author Affiliations
  1. The University of Hong Kong
  2. National University of Singapore