File Download
 
Links for fulltext
(May Require Subscription)
 
Supplementary

Conference Paper: Improved approximate string matching using compressed suffix data structures
  • Basic View
  • Metadata View
  • XML View
TitleImproved approximate string matching using compressed suffix data structures
 
AuthorsLam, TW1
Sung, WK2
Wong, SS2
 
Issue Date2008
 
PublisherSpringer New York LLC. The Journal's web site is located at http://link.springer.de/link/service/journals/00453/index.htm
 
CitationAlgorithmica (New York), 2008, v. 51 n. 3, p. 298-314 [How to Cite?]
DOI: http://dx.doi.org/10.1007/s00453-007-9104-8
 
AbstractApproximate string matching is about finding a given string pattern in a text by allowing some degree of errors. In this paper we present a space efficient data structure to solve the 1-mismatch and 1-difference problems. Given a text T of length n over an alphabet A, we can preprocess T and give an O(n √log n log |A|)-bit space data structure so that, for any query pattern P of length m, we can find all 1-mismatch (or 1-difference) occurrences of P in O(|A|m log log n+occ) time, where occ is the number of occurrences. This is the fastest known query time given that the space of the data structure is o(n log2 n) bits. The space of our data structure can be further reduced to O(n log |A|) with the query time increasing by a factor of log ε n, for 0 < ε ≤ 1. Furthermore, our solution can be generalized to solve the k-mismatch (and the k-difference) problem in O(|A| k mk (k + log log n) + occ) and O(logε n(|A|k mk (k + log log n) + occ)) time using an O(n √log n log |A|)-bit and an O(n log |A|)-bit indexing data structures, respectively. We assume that the alphabet size |A| is bounded by O(2 √log n) for the O(n√log n log |A|)-bit space data structure. © 2007 Springer Science+Business Media, LLC.
 
ISSN0178-4617
2013 Impact Factor: 0.567
 
DOIhttp://dx.doi.org/10.1007/s00453-007-9104-8
 
ISI Accession Number IDWOS:000255874200005
 
ReferencesReferences in Scopus
 
DC FieldValue
dc.contributor.authorLam, TW
 
dc.contributor.authorSung, WK
 
dc.contributor.authorWong, SS
 
dc.date.accessioned2012-06-26T06:30:42Z
 
dc.date.available2012-06-26T06:30:42Z
 
dc.date.issued2008
 
dc.description.abstractApproximate string matching is about finding a given string pattern in a text by allowing some degree of errors. In this paper we present a space efficient data structure to solve the 1-mismatch and 1-difference problems. Given a text T of length n over an alphabet A, we can preprocess T and give an O(n √log n log |A|)-bit space data structure so that, for any query pattern P of length m, we can find all 1-mismatch (or 1-difference) occurrences of P in O(|A|m log log n+occ) time, where occ is the number of occurrences. This is the fastest known query time given that the space of the data structure is o(n log2 n) bits. The space of our data structure can be further reduced to O(n log |A|) with the query time increasing by a factor of log ε n, for 0 < ε ≤ 1. Furthermore, our solution can be generalized to solve the k-mismatch (and the k-difference) problem in O(|A| k mk (k + log log n) + occ) and O(logε n(|A|k mk (k + log log n) + occ)) time using an O(n √log n log |A|)-bit and an O(n log |A|)-bit indexing data structures, respectively. We assume that the alphabet size |A| is bounded by O(2 √log n) for the O(n√log n log |A|)-bit space data structure. © 2007 Springer Science+Business Media, LLC.
 
dc.description.natureLink_to_subscribed_fulltext
 
dc.identifier.citationAlgorithmica (New York), 2008, v. 51 n. 3, p. 298-314 [How to Cite?]
DOI: http://dx.doi.org/10.1007/s00453-007-9104-8
 
dc.identifier.doihttp://dx.doi.org/10.1007/s00453-007-9104-8
 
dc.identifier.epage314
 
dc.identifier.isiWOS:000255874200005
 
dc.identifier.issn0178-4617
2013 Impact Factor: 0.567
 
dc.identifier.issue3
 
dc.identifier.scopuseid_2-s2.0-43949112336
 
dc.identifier.spage298
 
dc.identifier.urihttp://hdl.handle.net/10722/151910
 
dc.identifier.volume51
 
dc.languageeng
 
dc.publisherSpringer New York LLC. The Journal's web site is located at http://link.springer.de/link/service/journals/00453/index.htm
 
dc.publisher.placeUnited States
 
dc.relation.ispartofAlgorithmica (New York)
 
dc.relation.referencesReferences in Scopus
 
dc.titleImproved approximate string matching using compressed suffix data structures
 
dc.typeConference_Paper
 
<?xml encoding="utf-8" version="1.0"?>
<item><contributor.author>Lam, TW</contributor.author>
<contributor.author>Sung, WK</contributor.author>
<contributor.author>Wong, SS</contributor.author>
<date.accessioned>2012-06-26T06:30:42Z</date.accessioned>
<date.available>2012-06-26T06:30:42Z</date.available>
<date.issued>2008</date.issued>
<identifier.citation>Algorithmica (New York), 2008, v. 51 n. 3, p. 298-314</identifier.citation>
<identifier.issn>0178-4617</identifier.issn>
<identifier.uri>http://hdl.handle.net/10722/151910</identifier.uri>
<description.abstract>Approximate string matching is about finding a given string pattern in a text by allowing some degree of errors. In this paper we present a space efficient data structure to solve the 1-mismatch and 1-difference problems. Given a text T of length n over an alphabet A, we can preprocess T and give an O(n &#8730;log n log |A|)-bit space data structure so that, for any query pattern P of length m, we can find all 1-mismatch (or 1-difference) occurrences of P in O(|A|m log log n+occ) time, where occ is the number of occurrences. This is the fastest known query time given that the space of the data structure is o(n log2 n) bits. The space of our data structure can be further reduced to O(n log |A|) with the query time increasing by a factor of log &#949; n, for 0 &lt; &#949; &#8804; 1. Furthermore, our solution can be generalized to solve the k-mismatch (and the k-difference) problem in O(|A| k mk (k + log log n) + occ) and O(log&#949; n(|A|k mk (k + log log n) + occ)) time using an O(n &#8730;log n log |A|)-bit and an O(n log |A|)-bit indexing data structures, respectively. We assume that the alphabet size |A| is bounded by O(2 &#8730;log n) for the O(n&#8730;log n log |A|)-bit space data structure. &#169; 2007 Springer Science+Business Media, LLC.</description.abstract>
<language>eng</language>
<publisher>Springer New York LLC. The Journal&apos;s web site is located at http://link.springer.de/link/service/journals/00453/index.htm</publisher>
<relation.ispartof>Algorithmica (New York)</relation.ispartof>
<title>Improved approximate string matching using compressed suffix data structures</title>
<type>Conference_Paper</type>
<description.nature>Link_to_subscribed_fulltext</description.nature>
<identifier.doi>10.1007/s00453-007-9104-8</identifier.doi>
<identifier.scopus>eid_2-s2.0-43949112336</identifier.scopus>
<relation.references>http://www.scopus.com/mlt/select.url?eid=2-s2.0-43949112336&amp;selection=ref&amp;src=s&amp;origin=recordpage</relation.references>
<identifier.volume>51</identifier.volume>
<identifier.issue>3</identifier.issue>
<identifier.spage>298</identifier.spage>
<identifier.epage>314</identifier.epage>
<identifier.isi>WOS:000255874200005</identifier.isi>
<publisher.place>United States</publisher.place>
</item>
Author Affiliations
  1. The University of Hong Kong
  2. National University of Singapore