Conference Paper: Cache-oblivious index for approximate string matching

File Download Links for fulltext
(May Require Subscription)
Supplementary

  • Basic View
  • Metadata View
  • XML View
TitleCache-oblivious index for approximate string matching
AuthorsHon, WK2
Lam, TW1
Shah, R3
Tam, SL1
Vitter, JS3
Issue Date2007
PublisherSpringer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/
CitationLecture Notes In Computer Science (Including Subseries Lecture Notes In Artificial Intelligence And Lecture Notes In Bioinformatics), 2007, v. 4580 LNCS, p. 40-51 [How to Cite?]
AbstractThis paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text T of length n and a positive integer k, we want to construct an index of T such that for any input pattern P, we can find all its k-error matches in T efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-memory solutions, which are also cache-oblivious. Our first index occupies O((n logk n)/B) disk pages and finds all k-error matches with O((|P| + occ)/B + logk n log logB n) I/Os, where B denotes the number of words in a disk page. To the best of our knowledge, this index is the first external-memory data structure that does not require Ω(|P| + occ + poly(log n)) I/Os. The second index reduces the space to O{{n log n)/B) disk pages, and the I/O complexity is O((|P| + occ)/B + log k(k+1) n log log n). © Springer-Verlag Berlin Heidelberg 2007.
ISSN0302-9743
2011 SCImago Journal Rankings: 0.034
ReferencesReferences in Scopus
DC Field
Value
dc.contributor.authorHon, WK
dc.contributor.authorLam, TW
dc.contributor.authorShah, R
dc.contributor.authorTam, SL
dc.contributor.authorVitter, JS
dc.date.accessioned2010-09-25T14:52:30Z
dc.date.available2010-09-25T14:52:30Z
dc.date.issued2007
dc.description.abstractThis paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text T of length n and a positive integer k, we want to construct an index of T such that for any input pattern P, we can find all its k-error matches in T efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-memory solutions, which are also cache-oblivious. Our first index occupies O((n logk n)/B) disk pages and finds all k-error matches with O((|P| + occ)/B + logk n log logB n) I/Os, where B denotes the number of words in a disk page. To the best of our knowledge, this index is the first external-memory data structure that does not require Ω(|P| + occ + poly(log n)) I/Os. The second index reduces the space to O{{n log n)/B) disk pages, and the I/O complexity is O((|P| + occ)/B + log k(k+1) n log log n). © Springer-Verlag Berlin Heidelberg 2007.
dc.description.natureLink_to_subscribed_fulltext
dc.identifier.citationLecture Notes In Computer Science (Including Subseries Lecture Notes In Artificial Intelligence And Lecture Notes In Bioinformatics), 2007, v. 4580 LNCS, p. 40-51 [How to Cite?]
dc.identifier.epage51
dc.identifier.hkuros128193
dc.identifier.issn0302-9743
2011 SCImago Journal Rankings: 0.034
dc.identifier.scopuseid_2-s2.0-37849007688
dc.identifier.spage40
dc.identifier.urihttp://hdl.handle.net/10722/93153
dc.identifier.volume4580 LNCS
dc.languageeng
dc.publisherSpringer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/
dc.publisher.placeGermany
dc.relation.ispartofLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.relation.referencesReferences in Scopus
dc.titleCache-oblivious index for approximate string matching
dc.typeConference_Paper
Author Affiliations
  1. The University of Hong Kong
  2. National Tsing Hua University
  3. Purdue University