File Download
  Links for fulltext
     (May Require Subscription)
Supplementary

Article: Cache-oblivious index for approximate string matching

TitleCache-oblivious index for approximate string matching
Authors
KeywordsApproximate queries
Cache-oblivious
I/O model
Indexing
String matching
Issue Date2011
PublisherElsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/tcs
Citation
Theoretical Computer Science, 2011, v. 412 n. 29, p. 3579-3588 How to Cite?
AbstractThis paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text T of length n and a positive integer k, we want to construct an index of T such that for any input pattern P, we can find all its k-error matches in T efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-memory solutions, which are also cache-oblivious. Our first index occupies O((nlog kn)B) disk pages and finds all k-error matches with O((|P|+occ)B+log knloglog Bn) I/Os, where B denotes the number of words in a disk page. To the best of our knowledge, this index is the first external-memory data structure that does not require Ω (|P|+occ+poly(logn)) I/Os. The second index reduces the space to O((nlogn)B) disk pages, and the I/O complexity is O((|P|+occ)B+log k(k+1)nloglogn) . © 2011 Elsevier B.V. All rights reserved.
Persistent Identifierhttp://hdl.handle.net/10722/140789
ISSN
2014 Impact Factor: 0.657
2014 SCImago Journal Rankings: 1.096
ISI Accession Number ID
Funding AgencyGrant Number
Taiwan NSC96-2221-E-007-082
99-2221-E-007-123
Hong Kong RGC7140/06E
US NSFCCF-0621457
CCF-1017623
Funding Information:

A preliminary version appears in Proceedings of 18th Annual Symposium on Combinatorial Pattern Matching (CPM), pages 40-51, 2007. This research is supported in part by Taiwan NSC Grants 96-2221-E-007-082 and 99-2221-E-007-123 (Wing-Kai Hon), Hong Kong RGC Grant 7140/06E (Tak-Wah Lam), and US NSF Grants CCF-0621457 and CCF-1017623 (Rahul Shah and Jeffrey Scott Vitter).

References

 

DC FieldValueLanguage
dc.contributor.authorHon, WKen_HK
dc.contributor.authorLam, TWen_HK
dc.contributor.authorShah, Ren_HK
dc.contributor.authorTam, SLen_HK
dc.contributor.authorVitter, JSen_HK
dc.date.accessioned2011-09-23T06:19:20Z-
dc.date.available2011-09-23T06:19:20Z-
dc.date.issued2011en_HK
dc.identifier.citationTheoretical Computer Science, 2011, v. 412 n. 29, p. 3579-3588en_HK
dc.identifier.issn0304-3975en_HK
dc.identifier.urihttp://hdl.handle.net/10722/140789-
dc.description.abstractThis paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text T of length n and a positive integer k, we want to construct an index of T such that for any input pattern P, we can find all its k-error matches in T efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-memory solutions, which are also cache-oblivious. Our first index occupies O((nlog kn)B) disk pages and finds all k-error matches with O((|P|+occ)B+log knloglog Bn) I/Os, where B denotes the number of words in a disk page. To the best of our knowledge, this index is the first external-memory data structure that does not require Ω (|P|+occ+poly(logn)) I/Os. The second index reduces the space to O((nlogn)B) disk pages, and the I/O complexity is O((|P|+occ)B+log k(k+1)nloglogn) . © 2011 Elsevier B.V. All rights reserved.en_HK
dc.languageengen_US
dc.publisherElsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/tcsen_HK
dc.relation.ispartofTheoretical Computer Scienceen_HK
dc.rightsNOTICE: this is the author’s version of a work that was accepted for publication in Theoretical Computer Science. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Theoretical Computer Science, 2011, v. 412 n. 29, p. 3579-3588. DOI: http://dx.doi.org/10.1016/j.tcs.2011.03.004-
dc.rightsCreative Commons: Attribution 3.0 Hong Kong License-
dc.subjectApproximate queriesen_HK
dc.subjectCache-obliviousen_HK
dc.subjectI/O modelen_HK
dc.subjectIndexingen_HK
dc.subjectString matchingen_HK
dc.titleCache-oblivious index for approximate string matchingen_HK
dc.typeArticleen_HK
dc.identifier.openurlhttp://library.hku.hk:4550/resserv?sid=HKU:IR&issn=0304-3975&volume=412&issue=29&spage=3579&epage=3588&date=2011&atitle=Cache-oblivious+index+for+approximate+string+matching-
dc.identifier.emailLam, TW:twlam@cs.hku.hken_HK
dc.identifier.authorityLam, TW=rp00135en_HK
dc.description.naturepostprint-
dc.identifier.doi10.1016/j.tcs.2011.03.004en_HK
dc.identifier.scopuseid_2-s2.0-79956348246en_HK
dc.identifier.hkuros192200en_US
dc.relation.referenceshttp://www.scopus.com/mlt/select.url?eid=2-s2.0-79956348246&selection=ref&src=s&origin=recordpageen_HK
dc.identifier.volume412en_HK
dc.identifier.issue29en_HK
dc.identifier.spage3579en_HK
dc.identifier.epage3588en_HK
dc.identifier.isiWOS:000292077200015-
dc.publisher.placeNetherlandsen_HK
dc.identifier.scopusauthoridHon, WK=7004282818en_HK
dc.identifier.scopusauthoridLam, TW=7202523165en_HK
dc.identifier.scopusauthoridShah, R=35365088300en_HK
dc.identifier.scopusauthoridTam, SL=14042926200en_HK
dc.identifier.scopusauthoridVitter, JS=7005508549en_HK
dc.identifier.citeulike9193876-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats