Conference Paper: Compressed index for dictionary matching

File Download Links for fulltext
(May Require Subscription)
Supplementary
  • Basic View
  • Metadata View
  • XML View
TitleCompressed index for dictionary matching
AuthorsHon, WK2
Shah, R4
Lam, TW1
Tam, SL1
Vitter, JS3
KeywordsComputers
Information science and information theory and abstracting and indexing services
Issue Date2008
PublisherIEEE, Computer Society.
CitationData Compression Conference Proceedings, 2008, p. 23-32 [How to Cite?]
DOI: http://dx.doi.org/10.1109/DCC.2008.62
AbstractThe past few years have witnessed several exciting results on compressed representation of a string T that supports efficient pattern matching, and the space complexity has been reduced to |T|Hk(T) +o(|T| log σ) bits [8, 10], where Hk(T) denotes the kth-order empirical entropy of T, and a is the size of the alphabet. In this paper we study compressed representation for another classical problem of string indexing, which is called dictionary matching in the literature. Precisely, a collection V of strings (called patterns) of total length n is to be indexed so that given a text T, the occurrences of the patterns in T can be found efficiently. In this paper we show how to exploit a sampling technique to compress the existing O(n)-word index to an (nHk(D) + o(n log σ))-bit index with only a small sacrifice in search time. © 2008 IEEE.
ISSN1068-0314
2011 SCImago Journal Rankings: 0.045
DOIhttp://dx.doi.org/10.1109/DCC.2008.62
ReferencesReferences in Scopus
DC Field
Value
dc.contributor.authorHon, WK
dc.contributor.authorShah, R
dc.contributor.authorLam, TW
dc.contributor.authorTam, SL
dc.contributor.authorVitter, JS
dc.date.accessioned2010-04-12T01:30:12Z
dc.date.available2010-04-12T01:30:12Z
dc.date.issued2008
dc.description.abstractThe past few years have witnessed several exciting results on compressed representation of a string T that supports efficient pattern matching, and the space complexity has been reduced to |T|Hk(T) +o(|T| log σ) bits [8, 10], where Hk(T) denotes the kth-order empirical entropy of T, and a is the size of the alphabet. In this paper we study compressed representation for another classical problem of string indexing, which is called dictionary matching in the literature. Precisely, a collection V of strings (called patterns) of total length n is to be indexed so that given a text T, the occurrences of the patterns in T can be found efficiently. In this paper we show how to exploit a sampling technique to compress the existing O(n)-word index to an (nHk(D) + o(n log σ))-bit index with only a small sacrifice in search time. © 2008 IEEE.
dc.description.naturepublished_or_final_version
dc.identifier.citationData Compression Conference Proceedings, 2008, p. 23-32 [How to Cite?]
DOI: http://dx.doi.org/10.1109/DCC.2008.62
dc.identifier.doihttp://dx.doi.org/10.1109/DCC.2008.62
dc.identifier.epage32
dc.identifier.hkuros146737
dc.identifier.issn1068-0314
2011 SCImago Journal Rankings: 0.045
dc.identifier.scopuseid_2-s2.0-50249188924
dc.identifier.spage23
dc.identifier.urihttp://hdl.handle.net/10722/57233
dc.languageeng
dc.publisherIEEE, Computer Society.
dc.relation.ispartofData Compression Conference Proceedings
dc.relation.referencesReferences in Scopus
dc.rights©2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
dc.rightsCreative Commons: Attribution 3.0 Hong Kong License
dc.subjectComputers
dc.subjectInformation science and information theory and abstracting and indexing services
dc.titleCompressed index for dictionary matching
dc.typeConference_Paper
Author Affiliations
  1. The University of Hong Kong
  2. National Tsing Hua University
  3. Purdue University
  4. Louisiana State University