File Download
  Links for fulltext
     (May Require Subscription)
Supplementary

Conference Paper: Compressed index for dictionary matching

TitleCompressed index for dictionary matching
Authors
KeywordsComputers
Information science and information theory and abstracting and indexing services
Issue Date2008
PublisherIEEE, Computer Society.
Citation
Data Compression Conference Proceedings, 2008, p. 23-32 How to Cite?
AbstractThe past few years have witnessed several exciting results on compressed representation of a string T that supports efficient pattern matching, and the space complexity has been reduced to |T|Hk(T) +o(|T| log σ) bits [8, 10], where Hk(T) denotes the kth-order empirical entropy of T, and a is the size of the alphabet. In this paper we study compressed representation for another classical problem of string indexing, which is called dictionary matching in the literature. Precisely, a collection V of strings (called patterns) of total length n is to be indexed so that given a text T, the occurrences of the patterns in T can be found efficiently. In this paper we show how to exploit a sampling technique to compress the existing O(n)-word index to an (nHk(D) + o(n log σ))-bit index with only a small sacrifice in search time. © 2008 IEEE.
Persistent Identifierhttp://hdl.handle.net/10722/57233
ISSN
2023 SCImago Journal Rankings: 0.371
References

 

DC FieldValueLanguage
dc.contributor.authorHon, WKen_HK
dc.contributor.authorShah, Ren_HK
dc.contributor.authorLam, TWen_HK
dc.contributor.authorTam, SLen_HK
dc.contributor.authorVitter, JSen_HK
dc.date.accessioned2010-04-12T01:30:12Z-
dc.date.available2010-04-12T01:30:12Z-
dc.date.issued2008en_HK
dc.identifier.citationData Compression Conference Proceedings, 2008, p. 23-32en_HK
dc.identifier.issn1068-0314en_HK
dc.identifier.urihttp://hdl.handle.net/10722/57233-
dc.description.abstractThe past few years have witnessed several exciting results on compressed representation of a string T that supports efficient pattern matching, and the space complexity has been reduced to |T|Hk(T) +o(|T| log σ) bits [8, 10], where Hk(T) denotes the kth-order empirical entropy of T, and a is the size of the alphabet. In this paper we study compressed representation for another classical problem of string indexing, which is called dictionary matching in the literature. Precisely, a collection V of strings (called patterns) of total length n is to be indexed so that given a text T, the occurrences of the patterns in T can be found efficiently. In this paper we show how to exploit a sampling technique to compress the existing O(n)-word index to an (nHk(D) + o(n log σ))-bit index with only a small sacrifice in search time. © 2008 IEEE.en_HK
dc.languageengen_HK
dc.publisherIEEE, Computer Society.en_HK
dc.relation.ispartofData Compression Conference Proceedingsen_HK
dc.rights©2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.-
dc.subjectComputersen_HK
dc.subjectInformation science and information theory and abstracting and indexing servicesen_HK
dc.titleCompressed index for dictionary matchingen_HK
dc.typeConference_Paperen_HK
dc.identifier.emailLam, TW:twlam@cs.hku.hken_HK
dc.identifier.authorityLam, TW=rp00135en_HK
dc.description.naturepublished_or_final_versionen_HK
dc.identifier.doi10.1109/DCC.2008.62en_HK
dc.identifier.scopuseid_2-s2.0-50249188924en_HK
dc.identifier.hkuros146737-
dc.relation.referenceshttp://www.scopus.com/mlt/select.url?eid=2-s2.0-50249188924&selection=ref&src=s&origin=recordpageen_HK
dc.identifier.spage23en_HK
dc.identifier.epage32en_HK
dc.identifier.scopusauthoridHon, WK=7004282818en_HK
dc.identifier.scopusauthoridShah, R=35365088300en_HK
dc.identifier.scopusauthoridLam, TW=7202523165en_HK
dc.identifier.scopusauthoridTam, SL=14042926200en_HK
dc.identifier.scopusauthoridVitter, JS=7005508549en_HK
dc.identifier.issnl1068-0314-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats