File Download
 
Links for fulltext
(May Require Subscription)
 
Supplementary

Conference Paper: Compressed index for dictionary matching
  • Basic View
  • Metadata View
  • XML View
TitleCompressed index for dictionary matching
 
AuthorsHon, WK2
Shah, R4
Lam, TW1
Tam, SL1
Vitter, JS3
 
KeywordsComputers
Information science and information theory and abstracting and indexing services
 
Issue Date2008
 
PublisherIEEE, Computer Society.
 
CitationData Compression Conference Proceedings, 2008, p. 23-32 [How to Cite?]
DOI: http://dx.doi.org/10.1109/DCC.2008.62
 
AbstractThe past few years have witnessed several exciting results on compressed representation of a string T that supports efficient pattern matching, and the space complexity has been reduced to |T|Hk(T) +o(|T| log σ) bits [8, 10], where Hk(T) denotes the kth-order empirical entropy of T, and a is the size of the alphabet. In this paper we study compressed representation for another classical problem of string indexing, which is called dictionary matching in the literature. Precisely, a collection V of strings (called patterns) of total length n is to be indexed so that given a text T, the occurrences of the patterns in T can be found efficiently. In this paper we show how to exploit a sampling technique to compress the existing O(n)-word index to an (nHk(D) + o(n log σ))-bit index with only a small sacrifice in search time. © 2008 IEEE.
 
ISSN1068-0314
2012 SCImago Journal Rankings: 0.489
 
DOIhttp://dx.doi.org/10.1109/DCC.2008.62
 
ReferencesReferences in Scopus
 
DC FieldValue
dc.contributor.authorHon, WK
 
dc.contributor.authorShah, R
 
dc.contributor.authorLam, TW
 
dc.contributor.authorTam, SL
 
dc.contributor.authorVitter, JS
 
dc.date.accessioned2010-04-12T01:30:12Z
 
dc.date.available2010-04-12T01:30:12Z
 
dc.date.issued2008
 
dc.description.abstractThe past few years have witnessed several exciting results on compressed representation of a string T that supports efficient pattern matching, and the space complexity has been reduced to |T|Hk(T) +o(|T| log σ) bits [8, 10], where Hk(T) denotes the kth-order empirical entropy of T, and a is the size of the alphabet. In this paper we study compressed representation for another classical problem of string indexing, which is called dictionary matching in the literature. Precisely, a collection V of strings (called patterns) of total length n is to be indexed so that given a text T, the occurrences of the patterns in T can be found efficiently. In this paper we show how to exploit a sampling technique to compress the existing O(n)-word index to an (nHk(D) + o(n log σ))-bit index with only a small sacrifice in search time. © 2008 IEEE.
 
dc.description.naturepublished_or_final_version
 
dc.identifier.citationData Compression Conference Proceedings, 2008, p. 23-32 [How to Cite?]
DOI: http://dx.doi.org/10.1109/DCC.2008.62
 
dc.identifier.doihttp://dx.doi.org/10.1109/DCC.2008.62
 
dc.identifier.epage32
 
dc.identifier.hkuros146737
 
dc.identifier.issn1068-0314
2012 SCImago Journal Rankings: 0.489
 
dc.identifier.scopuseid_2-s2.0-50249188924
 
dc.identifier.spage23
 
dc.identifier.urihttp://hdl.handle.net/10722/57233
 
dc.languageeng
 
dc.publisherIEEE, Computer Society.
 
dc.relation.ispartofData Compression Conference Proceedings
 
dc.relation.referencesReferences in Scopus
 
dc.rights©2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
 
dc.rightsCreative Commons: Attribution 3.0 Hong Kong License
 
dc.subjectComputers
 
dc.subjectInformation science and information theory and abstracting and indexing services
 
dc.titleCompressed index for dictionary matching
 
dc.typeConference_Paper
 
<?xml encoding="utf-8" version="1.0"?>
<item><contributor.author>Hon, WK</contributor.author>
<contributor.author>Shah, R</contributor.author>
<contributor.author>Lam, TW</contributor.author>
<contributor.author>Tam, SL</contributor.author>
<contributor.author>Vitter, JS</contributor.author>
<date.accessioned>2010-04-12T01:30:12Z</date.accessioned>
<date.available>2010-04-12T01:30:12Z</date.available>
<date.issued>2008</date.issued>
<identifier.citation>Data Compression Conference Proceedings, 2008, p. 23-32</identifier.citation>
<identifier.issn>1068-0314</identifier.issn>
<identifier.uri>http://hdl.handle.net/10722/57233</identifier.uri>
<description.abstract>The past few years have witnessed several exciting results on compressed representation of a string T that supports efficient pattern matching, and the space complexity has been reduced to |T|Hk(T) +o(|T| log &#963;) bits [8, 10], where Hk(T) denotes the kth-order empirical entropy of T, and a is the size of the alphabet. In this paper we study compressed representation for another classical problem of string indexing, which is called dictionary matching in the literature. Precisely, a collection V of strings (called patterns) of total length n is to be indexed so that given a text T, the occurrences of the patterns in T can be found efficiently. In this paper we show how to exploit a sampling technique to compress the existing O(n)-word index to an (nHk(D) + o(n log &#963;))-bit index with only a small sacrifice in search time. &#169; 2008 IEEE.</description.abstract>
<language>eng</language>
<publisher>IEEE, Computer Society.</publisher>
<relation.ispartof>Data Compression Conference Proceedings</relation.ispartof>
<rights>&#169;2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.</rights>
<rights>Creative Commons: Attribution 3.0 Hong Kong License</rights>
<subject>Computers</subject>
<subject>Information science and information theory and abstracting and indexing services</subject>
<title>Compressed index for dictionary matching</title>
<type>Conference_Paper</type>
<description.nature>published_or_final_version</description.nature>
<identifier.doi>10.1109/DCC.2008.62</identifier.doi>
<identifier.scopus>eid_2-s2.0-50249188924</identifier.scopus>
<identifier.hkuros>146737</identifier.hkuros>
<relation.references>http://www.scopus.com/mlt/select.url?eid=2-s2.0-50249188924&amp;selection=ref&amp;src=s&amp;origin=recordpage</relation.references>
<identifier.spage>23</identifier.spage>
<identifier.epage>32</identifier.epage>
<bitstream.url>http://hub.hku.hk/bitstream/10722/57233/1/144834.pdf</bitstream.url>
</item>
Author Affiliations
  1. The University of Hong Kong
  2. National Tsing Hua University
  3. Purdue University
  4. Louisiana State University