Compressed index for dictionary matching

Hon, WK; Shah, R; Lam, TW; Tam, SL; Vitter, JS

File Download

144834.pdf

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/DCC.2008.62
Scopus: eid_2-s2.0-50249188924
Find via

Supplementary

Citations:
- Scopus: 0
Appears in Collections:
- Computer Science: Conference papers

Conference Paper: Compressed index for dictionary matching

Title	Compressed index for dictionary matching
Authors	Hon, WK Shah, R Lam, TW Tam, SL Vitter, JS
Keywords	Computers Information science and information theory and abstracting and indexing services
Issue Date	2008
Publisher	IEEE, Computer Society.
Citation	Data Compression Conference Proceedings, 2008, p. 23-32 How to Cite? DOI: http://dx.doi.org/10.1109/DCC.2008.62
Abstract	The past few years have witnessed several exciting results on compressed representation of a string T that supports efficient pattern matching, and the space complexity has been reduced to \|T\|Hk(T) +o(\|T\| log σ) bits [8, 10], where Hk(T) denotes the kth-order empirical entropy of T, and a is the size of the alphabet. In this paper we study compressed representation for another classical problem of string indexing, which is called dictionary matching in the literature. Precisely, a collection V of strings (called patterns) of total length n is to be indexed so that given a text T, the occurrences of the patterns in T can be found efficiently. In this paper we show how to exploit a sampling technique to compress the existing O(n)-word index to an (nHk(D) + o(n log σ))-bit index with only a small sacrifice in search time. © 2008 IEEE.
Persistent Identifier	http://hdl.handle.net/10722/57233
ISSN	1068-0314 2023 SCImago Journal Rankings: 0.371
References	References in Scopus

DC Field	Value	Language
dc.contributor.author	Hon, WK	en_HK
dc.contributor.author	Shah, R	en_HK
dc.contributor.author	Lam, TW	en_HK
dc.contributor.author	Tam, SL	en_HK
dc.contributor.author	Vitter, JS	en_HK
dc.date.accessioned	2010-04-12T01:30:12Z	-
dc.date.available	2010-04-12T01:30:12Z	-
dc.date.issued	2008	en_HK
dc.identifier.citation	Data Compression Conference Proceedings, 2008, p. 23-32	en_HK
dc.identifier.issn	1068-0314	en_HK
dc.identifier.uri	http://hdl.handle.net/10722/57233	-
dc.description.abstract	The past few years have witnessed several exciting results on compressed representation of a string T that supports efficient pattern matching, and the space complexity has been reduced to \|T\|Hk(T) +o(\|T\| log σ) bits [8, 10], where Hk(T) denotes the kth-order empirical entropy of T, and a is the size of the alphabet. In this paper we study compressed representation for another classical problem of string indexing, which is called dictionary matching in the literature. Precisely, a collection V of strings (called patterns) of total length n is to be indexed so that given a text T, the occurrences of the patterns in T can be found efficiently. In this paper we show how to exploit a sampling technique to compress the existing O(n)-word index to an (nHk(D) + o(n log σ))-bit index with only a small sacrifice in search time. © 2008 IEEE.	en_HK
dc.language	eng	en_HK
dc.publisher	IEEE, Computer Society.	en_HK
dc.relation.ispartof	Data Compression Conference Proceedings	en_HK
dc.rights	©2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.	-
dc.subject	Computers	en_HK
dc.subject	Information science and information theory and abstracting and indexing services	en_HK
dc.title	Compressed index for dictionary matching	en_HK
dc.type	Conference_Paper	en_HK
dc.identifier.email	Lam, TW:twlam@cs.hku.hk	en_HK
dc.identifier.authority	Lam, TW=rp00135	en_HK
dc.description.nature	published_or_final_version	en_HK
dc.identifier.doi	10.1109/DCC.2008.62	en_HK
dc.identifier.scopus	eid_2-s2.0-50249188924	en_HK
dc.identifier.hkuros	146737	-
dc.relation.references	http://www.scopus.com/mlt/select.url?eid=2-s2.0-50249188924&selection=ref&src=s&origin=recordpage	en_HK
dc.identifier.spage	23	en_HK
dc.identifier.epage	32	en_HK
dc.identifier.scopusauthorid	Hon, WK=7004282818	en_HK
dc.identifier.scopusauthorid	Shah, R=35365088300	en_HK
dc.identifier.scopusauthorid	Lam, TW=7202523165	en_HK
dc.identifier.scopusauthorid	Tam, SL=14042926200	en_HK
dc.identifier.scopusauthorid	Vitter, JS=7005508549	en_HK
dc.identifier.issnl	1068-0314	-

File Download

Links for fulltext

(May Require Subscription)

Supplementary

Conference Paper: Compressed index for dictionary matching

Export via OAI-PMH Interface in XML Formats

OR

Export to Other Non-XML Formats