Improved approximate string matching using compressed suffix data structures

Lam, TW; Sung, WK; Wong, SS

Publisher Website: 10.1007/s00453-007-9104-8
Scopus: eid_2-s2.0-43949112336
WOS: WOS:000255874200005
Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Computer Science: Conference papers

Conference Paper: Improved approximate string matching using compressed suffix data structures

Title	Improved approximate string matching using compressed suffix data structures
Authors	Lam, TW; Sung, WK; Wong, SS
Issue Date	2008
Publisher	Springer New York LLC. The Journal's web site is located at http://link.springer.de/link/service/journals/00453/index.htm
Citation	Algorithmica (New York), 2008, v. 51 n. 3, p. 298-314. DOI: http://dx.doi.org/10.1007/s00453-007-9104-8
Abstract	Approximate string matching is about finding a given string pattern in a text by allowing some degree of errors. In this paper we present a space efficient data structure to solve the 1-mismatch and 1-difference problems. Given a text $T$ of length $n$ over an alphabet $A$, we can preprocess $T$ and give an $O(n \sqrt{\log n} \log |A|)$-bit space data structure so that, for any query pattern $P$ of length $m$, we can find all 1-mismatch (or 1-difference) occurrences of $P$ in $O(|A| m \log\log n + occ)$ time, where $occ$ is the number of occurrences. This is the fastest known query time given that the space of the data structure is $o(n \log^2 n)$ bits. The space of our data structure can be further reduced to $O(n \log |A|)$ with the query time increasing by a factor of $\log^{\varepsilon} n$, for $0 < \varepsilon \le 1$. Furthermore, our solution can be generalized to solve the k-mismatch (and the k-difference) problem in $O(|A|^k m^k (k + \log\log n) + occ)$ and $O(\log^{\varepsilon} n \, (|A|^k m^k (k + \log\log n) + occ))$ time using an $O(n \sqrt{\log n} \log |A|)$-bit and an $O(n \log |A|)$-bit indexing data structure, respectively. We assume that the alphabet size $|A|$ is bounded by $O(2^{\sqrt{\log n}})$ for the $O(n \sqrt{\log n} \log |A|)$-bit space data structure. © 2007 Springer Science+Business Media, LLC.
Persistent Identifier	http://hdl.handle.net/10722/151910
ISSN	0178-4617 (2023 Impact Factor: 0.9; 2023 SCImago Journal Rankings: 0.905)
ISI Accession Number ID	WOS:000255874200005
References	References in Scopus
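
To make the k-mismatch problem from the abstract concrete, here is a minimal Python sketch. It is not the paper's compressed suffix index, which the abstract does not describe in detail. It shows two naive baselines: a direct scan that counts mismatches at every text position, and a search that enumerates every string within Hamming distance k of the pattern and looks each one up exactly. The number of such variants grows like |A|^k m^k, which has the same shape as the leading factor in the stated query bound, though the abstract does not say the paper's algorithm works this way. All function names are hypothetical, and only mismatches (substitutions) are covered, not the k-difference variant.

```python
from itertools import combinations, product


def hamming_neighbors(pattern, alphabet, k):
    """Yield each string within Hamming distance k of pattern exactly once."""
    m = len(pattern)
    for d in range(k + 1):
        # Choose which d positions of the pattern will be substituted.
        for positions in combinations(range(m), d):
            # At each chosen position, try every character except the original.
            choices = [[c for c in alphabet if c != pattern[i]] for i in positions]
            for replacement in product(*choices):
                chars = list(pattern)
                for i, c in zip(positions, replacement):
                    chars[i] = c
                yield "".join(chars)


def k_mismatch_by_enumeration(text, pattern, alphabet, k):
    """Find k-mismatch occurrences by exact search for every variant."""
    hits = set()
    for variant in hamming_neighbors(pattern, alphabet, k):
        start = text.find(variant)
        while start != -1:
            hits.add(start)
            start = text.find(variant, start + 1)
    return sorted(hits)


def k_mismatch_brute_force(text, pattern, k):
    """Find k-mismatch occurrences by comparing the pattern at every position."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if sum(a != b for a, b in zip(text[i:i + m], pattern)) <= k
    ]
```

Both functions return the sorted start positions in the text where the pattern occurs with at most k mismatches. An index such as the one in the paper would replace the repeated `str.find` scans with queries on a preprocessed structure.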

DC Field	Value	Language
dc.contributor.author	Lam, TW	en_US
dc.contributor.author	Sung, WK	en_US
dc.contributor.author	Wong, SS	en_US
dc.date.accessioned	2012-06-26T06:30:42Z	-
dc.date.available	2012-06-26T06:30:42Z	-
dc.date.issued	2008	en_US
dc.identifier.citation	Algorithmica (New York), 2008, v. 51 n. 3, p. 298-314	en_US
dc.identifier.issn	0178-4617	en_US
dc.identifier.uri	http://hdl.handle.net/10722/151910	-
dc.description.abstract	Approximate string matching is about finding a given string pattern in a text by allowing some degree of errors. In this paper we present a space efficient data structure to solve the 1-mismatch and 1-difference problems. Given a text $T$ of length $n$ over an alphabet $A$, we can preprocess $T$ and give an $O(n \sqrt{\log n} \log |A|)$-bit space data structure so that, for any query pattern $P$ of length $m$, we can find all 1-mismatch (or 1-difference) occurrences of $P$ in $O(|A| m \log\log n + occ)$ time, where $occ$ is the number of occurrences. This is the fastest known query time given that the space of the data structure is $o(n \log^2 n)$ bits. The space of our data structure can be further reduced to $O(n \log |A|)$ with the query time increasing by a factor of $\log^{\varepsilon} n$, for $0 < \varepsilon \le 1$. Furthermore, our solution can be generalized to solve the k-mismatch (and the k-difference) problem in $O(|A|^k m^k (k + \log\log n) + occ)$ and $O(\log^{\varepsilon} n \, (|A|^k m^k (k + \log\log n) + occ))$ time using an $O(n \sqrt{\log n} \log |A|)$-bit and an $O(n \log |A|)$-bit indexing data structure, respectively. We assume that the alphabet size $|A|$ is bounded by $O(2^{\sqrt{\log n}})$ for the $O(n \sqrt{\log n} \log |A|)$-bit space data structure. © 2007 Springer Science+Business Media, LLC.	en_US
dc.language	eng	en_US
dc.publisher	Springer New York LLC. The Journal's web site is located at http://link.springer.de/link/service/journals/00453/index.htm	en_US
dc.relation.ispartof	Algorithmica (New York)	en_US
dc.title	Improved approximate string matching using compressed suffix data structures	en_US
dc.type	Conference_Paper	en_US
dc.identifier.email	Lam, TW:twlam@cs.hku.hk	en_US
dc.identifier.authority	Lam, TW=rp00135	en_US
dc.description.nature	link_to_subscribed_fulltext	en_US
dc.identifier.doi	10.1007/s00453-007-9104-8	en_US
dc.identifier.scopus	eid_2-s2.0-43949112336	en_US
dc.relation.references	http://www.scopus.com/mlt/select.url?eid=2-s2.0-43949112336&selection=ref&src=s&origin=recordpage	en_US
dc.identifier.volume	51	en_US
dc.identifier.issue	3	en_US
dc.identifier.spage	298	en_US
dc.identifier.epage	314	en_US
dc.identifier.isi	WOS:000255874200005	-
dc.publisher.place	United States	en_US
dc.identifier.scopusauthorid	Lam, TW=7202523165	en_US
dc.identifier.scopusauthorid	Sung, WK=13310059700	en_US
dc.identifier.scopusauthorid	Wong, SS=8439889300	en_US
dc.identifier.issnl	0178-4617	-
