Compressed indexes for approximate string matching

Chan, HL; Lam, TW; Sung, WK; Tam, SL; Wong, SS

File Download

re01.htm

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1007/s00453-008-9263-2
Scopus: eid_2-s2.0-77955655806
WOS: WOS:000279681700003
Find via

Supplementary

Bookmarks:
- CiteULike: 1
Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Computer Science: Journal/Magazine Articles

Article: Compressed indexes for approximate string matching

Title

Compressed indexes for approximate string matching

Authors

Chan, HL Lam, TW Sung, WK Tam, SL Wong, SS

Keywords

Approximate String Matching
Compressed Index

Issue Date

2010

Publisher

Springer New York LLC. The Journal's web site is located at http://link.springer.de/link/service/journals/00453/index.htm

Citation

Algorithmica (New York), 2010, v. 58 n. 2, p. 263-281 How to Cite?

DOI: http://dx.doi.org/10.1007/s00453-008-9263-2

Abstract

We revisit the problem of indexing a string S[1..n] to support finding all substrings in S that match a given pattern P[1..m] with at most k errors. Previous solutions either require an index of size exponential in k or need Ω(m k ) time for searching. Motivated by the indexing of DNA, we investigate space efficient indexes that occupy only O(n) space. For k=1, we give an index to support matching in O(m+occ+log∈nlog∈log∈n) time. The previously best solution achieving this time complexity requires an index of O(nlog∈n) space. This new index can also be used to improve existing indexes for k≥2 errors. Among others, it can support 2-error matching in O(mlog∈nlog∈log∈n+occ) time, and k-error matching, for any k>2, in O(m k-1log∈nlog∈log∈n+occ) time. © 2008 Springer Science+Business Media, LLC.

Persistent Identifier

http://hdl.handle.net/10722/152439

ISSN

0178-4617

2023 Impact Factor: 0.9

2023 SCImago Journal Rankings: 0.905

ISI Accession Number ID

WOS:000279681700003

Funding Agency	Grant Number
Hong Kong RGC	HKU 7140/06E

Funding Information:

The research T.-W. Lam was supported by the Hong Kong RGC Grant HKU 7140/06E.

References

References in Scopus

Grants

Compressed indexes for approximate string matching, with applications to biological sequences

DC Field	Value	Language
dc.contributor.author	Chan, HL	en_US
dc.contributor.author	Lam, TW	en_US
dc.contributor.author	Sung, WK	en_US
dc.contributor.author	Tam, SL	en_US
dc.contributor.author	Wong, SS	en_US
dc.date.accessioned	2012-06-26T06:39:05Z	-
dc.date.available	2012-06-26T06:39:05Z	-
dc.date.issued	2010	en_US
dc.identifier.citation	Algorithmica (New York), 2010, v. 58 n. 2, p. 263-281	en_US
dc.identifier.issn	0178-4617	en_US
dc.identifier.uri	http://hdl.handle.net/10722/152439	-
dc.description.abstract	We revisit the problem of indexing a string S[1..n] to support finding all substrings in S that match a given pattern P[1..m] with at most k errors. Previous solutions either require an index of size exponential in k or need Ω(m k ) time for searching. Motivated by the indexing of DNA, we investigate space efficient indexes that occupy only O(n) space. For k=1, we give an index to support matching in O(m+occ+log∈nlog∈log∈n) time. The previously best solution achieving this time complexity requires an index of O(nlog∈n) space. This new index can also be used to improve existing indexes for k≥2 errors. Among others, it can support 2-error matching in O(mlog∈nlog∈log∈n+occ) time, and k-error matching, for any k>2, in O(m k-1log∈nlog∈log∈n+occ) time. © 2008 Springer Science+Business Media, LLC.	en_US
dc.language	eng	en_US
dc.publisher	Springer New York LLC. The Journal's web site is located at http://link.springer.de/link/service/journals/00453/index.htm	en_US
dc.relation.ispartof	Algorithmica (New York)	en_US
dc.rights	The original publication is available at www.springerlink.com	-
dc.subject	Approximate String Matching	en_US
dc.subject	Compressed Index	en_US
dc.title	Compressed indexes for approximate string matching	en_US
dc.type	Article	en_US
dc.identifier.email	Chan, HL:hlchan@cs.hku.hk	en_US
dc.identifier.email	Lam, TW:twlam@cs.hku.hk	en_US
dc.identifier.authority	Chan, HL=rp01310	en_US
dc.identifier.authority	Lam, TW=rp00135	en_US
dc.description.nature	link_to_subscribed_fulltext	en_US
dc.identifier.doi	10.1007/s00453-008-9263-2	en_US
dc.identifier.scopus	eid_2-s2.0-77955655806	en_US
dc.identifier.hkuros	177358	-
dc.relation.references	http://www.scopus.com/mlt/select.url?eid=2-s2.0-77955655806&selection=ref&src=s&origin=recordpage	en_US
dc.identifier.volume	58	en_US
dc.identifier.issue	2	en_US
dc.identifier.spage	263	en_US
dc.identifier.epage	281	en_US
dc.identifier.isi	WOS:000279681700003	-
dc.publisher.place	United States	en_US
dc.relation.project	Compressed indexes for approximate string matching, with applications to biological sequences	-
dc.identifier.scopusauthorid	Chan, HL=7403402384	en_US
dc.identifier.scopusauthorid	Lam, TW=7202523165	en_US
dc.identifier.scopusauthorid	Sung, WK=13310059700	en_US
dc.identifier.scopusauthorid	Tam, SL=14042926200	en_US
dc.identifier.scopusauthorid	Wong, SS=8439889300	en_US
dc.identifier.citeulike	3811405	-
dc.identifier.issnl	0178-4617	-

File Download

Links for fulltext

(May Require Subscription)

Supplementary

Article: Compressed indexes for approximate string matching

Export via OAI-PMH Interface in XML Formats

OR

Export to Other Non-XML Formats