Article: Approximate string matching using compressed suffix arrays
| Title | Approximate string matching using compressed suffix arrays |
|---|---|
| Authors | Huynh, TND2 Hon, WK1 Lam, TW1 Sung, WK2 |
| Issue Date | 2006 |
| Publisher | Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/tcs |
| Citation | Theoretical Computer Science, 2006, v. 352 n. 1-3, p. 240-249 [How to Cite?] DOI: http://dx.doi.org/10.1016/j.tcs.2005.11.022 |
| Abstract | Let T be a text of length n and P be a pattern of length m, both strings over a fixed finite alphabet A. The k-difference (k-mismatch, respectively) problem is to find all occurrences of P in T that have edit distance (Hamming distance, respectively) at most k from P. In this paper we investigate a well-studied case in which T is fixed and preprocessed into an indexing data structure so that any pattern query can be answered faster. We give a solution using an O(nlogn) bits indexing data structure with O(|A|kmk·max(k,logn) +occ) query time, where occ is the number of occurrences. The best previous result requires O(nlogn) bits indexing data structure and gives O(|A|kmk+2+occ) query time. Our solution also allows us to exploit compressed suffix arrays to reduce the indexing space to O(n) bits, while increasing the query time by an O(logn) factor only. © 2005 Elsevier B.V. All right reserved. |
| ISSN | 0304-3975 2011 Impact Factor: 0.665 2011 SCImago Journal Rankings: 0.055 |
| DOI | http://dx.doi.org/10.1016/j.tcs.2005.11.022 |
| References | References in Scopus |
| dc.contributor.author | Huynh, TND |
|---|---|
| dc.contributor.author | Hon, WK |
| dc.contributor.author | Lam, TW |
| dc.contributor.author | Sung, WK |
| dc.date.accessioned | 2012-06-26T06:37:14Z |
| dc.date.available | 2012-06-26T06:37:14Z |
| dc.date.issued | 2006 |
| dc.description.abstract | Let T be a text of length n and P be a pattern of length m, both strings over a fixed finite alphabet A. The k-difference (k-mismatch, respectively) problem is to find all occurrences of P in T that have edit distance (Hamming distance, respectively) at most k from P. In this paper we investigate a well-studied case in which T is fixed and preprocessed into an indexing data structure so that any pattern query can be answered faster. We give a solution using an O(nlogn) bits indexing data structure with O(|A|kmk·max(k,logn) +occ) query time, where occ is the number of occurrences. The best previous result requires O(nlogn) bits indexing data structure and gives O(|A|kmk+2+occ) query time. Our solution also allows us to exploit compressed suffix arrays to reduce the indexing space to O(n) bits, while increasing the query time by an O(logn) factor only. © 2005 Elsevier B.V. All right reserved. |
| dc.description.nature | Link_to_subscribed_fulltext |
| dc.identifier.citation | Theoretical Computer Science, 2006, v. 352 n. 1-3, p. 240-249 [How to Cite?] DOI: http://dx.doi.org/10.1016/j.tcs.2005.11.022 |
| dc.identifier.doi | http://dx.doi.org/10.1016/j.tcs.2005.11.022 |
| dc.identifier.epage | 249 |
| dc.identifier.isi | WOS:000235826900018 |
| dc.identifier.issn | 0304-3975 2011 Impact Factor: 0.665 2011 SCImago Journal Rankings: 0.055 |
| dc.identifier.issue | 1-3 |
| dc.identifier.scopus | eid_2-s2.0-32644436921 |
| dc.identifier.spage | 240 |
| dc.identifier.uri | http://hdl.handle.net/10722/152330 |
| dc.identifier.volume | 352 |
| dc.language | eng |
| dc.publisher | Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/tcs |
| dc.publisher.place | Netherlands |
| dc.relation.ispartof | Theoretical Computer Science |
| dc.relation.references | References in Scopus |
| dc.title | Approximate string matching using compressed suffix arrays |
| dc.type | Article |
Author Affiliations
- The University of Hong Kong
- National University of Singapore

