**Article:**Approximate string matching using compressed suffix arrays

Title | Approximate string matching using compressed suffix arrays |
---|---|

Authors | Huynh, TND2 Hon, WK1 Lam, TW1 Sung, WK2 |

Issue Date | 2006 |

Publisher | Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/tcs |

Citation | Theoretical Computer Science, 2006, v. 352 n. 1-3, p. 240-249 [How to Cite?] DOI: http://dx.doi.org/10.1016/j.tcs.2005.11.022 |

Abstract | Let T be a text of length n and P be a pattern of length m, both strings over a fixed finite alphabet A. The k-difference (k-mismatch, respectively) problem is to find all occurrences of P in T that have edit distance (Hamming distance, respectively) at most k from P. In this paper we investigate a well-studied case in which T is fixed and preprocessed into an indexing data structure so that any pattern query can be answered faster. We give a solution using an O(nlogn) bits indexing data structure with O(|A|kmk·max(k,logn) +occ) query time, where occ is the number of occurrences. The best previous result requires O(nlogn) bits indexing data structure and gives O(|A|kmk+2+occ) query time. Our solution also allows us to exploit compressed suffix arrays to reduce the indexing space to O(n) bits, while increasing the query time by an O(logn) factor only. © 2005 Elsevier B.V. All right reserved. |

ISSN | 0304-3975 2013 Impact Factor: 0.516 2013 SCImago Journal Rankings: 0.932 |

DOI | http://dx.doi.org/10.1016/j.tcs.2005.11.022 |

ISI Accession Number ID | WOS:000235826900018 |

References | References in Scopus |

DC Field | Value |
---|---|

dc.contributor.author | Huynh, TND |

dc.contributor.author | Hon, WK |

dc.contributor.author | Lam, TW |

dc.contributor.author | Sung, WK |

dc.date.accessioned | 2012-06-26T06:37:14Z |

dc.date.available | 2012-06-26T06:37:14Z |

dc.date.issued | 2006 |

dc.description.abstract | Let T be a text of length n and P be a pattern of length m, both strings over a fixed finite alphabet A. The k-difference (k-mismatch, respectively) problem is to find all occurrences of P in T that have edit distance (Hamming distance, respectively) at most k from P. In this paper we investigate a well-studied case in which T is fixed and preprocessed into an indexing data structure so that any pattern query can be answered faster. We give a solution using an O(nlogn) bits indexing data structure with O(|A|kmk·max(k,logn) +occ) query time, where occ is the number of occurrences. The best previous result requires O(nlogn) bits indexing data structure and gives O(|A|kmk+2+occ) query time. Our solution also allows us to exploit compressed suffix arrays to reduce the indexing space to O(n) bits, while increasing the query time by an O(logn) factor only. © 2005 Elsevier B.V. All right reserved. |

dc.description.nature | Link_to_subscribed_fulltext |

dc.identifier.citation | Theoretical Computer Science, 2006, v. 352 n. 1-3, p. 240-249 [How to Cite?] DOI: http://dx.doi.org/10.1016/j.tcs.2005.11.022 |

dc.identifier.doi | http://dx.doi.org/10.1016/j.tcs.2005.11.022 |

dc.identifier.epage | 249 |

dc.identifier.isi | WOS:000235826900018 |

dc.identifier.issn | 0304-3975 2013 Impact Factor: 0.516 2013 SCImago Journal Rankings: 0.932 |

dc.identifier.issue | 1-3 |

dc.identifier.scopus | eid_2-s2.0-32644436921 |

dc.identifier.spage | 240 |

dc.identifier.uri | http://hdl.handle.net/10722/152330 |

dc.identifier.volume | 352 |

dc.language | eng |

dc.publisher | Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/tcs |

dc.publisher.place | Netherlands |

dc.relation.ispartof | Theoretical Computer Science |

dc.relation.references | References in Scopus |

dc.title | Approximate string matching using compressed suffix arrays |

dc.type | Article |

<?xml encoding="utf-8" version="1.0"?> <item><contributor.author>Huynh, TND</contributor.author> <contributor.author>Hon, WK</contributor.author> <contributor.author>Lam, TW</contributor.author> <contributor.author>Sung, WK</contributor.author> <date.accessioned>2012-06-26T06:37:14Z</date.accessioned> <date.available>2012-06-26T06:37:14Z</date.available> <date.issued>2006</date.issued> <identifier.citation>Theoretical Computer Science, 2006, v. 352 n. 1-3, p. 240-249</identifier.citation> <identifier.issn>0304-3975</identifier.issn> <identifier.uri>http://hdl.handle.net/10722/152330</identifier.uri> <description.abstract>Let T be a text of length n and P be a pattern of length m, both strings over a fixed finite alphabet A. The k-difference (k-mismatch, respectively) problem is to find all occurrences of P in T that have edit distance (Hamming distance, respectively) at most k from P. In this paper we investigate a well-studied case in which T is fixed and preprocessed into an indexing data structure so that any pattern query can be answered faster. We give a solution using an O(nlogn) bits indexing data structure with O(|A|kmk·max(k,logn) +occ) query time, where occ is the number of occurrences. The best previous result requires O(nlogn) bits indexing data structure and gives O(|A|kmk+2+occ) query time. Our solution also allows us to exploit compressed suffix arrays to reduce the indexing space to O(n) bits, while increasing the query time by an O(logn) factor only. © 2005 Elsevier B.V. All right reserved.</description.abstract> <language>eng</language> <publisher>Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/tcs</publisher> <relation.ispartof>Theoretical Computer Science</relation.ispartof> <title>Approximate string matching using compressed suffix arrays</title> <type>Article</type> <description.nature>Link_to_subscribed_fulltext</description.nature> <identifier.doi>10.1016/j.tcs.2005.11.022</identifier.doi> <identifier.scopus>eid_2-s2.0-32644436921</identifier.scopus> <relation.references>http://www.scopus.com/mlt/select.url?eid=2-s2.0-32644436921&selection=ref&src=s&origin=recordpage</relation.references> <identifier.volume>352</identifier.volume> <identifier.issue>1-3</identifier.issue> <identifier.spage>240</identifier.spage> <identifier.epage>249</identifier.epage> <identifier.isi>WOS:000235826900018</identifier.isi> <publisher.place>Netherlands</publisher.place> </item>

Author Affiliations

- The University of Hong Kong
- National University of Singapore