Article: Approximate string matching using compressed suffix arrays

File Download Links for fulltext
(May Require Subscription)
Supplementary
  • Basic View
  • Metadata View
  • XML View
TitleApproximate string matching using compressed suffix arrays
AuthorsHuynh, TND2
Hon, WK2
Lam, TW1
Sung, WK1
Issue Date2004
PublisherSpringer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/
CitationLecture Notes In Computer Science (Including Subseries Lecture Notes In Artificial Intelligence And Lecture Notes In Bioinformatics), 2004, v. 3109, p. 434-444 [How to Cite?]
AbstractLet T be a text of length n and P be a pattern of length m, both strings over a fixed finite alphabet A. The k-difference (k-mismatch, respectively) problem is to find all occurrences of P in T that have edit distance (Hamming distance, respectively) at most k from P. In this paper we investigate a well-studied case in which k = 1 and T is fixed and preprocessed into an indexing data structure so that any pattern query can be answered faster [16-19]. This paper gives a solution using O(n) bits indexing data structure with O(m log 2 n) query time. To the best of our knowledge, this is the first result which requires linear indexing space. The results can be extended for the fc-difference problem with k ≥ 1. © Springer-Verlag 2004.
ISSN0302-9743
2011 SCImago Journal Rankings: 0.034
ReferencesReferences in Scopus
DC Field
Value
dc.contributor.authorHuynh, TND
dc.contributor.authorHon, WK
dc.contributor.authorLam, TW
dc.contributor.authorSung, WK
dc.date.accessioned2010-09-06T09:52:48Z
dc.date.available2010-09-06T09:52:48Z
dc.date.issued2004
dc.description.abstractLet T be a text of length n and P be a pattern of length m, both strings over a fixed finite alphabet A. The k-difference (k-mismatch, respectively) problem is to find all occurrences of P in T that have edit distance (Hamming distance, respectively) at most k from P. In this paper we investigate a well-studied case in which k = 1 and T is fixed and preprocessed into an indexing data structure so that any pattern query can be answered faster [16-19]. This paper gives a solution using O(n) bits indexing data structure with O(m log 2 n) query time. To the best of our knowledge, this is the first result which requires linear indexing space. The results can be extended for the fc-difference problem with k ≥ 1. © Springer-Verlag 2004.
dc.description.natureLink_to_subscribed_fulltext
dc.identifier.citationLecture Notes In Computer Science (Including Subseries Lecture Notes In Artificial Intelligence And Lecture Notes In Bioinformatics), 2004, v. 3109, p. 434-444 [How to Cite?]
dc.identifier.epage444
dc.identifier.hkuros130777
dc.identifier.issn0302-9743
2011 SCImago Journal Rankings: 0.034
dc.identifier.openurl
dc.identifier.scopuseid_2-s2.0-35048848454
dc.identifier.spage434
dc.identifier.urihttp://hdl.handle.net/10722/89134
dc.identifier.volume3109
dc.languageeng
dc.publisherSpringer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/
dc.publisher.placeGermany
dc.relation.ispartofLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.relation.referencesReferences in Scopus
dc.rightsTheoretical Computer Science. Copyright © Elsevier BV.
dc.titleApproximate string matching using compressed suffix arrays
dc.typeArticle
Author Affiliations
  1. The University of Hong Kong
  2. National University of Singapore