File Download
There are no files associated with this item.
Links for fulltext
(May Require Subscription)
 Publisher Website: 10.1007/9783319199290_21
 Find via
Supplementary

Citations:
 Appears in Collections:
Conference Paper: Dictionary matching with uneven gaps
Title  Dictionary matching with uneven gaps 

Authors  
Keywords  Dictionary matching Point enclosure queries 
Issue Date  2015 
Publisher  Springer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/ 
Citation  The 26th Annual Symposium on Combinatorial Pattern Matching (CPM 2015), Ischia Island, Italy, 29 June1 July 2015. In Lecture Notes in Computer Science, 2015, v. 9133, p. 247260 How to Cite? 
Abstract  A gappattern is a sequence of subpatterns separated by bounded sequences of don’t care characters (called gaps). A onegappattern is a pattern of the form P[α,β]Q , where P and Q are strings drawn from alphabet Σ and [α,β] are lower and upper bounds on the gap size g . The gap size g is the number of don’t care characters between P and Q . The dictionary matching problem with onegap is to index a collection of onegappatterns, so as to identify all substrings of a query text T that match with any onegappattern in the collection. Let D be such a collection of d patterns, where D={P i [α i ,β i ]Q i ∣1≤i≤d} . Let n=∑ d i=1 P i +Q i  . Let γ and λ be two parameters defined on D as follows: γ={j∣j∈[α i ,β i ],1≤i≤d} and λ={α i ,β i ∣1≤i≤d} . Specifically γ is the total number gap lengths possible over all patterns in D and λ is the number of distinct gap boundaries across all the patterns. We present a linear space solution (i.e., O(n) words) for answering a dictionary matching query on D in time O(Tγlogλlogd+occ) , where occ is the output size. The query time can be improved to O(Tγ+occ) using O(n+d 1+ϵ ) space, where ϵ>0 is an arbitrarily small constant. Additionally, we show a compact/succinct space index offering a spacetime tradeoff. In the special case where parameters α i and β i ’s for all the patterns are same, our results improve upon the work by Amir et al. [CPM, 2014]. We also explore several related cases where gaps can occur at arbitrary locations and where gap can be induced in the text rather than pattern. 
Description  LNCS v. 9133 entitled: Combinatorial Pattern Matching: 26th Annual Symposium, CPM 2015 ... Proceedings 
Persistent Identifier  http://hdl.handle.net/10722/214759 
ISBN  
ISSN  2005 Impact Factor: 0.402 2015 SCImago Journal Rankings: 0.252 
DC Field  Value  Language 

dc.contributor.author  Hon, WK   
dc.contributor.author  Lam, TW   
dc.contributor.author  Shah, R   
dc.contributor.author  Thankachan, SV   
dc.contributor.author  Ting, HF   
dc.contributor.author  Yang, Y   
dc.date.accessioned  20150821T11:54:24Z   
dc.date.available  20150821T11:54:24Z   
dc.date.issued  2015   
dc.identifier.citation  The 26th Annual Symposium on Combinatorial Pattern Matching (CPM 2015), Ischia Island, Italy, 29 June1 July 2015. In Lecture Notes in Computer Science, 2015, v. 9133, p. 247260   
dc.identifier.isbn  9783319199283   
dc.identifier.issn  03029743   
dc.identifier.uri  http://hdl.handle.net/10722/214759   
dc.description  LNCS v. 9133 entitled: Combinatorial Pattern Matching: 26th Annual Symposium, CPM 2015 ... Proceedings   
dc.description.abstract  A gappattern is a sequence of subpatterns separated by bounded sequences of don’t care characters (called gaps). A onegappattern is a pattern of the form P[α,β]Q , where P and Q are strings drawn from alphabet Σ and [α,β] are lower and upper bounds on the gap size g . The gap size g is the number of don’t care characters between P and Q . The dictionary matching problem with onegap is to index a collection of onegappatterns, so as to identify all substrings of a query text T that match with any onegappattern in the collection. Let D be such a collection of d patterns, where D={P i [α i ,β i ]Q i ∣1≤i≤d} . Let n=∑ d i=1 P i +Q i  . Let γ and λ be two parameters defined on D as follows: γ={j∣j∈[α i ,β i ],1≤i≤d} and λ={α i ,β i ∣1≤i≤d} . Specifically γ is the total number gap lengths possible over all patterns in D and λ is the number of distinct gap boundaries across all the patterns. We present a linear space solution (i.e., O(n) words) for answering a dictionary matching query on D in time O(Tγlogλlogd+occ) , where occ is the output size. The query time can be improved to O(Tγ+occ) using O(n+d 1+ϵ ) space, where ϵ>0 is an arbitrarily small constant. Additionally, we show a compact/succinct space index offering a spacetime tradeoff. In the special case where parameters α i and β i ’s for all the patterns are same, our results improve upon the work by Amir et al. [CPM, 2014]. We also explore several related cases where gaps can occur at arbitrary locations and where gap can be induced in the text rather than pattern.   
dc.language  eng   
dc.publisher  Springer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/   
dc.relation.ispartof  Lecture Notes in Computer Science   
dc.rights  The final publication is available at Springer via http://dx.doi.org/[insert DOI]   
dc.subject  Dictionary matching   
dc.subject  Point enclosure queries   
dc.title  Dictionary matching with uneven gaps   
dc.type  Conference_Paper   
dc.identifier.email  Lam, TW: twlam@cs.hku.hk   
dc.identifier.email  Ting, HF: hfting@cs.hku.hk   
dc.identifier.authority  Lam, TW=rp00135   
dc.identifier.authority  Ting, HF=rp00177   
dc.identifier.doi  10.1007/9783319199290_21   
dc.identifier.hkuros  249628   
dc.identifier.volume  9133   
dc.identifier.spage  247   
dc.identifier.epage  260   
dc.publisher.place  Germany   
dc.customcontrol.immutable  sml 150929   