File Download
There are no files associated with this item.
Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1007/978-3-319-19929-0_21
- Scopus: eid_2-s2.0-84949008915
- Find via
Supplementary
-
Citations:
- Scopus: 0
- Appears in Collections:
Conference Paper: Dictionary matching with uneven gaps
Title | Dictionary matching with uneven gaps |
---|---|
Authors | |
Keywords | Dictionary matching Point enclosure queries |
Issue Date | 2015 |
Publisher | Springer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/ |
Citation | The 26th Annual Symposium on Combinatorial Pattern Matching (CPM 2015), Ischia Island, Italy, 29 June-1 July 2015. In Lecture Notes in Computer Science, 2015, v. 9133, p. 247-260 How to Cite? |
Abstract | A gap-pattern is a sequence of sub-patterns separated by bounded sequences of don’t care characters (called gaps). A one-gap-pattern is a pattern of the form P[α,β]Q , where P and Q are strings drawn from alphabet Σ and [α,β] are lower and upper bounds on the gap size g . The gap size g is the number of don’t care characters between P and Q . The dictionary matching problem with one-gap is to index a collection of one-gap-patterns, so as to identify all sub-strings of a query text T that match with any one-gap-pattern in the collection. Let D be such a collection of d patterns, where D={P i [α i ,β i ]Q i ∣1≤i≤d} . Let n=∑ d i=1 |P i |+|Q i | . Let γ and λ be two parameters defined on D as follows: γ=|{j∣j∈[α i ,β i ],1≤i≤d}| and λ=|{α i ,β i ∣1≤i≤d}| . Specifically γ is the total number gap lengths possible over all patterns in D and λ is the number of distinct gap boundaries across all the patterns. We present a linear space solution (i.e., O(n) words) for answering a dictionary matching query on D in time O(|T|γlogλlogd+occ) , where occ is the output size. The query time can be improved to O(|T|γ+occ) using O(n+d 1+ϵ ) space, where ϵ>0 is an arbitrarily small constant. Additionally, we show a compact/succinct space index offering a space-time trade-off. In the special case where parameters α i and β i ’s for all the patterns are same, our results improve upon the work by Amir et al. [CPM, 2014]. We also explore several related cases where gaps can occur at arbitrary locations and where gap can be induced in the text rather than pattern. |
Description | LNCS v. 9133 entitled: Combinatorial Pattern Matching: 26th Annual Symposium, CPM 2015 ... Proceedings |
Persistent Identifier | http://hdl.handle.net/10722/214759 |
ISBN | |
ISSN | 2023 SCImago Journal Rankings: 0.606 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hon, WK | - |
dc.contributor.author | Lam, TW | - |
dc.contributor.author | Shah, R | - |
dc.contributor.author | Thankachan, SV | - |
dc.contributor.author | Ting, HF | - |
dc.contributor.author | Yang, Y | - |
dc.date.accessioned | 2015-08-21T11:54:24Z | - |
dc.date.available | 2015-08-21T11:54:24Z | - |
dc.date.issued | 2015 | - |
dc.identifier.citation | The 26th Annual Symposium on Combinatorial Pattern Matching (CPM 2015), Ischia Island, Italy, 29 June-1 July 2015. In Lecture Notes in Computer Science, 2015, v. 9133, p. 247-260 | - |
dc.identifier.isbn | 978-3-319-19928-3 | - |
dc.identifier.issn | 0302-9743 | - |
dc.identifier.uri | http://hdl.handle.net/10722/214759 | - |
dc.description | LNCS v. 9133 entitled: Combinatorial Pattern Matching: 26th Annual Symposium, CPM 2015 ... Proceedings | - |
dc.description.abstract | A gap-pattern is a sequence of sub-patterns separated by bounded sequences of don’t care characters (called gaps). A one-gap-pattern is a pattern of the form P[α,β]Q , where P and Q are strings drawn from alphabet Σ and [α,β] are lower and upper bounds on the gap size g . The gap size g is the number of don’t care characters between P and Q . The dictionary matching problem with one-gap is to index a collection of one-gap-patterns, so as to identify all sub-strings of a query text T that match with any one-gap-pattern in the collection. Let D be such a collection of d patterns, where D={P i [α i ,β i ]Q i ∣1≤i≤d} . Let n=∑ d i=1 |P i |+|Q i | . Let γ and λ be two parameters defined on D as follows: γ=|{j∣j∈[α i ,β i ],1≤i≤d}| and λ=|{α i ,β i ∣1≤i≤d}| . Specifically γ is the total number gap lengths possible over all patterns in D and λ is the number of distinct gap boundaries across all the patterns. We present a linear space solution (i.e., O(n) words) for answering a dictionary matching query on D in time O(|T|γlogλlogd+occ) , where occ is the output size. The query time can be improved to O(|T|γ+occ) using O(n+d 1+ϵ ) space, where ϵ>0 is an arbitrarily small constant. Additionally, we show a compact/succinct space index offering a space-time trade-off. In the special case where parameters α i and β i ’s for all the patterns are same, our results improve upon the work by Amir et al. [CPM, 2014]. We also explore several related cases where gaps can occur at arbitrary locations and where gap can be induced in the text rather than pattern. | - |
dc.language | eng | - |
dc.publisher | Springer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/ | - |
dc.relation.ispartof | Lecture Notes in Computer Science | - |
dc.rights | The final publication is available at Springer via http://dx.doi.org/[insert DOI] | - |
dc.subject | Dictionary matching | - |
dc.subject | Point enclosure queries | - |
dc.title | Dictionary matching with uneven gaps | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Lam, TW: twlam@cs.hku.hk | - |
dc.identifier.email | Ting, HF: hfting@cs.hku.hk | - |
dc.identifier.authority | Lam, TW=rp00135 | - |
dc.identifier.authority | Ting, HF=rp00177 | - |
dc.identifier.doi | 10.1007/978-3-319-19929-0_21 | - |
dc.identifier.scopus | eid_2-s2.0-84949008915 | - |
dc.identifier.hkuros | 249628 | - |
dc.identifier.volume | 9133 | - |
dc.identifier.spage | 247 | - |
dc.identifier.epage | 260 | - |
dc.publisher.place | Germany | - |
dc.customcontrol.immutable | sml 150929 | - |
dc.identifier.issnl | 0302-9743 | - |