File Download
Links for fulltext
(May Require Subscription)
- Publisher Website: 10.14778/3137628.3137644
- Scopus: eid_2-s2.0-85037047212
- WOS: WOS:000416492900016
- Find via
Supplementary
- Citations:
- Appears in Collections:
Article: Forward Scan based Plane Sweep Algorithm for Parallel Interval Joins
Title | Forward Scan based Plane Sweep Algorithm for Parallel Interval Joins |
---|---|
Authors | |
Issue Date | 2017 |
Publisher | Very Large Data Base (VLDB) Endowment Inc. The Journal's web site is located at http://vldb.org/pvldb/index.html |
Citation | Proceedings of the VLDB Endowment (PVLDB), 2017, v. 10, p. 1346-1357 How to Cite? |
Abstract | The interval join is a basic operation that finds application in temporal, spatial, and uncertain databases. Although a number of centralized and distributed algorithms have been proposed for the efficient
evaluation of interval joins, classic plane sweep approaches have not been considered at their full potential. A recent piece of related work proposes an optimized approach based on plane sweep
(PS) for modern hardware, showing that it greatly outperforms previous work. However, this approach depends on the development of a complex data structure and its parallelization has not been adequately
studied. In this paper, we explore the applicability of a largely ignored forward scan (FS) based plane sweep algorithm, which is extremely simple to implement. We propose two optimizations of FS that greatly reduce its cost, making it competitive to the state-of-the-art single-threaded PS algorithm while achieving a lower memory footprint. In addition, we show the drawbacks of a previously proposed hash-based partitioning approach for parallel join processing and suggest a domain-based partitioning approach that does not produce duplicate results. Within our approach we propose a novel breakdown of the partition join jobs into a small number of independent mini-join jobs with varying cost and manage
to avoid redundant comparisons. Finally, we show how these mini-joins can be scheduled in multiple CPU cores and propose an adaptive domain partitioning, aiming at load balancing. We include an experimental study that demonstrates the efficiency of our optimized FS and the scalability of our parallelization framework. |
Persistent Identifier | http://hdl.handle.net/10722/245814 |
ISSN | 2023 Impact Factor: 2.6 2023 SCImago Journal Rankings: 2.666 |
ISI Accession Number ID |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Bouros, P | - |
dc.contributor.author | Mamoulis, N | - |
dc.date.accessioned | 2017-09-18T02:17:22Z | - |
dc.date.available | 2017-09-18T02:17:22Z | - |
dc.date.issued | 2017 | - |
dc.identifier.citation | Proceedings of the VLDB Endowment (PVLDB), 2017, v. 10, p. 1346-1357 | - |
dc.identifier.issn | 2150-8097 | - |
dc.identifier.uri | http://hdl.handle.net/10722/245814 | - |
dc.description.abstract | The interval join is a basic operation that finds application in temporal, spatial, and uncertain databases. Although a number of centralized and distributed algorithms have been proposed for the efficient evaluation of interval joins, classic plane sweep approaches have not been considered at their full potential. A recent piece of related work proposes an optimized approach based on plane sweep (PS) for modern hardware, showing that it greatly outperforms previous work. However, this approach depends on the development of a complex data structure and its parallelization has not been adequately studied. In this paper, we explore the applicability of a largely ignored forward scan (FS) based plane sweep algorithm, which is extremely simple to implement. We propose two optimizations of FS that greatly reduce its cost, making it competitive to the state-of-the-art single-threaded PS algorithm while achieving a lower memory footprint. In addition, we show the drawbacks of a previously proposed hash-based partitioning approach for parallel join processing and suggest a domain-based partitioning approach that does not produce duplicate results. Within our approach we propose a novel breakdown of the partition join jobs into a small number of independent mini-join jobs with varying cost and manage to avoid redundant comparisons. Finally, we show how these mini-joins can be scheduled in multiple CPU cores and propose an adaptive domain partitioning, aiming at load balancing. We include an experimental study that demonstrates the efficiency of our optimized FS and the scalability of our parallelization framework. | - |
dc.language | eng | - |
dc.publisher | Very Large Data Base (VLDB) Endowment Inc. The Journal's web site is located at http://vldb.org/pvldb/index.html | - |
dc.relation.ispartof | Proceedings of the VLDB Endowment (PVLDB) | - |
dc.rights | Proceedings of the VLDB Endowment (PVLDB). Copyright © Very Large Data Base (VLDB) Endowment Inc. | - |
dc.title | Forward Scan based Plane Sweep Algorithm for Parallel Interval Joins | - |
dc.type | Article | - |
dc.identifier.email | Mamoulis, N: nikos@cs.hku.hk | - |
dc.identifier.authority | Mamoulis, N=rp00155 | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.14778/3137628.3137644 | - |
dc.identifier.scopus | eid_2-s2.0-85037047212 | - |
dc.identifier.hkuros | 276647 | - |
dc.identifier.volume | 10 | - |
dc.identifier.spage | 1346 | - |
dc.identifier.epage | 1357 | - |
dc.identifier.isi | WOS:000416492900016 | - |
dc.publisher.place | United States | - |
dc.identifier.issnl | 2150-8097 | - |