Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1016/j.jpdc.2020.02.003
- Scopus: eid_2-s2.0-85079824306
- WOS: WOS:000520948700007
Article: On-GPU thread-data remapping for nested branch divergence
Title | On-GPU thread-data remapping for nested branch divergence |
---|---|
Authors | LIN, H; Wang, CL |
Keywords | GPGPU; Branch divergence; SIMD; Race condition |
Issue Date | 2020 |
Publisher | Academic Press. The Journal's web site is located at http://www.elsevier.com/locate/jpdc |
Citation | Journal of Parallel and Distributed Computing, 2020, v. 139, p. 75-86 |
Abstract | Nested branches are common in applications with decision trees. The more layers in the branch nest, the larger the slowdown caused by nested branch divergence on GPUs. Since inner branches are impractical to evaluate on the host, thread-data remapping via GPU shared memory is so far the most suitable solution. However, existing solutions cannot handle inner branches directly because the behavior of the GPU barrier function is undefined when it is executed within branch statements. Race conditions must therefore be prevented without using the barrier function. Targeting nested divergence, we propose NeX, a nested extension scheme featuring an inter-thread protocol that supports sub-workgroup synchronization. We further exploit the on-the-fly nature of the Head-or-Tail (HoT) algorithm and propose HoT2, which offers enhanced flexibility in wavefront scheduling. Evaluated on four GPU models, including NVIDIA Volta and Turing, HoT2 proves to be more efficient. For benchmarks with branch nests up to five layers deep, NeX further boosts performance by up to 1.56x. |
Persistent Identifier | http://hdl.handle.net/10722/283321 |
ISSN | 0743-7315 (2023 Impact Factor: 3.4; 2023 SCImago Journal Rankings: 1.187) |
ISI Accession Number ID | WOS:000520948700007 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | LIN, H | - |
dc.contributor.author | Wang, CL | - |
dc.date.accessioned | 2020-06-22T02:55:00Z | - |
dc.date.available | 2020-06-22T02:55:00Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Journal of Parallel and Distributed Computing, 2020, v. 139, p. 75-86 | - |
dc.identifier.issn | 0743-7315 | - |
dc.identifier.uri | http://hdl.handle.net/10722/283321 | - |
dc.description.abstract | Nested branches are common in applications with decision trees. The more layers in the branch nest, the larger the slowdown caused by nested branch divergence on GPUs. Since inner branches are impractical to evaluate on the host, thread-data remapping via GPU shared memory is so far the most suitable solution. However, existing solutions cannot handle inner branches directly because the behavior of the GPU barrier function is undefined when it is executed within branch statements. Race conditions must therefore be prevented without using the barrier function. Targeting nested divergence, we propose NeX, a nested extension scheme featuring an inter-thread protocol that supports sub-workgroup synchronization. We further exploit the on-the-fly nature of the Head-or-Tail (HoT) algorithm and propose HoT2, which offers enhanced flexibility in wavefront scheduling. Evaluated on four GPU models, including NVIDIA Volta and Turing, HoT2 proves to be more efficient. For benchmarks with branch nests up to five layers deep, NeX further boosts performance by up to 1.56x. | - |
dc.language | eng | - |
dc.publisher | Academic Press. The Journal's web site is located at http://www.elsevier.com/locate/jpdc | - |
dc.relation.ispartof | Journal of Parallel and Distributed Computing | - |
dc.subject | GPGPU | - |
dc.subject | Branch divergence | - |
dc.subject | SIMD | - |
dc.subject | Race condition | - |
dc.title | On-GPU thread-data remapping for nested branch divergence | - |
dc.type | Article | - |
dc.identifier.email | Wang, CL: clwang@cs.hku.hk | - |
dc.identifier.authority | Wang, CL=rp00183 | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1016/j.jpdc.2020.02.003 | - |
dc.identifier.scopus | eid_2-s2.0-85079824306 | - |
dc.identifier.hkuros | 310355 | - |
dc.identifier.volume | 139 | - |
dc.identifier.spage | 75 | - |
dc.identifier.epage | 86 | - |
dc.identifier.isi | WOS:000520948700007 | - |
dc.publisher.place | United States | - |
dc.identifier.issnl | 0743-7315 | - |
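
The thread-data remapping idea named in the abstract can be illustrated with a small CPU-side simulation. This is only a sketch under stated assumptions, not the paper's HoT, HoT2, or NeX algorithm: it assumes a warp width of 32, uses a hypothetical two-way branch condition (`predicate`), and replaces the on-GPU shared-memory protocol with a plain stable partition. It shows how regrouping data items by branch outcome reduces the number of warps whose threads diverge.

```python
WARP_SIZE = 32  # assumption: NVIDIA-style warp width


def divergent_warps(outcomes, warp_size=WARP_SIZE):
    """Count warps whose threads do not all take the same branch path."""
    count = 0
    for start in range(0, len(outcomes), warp_size):
        warp = outcomes[start:start + warp_size]
        if len(set(warp)) > 1:
            count += 1
    return count


def remap(outcomes):
    """Build a thread-to-data index mapping that groups equal outcomes.

    Stable partition: indices whose branch is taken come first,
    the remaining indices follow. Every data index appears exactly once.
    """
    taken = [i for i, t in enumerate(outcomes) if t]
    not_taken = [i for i, t in enumerate(outcomes) if not t]
    return taken + not_taken


def predicate(x):
    """Hypothetical branch condition, chosen so every warp diverges."""
    return x % 2 == 0


data = list(range(128))                      # 128 items, i.e. 4 warps
outcomes = [predicate(x) for x in data]      # branch outcome per thread
mapping = remap(outcomes)                    # thread i now handles data[mapping[i]]
remapped_outcomes = [outcomes[i] for i in mapping]

before = divergent_warps(outcomes)
after = divergent_warps(remapped_outcomes)
print(f"divergent warps before: {before}, after: {after}")
```

With this input, all four warps diverge before remapping and none diverge afterwards. On a real GPU the mapping would have to be built cooperatively by the threads themselves, which is where the synchronization problem described in the abstract arises.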