Article: Resource scaling effects on MPP performance: The STAP benchmark implications

Title: Resource scaling effects on MPP performance: The STAP benchmark implications
Authors: Hwang, K; Wang, C; Wang, CL; Xu, Z
Issue Date: 1999
Publisher: IEEE. The Journal's web site is located at http://www.computer.org/tpds
Citation: IEEE Transactions on Parallel and Distributed Systems, 1999, v. 10 n. 5, p. 509-527
Abstract: Presently, massively parallel processors (MPPs) are available only in a few commercial models. A sequence of three ASCI Teraflops MPPs has appeared before the new millennium. This paper evaluates six MPP systems through STAP benchmark experiments. The STAP is a radar signal processing benchmark which exploits regularly structured SPMD data parallelism. We reveal the resource scaling effects on MPP performance along orthogonal dimensions of machine size, processor speed, memory capacity, messaging latency, and network bandwidth. We show how to achieve balanced resource scaling against enlarged workload (problem size). Among three commercial MPPs, the IBM SP2 shows the highest speed and efficiency, attributed to its well-designed network with middleware support for single system image. The Cray T3D demonstrates a high network bandwidth with a good NUMA memory hierarchy. The Intel Paragon trails far behind due to slow processors used and excessive latency experienced in passing messages. Our analysis projects the lowest STAP speed on the ASCI Red, compared with the projected speed of two ASCI Blue machines. This is attributed to slow processors used in ASCI Red and the mismatch between its hardware and software. The Blue Pacific shows the highest potential to deliver scalable performance up to thousands of nodes. The Blue Mountain is designed to have the highest network bandwidth. Our results suggest a limit on the scalability of the distributed shared-memory (DSM) architecture adopted in Blue Mountain. The scaling model offers a quantitative method to match resource scaling with problem scaling to yield truly scalable performance. The model helps MPP designers optimize the processors, memory, network, and I/O subsystems of an MPP. For MPP users, the scaling results can be applied to partition a large workload for SPMD execution or to minimize the software overhead in collective communication or remote memory update operations. Finally, our scaling model is assessed to evaluate MPPs with benchmarks other than STAP.
Persistent Identifier: http://hdl.handle.net/10722/42822
ISSN: 1045-9219
2023 Impact Factor: 5.6
2023 SCImago Journal Rankings: 2.340
ISI Accession Number ID: WOS:000080635600007

| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Hwang, K | en_HK |
| dc.contributor.author | Wang, C | en_HK |
| dc.contributor.author | Wang, CL | en_HK |
| dc.contributor.author | Xu, Z | en_HK |
| dc.date.accessioned | 2007-03-23T04:32:50Z | - |
| dc.date.available | 2007-03-23T04:32:50Z | - |
| dc.date.issued | 1999 | en_HK |
| dc.identifier.citation | IEEE Transactions on Parallel and Distributed Systems, 1999, v. 10 n. 5, p. 509-527 | en_HK |
| dc.identifier.issn | 1045-9219 | en_HK |
| dc.identifier.uri | http://hdl.handle.net/10722/42822 | - |
| dc.description.abstract | Presently, massively parallel processors (MPPs) are available only in a few commercial models. A sequence of three ASCI Teraflops MPPs has appeared before the new millennium. This paper evaluates six MPP systems through STAP benchmark experiments. The STAP is a radar signal processing benchmark which exploits regularly structured SPMD data parallelism. We reveal the resource scaling effects on MPP performance along orthogonal dimensions of machine size, processor speed, memory capacity, messaging latency, and network bandwidth. We show how to achieve balanced resource scaling against enlarged workload (problem size). Among three commercial MPPs, the IBM SP2 shows the highest speed and efficiency, attributed to its well-designed network with middleware support for single system image. The Cray T3D demonstrates a high network bandwidth with a good NUMA memory hierarchy. The Intel Paragon trails far behind due to slow processors used and excessive latency experienced in passing messages. Our analysis projects the lowest STAP speed on the ASCI Red, compared with the projected speed of two ASCI Blue machines. This is attributed to slow processors used in ASCI Red and the mismatch between its hardware and software. The Blue Pacific shows the highest potential to deliver scalable performance up to thousands of nodes. The Blue Mountain is designed to have the highest network bandwidth. Our results suggest a limit on the scalability of the distributed shared-memory (DSM) architecture adopted in Blue Mountain. The scaling model offers a quantitative method to match resource scaling with problem scaling to yield truly scalable performance. The model helps MPP designers optimize the processors, memory, network, and I/O subsystems of an MPP. For MPP users, the scaling results can be applied to partition a large workload for SPMD execution or to minimize the software overhead in collective communication or remote memory update operations. Finally, our scaling model is assessed to evaluate MPPs with benchmarks other than STAP. | en_HK |
| dc.format.extent | 995649 bytes | - |
| dc.format.extent | 25600 bytes | - |
| dc.format.mimetype | application/pdf | - |
| dc.format.mimetype | application/msword | - |
| dc.language | eng | en_HK |
| dc.publisher | IEEE. The Journal's web site is located at http://www.computer.org/tpds | en_HK |
| dc.relation.ispartof | IEEE Transactions on Parallel and Distributed Systems | en_HK |
| dc.rights | ©1999 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | - |
| dc.title | Resource scaling effects on MPP performance: The STAP benchmark implications | en_HK |
| dc.type | Article | en_HK |
| dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1045-9219&volume=10&issue=5&spage=509&epage=527&date=1999&atitle=Resource+scaling+effects+on+MPP+performance:+the+STAP+benchmark+implications | en_HK |
| dc.identifier.email | Wang, CL: clwang@cs.hku.hk | en_HK |
| dc.identifier.authority | Wang, CL=rp00183 | en_HK |
| dc.description.nature | published_or_final_version | en_HK |
| dc.identifier.doi | 10.1109/71.770197 | en_HK |
| dc.identifier.scopus | eid_2-s2.0-0032638174 | en_HK |
| dc.identifier.hkuros | 46234 | - |
| dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-0032638174&selection=ref&src=s&origin=recordpage | en_HK |
| dc.identifier.volume | 10 | en_HK |
| dc.identifier.issue | 5 | en_HK |
| dc.identifier.spage | 509 | en_HK |
| dc.identifier.epage | 527 | en_HK |
| dc.identifier.isi | WOS:000080635600007 | - |
| dc.publisher.place | United States | en_HK |
| dc.identifier.scopusauthorid | Hwang, K=7402426691 | en_HK |
| dc.identifier.scopusauthorid | Wang, C=7501630962 | en_HK |
| dc.identifier.scopusauthorid | Wang, CL=7501646188 | en_HK |
| dc.identifier.scopusauthorid | Xu, Z=7405426306 | en_HK |
| dc.identifier.issnl | 1045-9219 | - |
