Conference Paper: SMapReduce: optimising resource allocation by managing working slots at runtime

Title: SMapReduce: optimising resource allocation by managing working slots at runtime
Authors: Liang, F; Lau, FCM
Keywords: Resource Management, MapReduce, Hadoop, YARN, Performance Modeling
Issue Date: 2015
Publisher: IEEE Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000530
Citation: The 29th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2015), Hyderabad, India, 25-29 May 2015. In IPDPS Proceedings, 2015, p. 281-290
Abstract: Hadoop version 1 (HadoopV1) and version 2 (YARN) manage the resources in a distributed system in different ways. HadoopV1 executes MapReduce tasks in working slots that are statically configured, whereas YARN uses a set of task containers to encapsulate its memory and CPU resources. However, neither of them considers the runtime performance of the cluster when deciding the proper number of concurrent tasks to run on each node to achieve the optimal throughput. In order to gain higher performance, the users of Hadoop usually need to use their experience to carefully configure the resources of the cluster and the resources needed by their jobs. But as the workload in the cluster is constantly changing, such a manual configuration rarely leads to optimized performance. In this paper, we study the MapReduce job performance in HadoopV1 and YARN with different resource configurations, and model the cluster throughput in terms of the resource capacity of the cluster. We propose SMapReduce, which can dynamically manage a proper number of concurrent tasks running on each node. SMapReduce can gain the maximum job throughput by considering the thrashing phenomenon and the balancing between map and reduce tasks. Evaluation results show that SMapReduce can yield significant performance speedup compared to both HadoopV1 and YARN for various MapReduce workloads.
Description: Open Access
Persistent Identifier: http://hdl.handle.net/10722/219225
ISSN: 1530-2075

DC Field | Value | Language
--- | --- | ---
dc.contributor.author | Liang, F | -
dc.contributor.author | Lau, FCM | -
dc.date.accessioned | 2015-09-18T07:18:11Z | -
dc.date.available | 2015-09-18T07:18:11Z | -
dc.date.issued | 2015 | -
dc.identifier.citation | The 29th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2015), Hyderabad, India, 25-29 May 2015. In IPDPS Proceedings, 2015, p. 281-290 | -
dc.identifier.issn | 1530-2075 | -
dc.identifier.uri | http://hdl.handle.net/10722/219225 | -
dc.description | Open Access | -
dc.description.abstract | Hadoop version 1 (HadoopV1) and version 2 (YARN) manage the resources in a distributed system in different ways. HadoopV1 executes MapReduce tasks in working slots that are statically configured, whereas YARN uses a set of task containers to encapsulate its memory and CPU resources. However, neither of them considers the runtime performance of the cluster when deciding the proper number of concurrent tasks to run on each node to achieve the optimal throughput. In order to gain higher performance, the users of Hadoop usually need to use their experience to carefully configure the resources of the cluster and the resources needed by their jobs. But as the workload in the cluster is constantly changing, such a manual configuration rarely leads to optimized performance. In this paper, we study the MapReduce job performance in HadoopV1 and YARN with different resource configurations, and model the cluster throughput in terms of the resource capacity of the cluster. We propose SMapReduce, which can dynamically manage a proper number of concurrent tasks running on each node. SMapReduce can gain the maximum job throughput by considering the thrashing phenomenon and the balancing between map and reduce tasks. Evaluation results show that SMapReduce can yield significant performance speedup compared to both HadoopV1 and YARN for various MapReduce workloads. | -
dc.language | eng | -
dc.publisher | IEEE Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000530 | -
dc.relation.ispartof | IPDPS Proceedings | -
dc.rights | IPDPS Proceedings. Copyright © IEEE Computer Society. | -
dc.rights | ©2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | -
dc.subject | Resource Management | -
dc.subject | MapReduce | -
dc.subject | Hadoop | -
dc.subject | YARN | -
dc.subject | Performance Modeling | -
dc.title | SMapReduce: optimising resource allocation by managing working slots at runtime | -
dc.type | Conference_Paper | -
dc.identifier.email | Lau, FCM: fcmlau@cs.hku.hk | -
dc.identifier.authority | Lau, FCM=rp00221 | -
dc.description.nature | link_to_OA_fulltext | -
dc.identifier.doi | 10.1109/IPDPS.2015.17 | -
dc.identifier.hkuros | 253762 | -
dc.identifier.spage | 281 | -
dc.identifier.epage | 290 | -
dc.publisher.place | United States | -
dc.customcontrol.immutable | sml 151106 | -
