Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1109/TPDS.2019.2955935
- Scopus: eid_2-s2.0-85075918980
Article: Online Placement and Scaling of Geo-Distributed Machine Learning Jobs via Volume-Discounting Brokerage
Title | Online Placement and Scaling of Geo-Distributed Machine Learning Jobs via Volume-Discounting Brokerage |
---|---|
Authors | Li, X; Zhou, R; Jiao, L; Wu, C; Deng, Y; Li, Z |
Keywords | Geo-distributed machine learning; online placement; volume discount brokerage |
Issue Date | 2020 |
Publisher | Institute of Electrical and Electronics Engineers. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=71 |
Citation | IEEE Transactions on Parallel and Distributed Systems, 2020, v. 31 n. 4, p. 948-966 |
Abstract | Geo-distributed machine learning (ML) often uses large geo-dispersed data collections produced over time to train global models, without consolidating the data at a central site. In the parameter server architecture, “workers” and “parameter servers” for a geo-distributed ML job should be strategically deployed and adjusted on the fly, to allow easy access to the datasets and fast exchange of the model parameters at any time. Although many cloud platforms now provide volume discounts to encourage the usage of their ML resources, different geo-distributed ML jobs that run in the clouds often rent cloud resources separately, and thus rarely enjoy the benefit of discounts. We study an ML broker service that aggregates geo-distributed ML jobs into cloud data centers to obtain volume discounts, via dynamic online placement and scaling of the workers and parameter servers of individual jobs, for long-term cost minimization. To decide the number and the placement of workers and parameter servers, we propose an efficient online algorithm that first uses regularization to decompose the online problem into a series of one-shot optimization problems, each solvable at an individual time slot, and then rounds the fractional decisions to integers via a carefully designed dependent rounding method. We prove a parameterized-constant competitive ratio for our online algorithm, and conduct extensive simulation studies that show its close-to-offline-optimum practical performance in realistic settings. |
Persistent Identifier | http://hdl.handle.net/10722/301455 |
ISSN | 1045-9219 (2023 Impact Factor: 5.6; 2023 SCImago Journal Rankings: 2.340) |
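
The abstract mentions rounding fractional decisions to integers with a dependent rounding method. The record does not give the paper's actual procedure, so the following is only a minimal sketch of a generic pairwise dependent rounding scheme, with a hypothetical function name and a made-up input vector (for example, fractional worker counts across data centers). It may help illustrate the idea: each entry is rounded to its floor or ceiling, its expected value is preserved, and the total is preserved when it is an integer.

```python
import math
import random


def dependent_round(x, rng=None, eps=1e-9):
    """Round a fractional vector to integers with generic pairwise dependent rounding.

    Illustrative sketch only; this is NOT the method from the paper.
    Each entry ends at floor(x[i]) or ceil(x[i]), E[result[i]] == x[i],
    and sum(result) == sum(x) whenever sum(x) is an integer.
    """
    rng = rng or random.Random()
    x = list(x)

    def fractional_indices():
        # Entries whose fractional part is not (numerically) 0 or 1.
        return [i for i, v in enumerate(x) if eps < v - math.floor(v) < 1 - eps]

    idx = fractional_indices()
    while len(idx) >= 2:
        i, j = idx[0], idx[1]
        fi = x[i] - math.floor(x[i])
        fj = x[j] - math.floor(x[j])
        up = min(1 - fi, fj)    # amount to move from x[j] to x[i]
        down = min(fi, 1 - fj)  # amount to move from x[i] to x[j]
        # Choosing "up" with probability down / (up + down) keeps both
        # expectations unchanged; either move makes at least one entry integral.
        if rng.random() < down / (up + down):
            x[i] += up
            x[j] -= up
        else:
            x[i] -= down
            x[j] += down
        idx = fractional_indices()

    # If the total was not an integer, one fractional entry may remain:
    # round it up with probability equal to its fractional part.
    for k in idx:
        base = math.floor(x[k])
        frac = x[k] - base
        x[k] = base + (1 if rng.random() < frac else 0)

    return [round(v) for v in x]


if __name__ == "__main__":
    # Hypothetical fractional worker counts for four data centers (sum is 3).
    print(dependent_round([0.5, 0.5, 1.3, 0.7], rng=random.Random(0)))
```

Because the moves are paired, the rounded counts stay coupled: independent rounding of each entry could change the total, whereas here the total number of workers is kept fixed.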
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Li, X | - |
dc.contributor.author | Zhou, R | - |
dc.contributor.author | Jiao, L | - |
dc.contributor.author | Wu, C | - |
dc.contributor.author | Deng, Y | - |
dc.contributor.author | Li, Z | - |
dc.date.accessioned | 2021-07-27T08:11:20Z | - |
dc.date.available | 2021-07-27T08:11:20Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | IEEE Transactions on Parallel and Distributed Systems, 2020, v. 31 n. 4, p. 948-966 | - |
dc.identifier.issn | 1045-9219 | - |
dc.identifier.uri | http://hdl.handle.net/10722/301455 | - |
dc.description.abstract | Geo-distributed machine learning (ML) often uses large geo-dispersed data collections produced over time to train global models, without consolidating the data at a central site. In the parameter server architecture, “workers” and “parameter servers” for a geo-distributed ML job should be strategically deployed and adjusted on the fly, to allow easy access to the datasets and fast exchange of the model parameters at any time. Although many cloud platforms now provide volume discounts to encourage the usage of their ML resources, different geo-distributed ML jobs that run in the clouds often rent cloud resources separately, and thus rarely enjoy the benefit of discounts. We study an ML broker service that aggregates geo-distributed ML jobs into cloud data centers to obtain volume discounts, via dynamic online placement and scaling of the workers and parameter servers of individual jobs, for long-term cost minimization. To decide the number and the placement of workers and parameter servers, we propose an efficient online algorithm that first uses regularization to decompose the online problem into a series of one-shot optimization problems, each solvable at an individual time slot, and then rounds the fractional decisions to integers via a carefully designed dependent rounding method. We prove a parameterized-constant competitive ratio for our online algorithm, and conduct extensive simulation studies that show its close-to-offline-optimum practical performance in realistic settings. | - |
dc.language | eng | - |
dc.publisher | Institute of Electrical and Electronics Engineers. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=71 | - |
dc.relation.ispartof | IEEE Transactions on Parallel and Distributed Systems | - |
dc.rights | IEEE Transactions on Parallel and Distributed Systems. Copyright © Institute of Electrical and Electronics Engineers. | - |
dc.rights | ©20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | - |
dc.subject | Geo-distributed machine learning | - |
dc.subject | online placement | - |
dc.subject | volume discount brokerage | - |
dc.title | Online Placement and Scaling of Geo-Distributed Machine Learning Jobs via Volume-Discounting Brokerage | - |
dc.type | Article | - |
dc.identifier.email | Wu, C: cwu@cs.hku.hk | - |
dc.identifier.authority | Wu, C=rp01397 | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/TPDS.2019.2955935 | - |
dc.identifier.scopus | eid_2-s2.0-85075918980 | - |
dc.identifier.hkuros | 323507 | - |
dc.identifier.volume | 31 | - |
dc.identifier.issue | 4 | - |
dc.identifier.spage | 948 | - |
dc.identifier.epage | 966 | - |
dc.publisher.place | United States | - |