Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1109/TWC.2024.3404811
- Scopus: eid_2-s2.0-85194878031
- WOS: WOS:001338574900003

Article: Joint Batching and Scheduling for High-Throughput Multiuser Edge AI with Asynchronous Task Arrivals
| Title | Joint Batching and Scheduling for High-Throughput Multiuser Edge AI with Asynchronous Task Arrivals |
|---|---|
| Authors | Cang, Yihan; Chen, Ming; Huang, Kaibin |
| Keywords | batching; Computational modeling; Edge AI; edge inference; Inference algorithms; radio resource allocation; Resource management; scheduling; Servers; Task analysis; Throughput; Wireless communication |
| Issue Date | 1-Oct-2024 |
| Publisher | Institute of Electrical and Electronics Engineers |
| Citation | IEEE Transactions on Wireless Communications, 2024, v. 23, n. 10, p. 13782-13795 |
| Abstract | Edge artificial intelligence (AI) in sixth-generation networks will provide inference services at the network edge to enrich the capabilities of mobile devices and lengthen their battery lives. As a well-known technique in computing, batching can boost the computation throughput at an edge server by assembling multiple tasks into a batch that is fed into a pre-trained prediction model. This reduces the memory-access frequency and hence accelerates the execution of each task. In this paper, we study joint batching and (task) scheduling to maximise the throughput (i.e., the number of completed tasks) under the practical assumptions of heterogeneous task arrivals and deadlines. The design aims to optimise the number of batches, their starting time instants, and the task-batch association that determines batch sizes. The joint optimisation problem is complex due to the multiple coupled variables mentioned above and numerous constraints, including heterogeneous task arrivals and deadlines, the causality requirements on multi-task execution, and limited radio resources. Our approach to solving the formulated mixed-integer problem is to transform it into a convex problem via integer relaxation and ℓ0-norm approximation. This results in an efficient alternating optimisation algorithm for finding a close-to-optimal solution. Specifically, it iterates between solving two sub-problems: optimal task-batch association and optimal batch starting time. The former is a linear program whose solution can be found using a derived scheme of greedy task selection, while the solution of the latter is derived in closed form. In addition, we design an optimal algorithm that leverages spectrum holes, which are caused by fixed bandwidth allocation to devices and their asynchronous multi-batch task execution, to admit unscheduled tasks so as to further enhance throughput. Simulation results demonstrate that the proposed framework of joint batching and resource allocation can substantially enhance the throughput of multiuser edge AI compared with a number of benchmarking schemes. |
| Persistent Identifier | http://hdl.handle.net/10722/351196 |
| ISSN | 1536-1276 (2023 Impact Factor: 8.9; 2023 SCImago Journal Rankings: 5.371) |
| ISI Accession Number ID | WOS:001338574900003 |

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Cang, Yihan | - |
| dc.contributor.author | Chen, Ming | - |
| dc.contributor.author | Huang, Kaibin | - |
| dc.date.accessioned | 2024-11-13T00:36:16Z | - |
| dc.date.available | 2024-11-13T00:36:16Z | - |
| dc.date.issued | 2024-10-01 | - |
| dc.identifier.citation | IEEE Transactions on Wireless Communications, 2024, v. 23, n. 10, p. 13782-13795 | - |
| dc.identifier.issn | 1536-1276 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/351196 | - |
| dc.description.abstract | Edge artificial intelligence (AI) in sixth-generation networks will provide inference services at the network edge to enrich the capabilities of mobile devices and lengthen their battery lives. As a well-known technique in computing, batching can boost the computation throughput at an edge server by assembling multiple tasks into a batch that is fed into a pre-trained prediction model. This reduces the memory-access frequency and hence accelerates the execution of each task. In this paper, we study joint batching and (task) scheduling to maximise the throughput (i.e., the number of completed tasks) under the practical assumptions of heterogeneous task arrivals and deadlines. The design aims to optimise the number of batches, their starting time instants, and the task-batch association that determines batch sizes. The joint optimisation problem is complex due to the multiple coupled variables mentioned above and numerous constraints, including heterogeneous task arrivals and deadlines, the causality requirements on multi-task execution, and limited radio resources. Our approach to solving the formulated mixed-integer problem is to transform it into a convex problem via integer relaxation and ℓ0-norm approximation. This results in an efficient alternating optimisation algorithm for finding a close-to-optimal solution. Specifically, it iterates between solving two sub-problems: optimal task-batch association and optimal batch starting time. The former is a linear program whose solution can be found using a derived scheme of greedy task selection, while the solution of the latter is derived in closed form. In addition, we design an optimal algorithm that leverages spectrum holes, which are caused by fixed bandwidth allocation to devices and their asynchronous multi-batch task execution, to admit unscheduled tasks so as to further enhance throughput. Simulation results demonstrate that the proposed framework of joint batching and resource allocation can substantially enhance the throughput of multiuser edge AI compared with a number of benchmarking schemes. | - |
| dc.language | eng | - |
| dc.publisher | Institute of Electrical and Electronics Engineers | - |
| dc.relation.ispartof | IEEE Transactions on Wireless Communications | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.subject | batching | - |
| dc.subject | Computational modeling | - |
| dc.subject | Edge AI | - |
| dc.subject | edge inference | - |
| dc.subject | Inference algorithms | - |
| dc.subject | radio resource allocation | - |
| dc.subject | Resource management | - |
| dc.subject | scheduling | - |
| dc.subject | Servers | - |
| dc.subject | Task analysis | - |
| dc.subject | Throughput | - |
| dc.subject | Wireless communication | - |
| dc.title | Joint Batching and Scheduling for High-Throughput Multiuser Edge AI with Asynchronous Task Arrivals | - |
| dc.type | Article | - |
| dc.description.nature | published_or_final_version | - |
| dc.identifier.doi | 10.1109/TWC.2024.3404811 | - |
| dc.identifier.scopus | eid_2-s2.0-85194878031 | - |
| dc.identifier.volume | 23 | - |
| dc.identifier.issue | 10 | - |
| dc.identifier.spage | 13782 | - |
| dc.identifier.epage | 13795 | - |
| dc.identifier.eissn | 1558-2248 | - |
| dc.identifier.isi | WOS:001338574900003 | - |
| dc.identifier.issnl | 1536-1276 | - |
