Links for fulltext (may require subscription):
- Publisher Website: 10.1109/TWC.2019.2935201
- Scopus: eid_2-s2.0-85079784199
Citations:
- Scopus: 0
Article: Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks
Title | Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks |
---|---|
Authors | Cui, Jingjing; Liu, Yuanwei; Nallanathan, Arumugam |
Keywords | Dynamic resource allocation; multi-agent reinforcement learning (MARL); stochastic games; UAV communications |
Issue Date | 2020 |
Citation | IEEE Transactions on Wireless Communications, 2020, v. 19, n. 2, p. 729-743 |
Abstract | Unmanned aerial vehicles (UAVs) are capable of serving as aerial base stations (BSs) that provide cost-effective, on-demand wireless communications. This article investigates dynamic resource allocation in multiple-UAV-enabled communication networks with the goal of maximizing long-term rewards. In particular, each UAV communicates with a ground user by automatically selecting its communicating user, power level, and subchannel without any information exchange among UAVs. To model the dynamics and uncertainty of the environment, we formulate the long-term resource allocation problem as a stochastic game for maximizing the expected rewards, where each UAV becomes a learning agent and each resource allocation solution corresponds to an action taken by the UAVs. We then develop a multi-agent reinforcement learning (MARL) framework in which each agent discovers its best strategy according to its local observations. More specifically, we propose an agent-independent method, in which all agents run the decision algorithm independently but share a common structure based on Q-learning. Finally, simulation results reveal that: 1) appropriate parameters for exploitation and exploration are capable of enhancing the performance of the proposed MARL-based resource allocation algorithm; 2) the proposed MARL algorithm provides acceptable performance compared to the case with complete information exchange among UAVs, thereby striking a good tradeoff between performance gains and information exchange overheads. |
Persistent Identifier | http://hdl.handle.net/10722/349404 |
ISSN | 1536-1276 (2023 Impact Factor: 8.9; 2023 SCImago Journal Rankings: 5.371) |
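The abstract describes an agent-independent method in which each UAV runs its own Q-learning loop over (user, power level, subchannel) actions, sharing only a common algorithmic structure and exchanging no information with other UAVs. The following minimal sketch illustrates that structure; it is not the paper's implementation, and the class name, hyperparameter values, and state/reward encodings are illustrative placeholders.

```python
import random
from collections import defaultdict

class IndependentQAgent:
    """One UAV agent: learns from its own local observations only,
    sharing a common Q-learning structure with the other agents
    (no inter-UAV information exchange). Hypothetical sketch."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1):
        self.actions = actions        # e.g. (user, power level, subchannel) tuples
        self.alpha = alpha            # learning rate (illustrative value)
        self.gamma = gamma            # discount factor for long-term reward
        self.epsilon = epsilon        # exploration probability
        self.q = defaultdict(float)   # Q[(state, action)], defaults to 0.0

    def act(self, state):
        # epsilon-greedy: explore with probability epsilon, else exploit
        # the best action known so far (exploitation/exploration tradeoff)
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # standard Q-learning update driven by the agent's local observation
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```

In a multi-UAV simulation, one such agent would be instantiated per UAV and each would call `act` and `update` independently every time step, which is what keeps the information-exchange overhead at zero.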
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Cui, Jingjing | - |
dc.contributor.author | Liu, Yuanwei | - |
dc.contributor.author | Nallanathan, Arumugam | - |
dc.date.accessioned | 2024-10-17T06:58:18Z | - |
dc.date.available | 2024-10-17T06:58:18Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | IEEE Transactions on Wireless Communications, 2020, v. 19, n. 2, p. 729-743 | - |
dc.identifier.issn | 1536-1276 | - |
dc.identifier.uri | http://hdl.handle.net/10722/349404 | - |
dc.description.abstract | Unmanned aerial vehicles (UAVs) are capable of serving as aerial base stations (BSs) that provide cost-effective, on-demand wireless communications. This article investigates dynamic resource allocation in multiple-UAV-enabled communication networks with the goal of maximizing long-term rewards. In particular, each UAV communicates with a ground user by automatically selecting its communicating user, power level, and subchannel without any information exchange among UAVs. To model the dynamics and uncertainty of the environment, we formulate the long-term resource allocation problem as a stochastic game for maximizing the expected rewards, where each UAV becomes a learning agent and each resource allocation solution corresponds to an action taken by the UAVs. We then develop a multi-agent reinforcement learning (MARL) framework in which each agent discovers its best strategy according to its local observations. More specifically, we propose an agent-independent method, in which all agents run the decision algorithm independently but share a common structure based on Q-learning. Finally, simulation results reveal that: 1) appropriate parameters for exploitation and exploration are capable of enhancing the performance of the proposed MARL-based resource allocation algorithm; 2) the proposed MARL algorithm provides acceptable performance compared to the case with complete information exchange among UAVs, thereby striking a good tradeoff between performance gains and information exchange overheads. | -
dc.language | eng | - |
dc.relation.ispartof | IEEE Transactions on Wireless Communications | - |
dc.subject | Dynamic resource allocation | - |
dc.subject | multi-agent reinforcement learning (MARL) | - |
dc.subject | stochastic games | - |
dc.subject | UAV communications | - |
dc.title | Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/TWC.2019.2935201 | - |
dc.identifier.scopus | eid_2-s2.0-85079784199 | - |
dc.identifier.volume | 19 | - |
dc.identifier.issue | 2 | - |
dc.identifier.spage | 729 | - |
dc.identifier.epage | 743 | - |
dc.identifier.eissn | 1558-2248 | - |