Links for fulltext (may require subscription):
- Publisher Website: 10.1109/TWC.2019.2935201
- Scopus: eid_2-s2.0-85079784199
Citations:
- Scopus: 0
Article: Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks
Title | Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks |
---|---|
Authors | Cui, Jingjing; Liu, Yuanwei; Nallanathan, Arumugam |
Keywords | Dynamic resource allocation; multi-agent reinforcement learning (MARL); stochastic games; UAV communications |
Issue Date | 2020 |
Citation | IEEE Transactions on Wireless Communications, 2020, v. 19, n. 2, p. 729-743 |
Abstract | Unmanned aerial vehicles (UAVs) are capable of serving as aerial base stations (BSs) that provide cost-effective, on-demand wireless communications. This article investigates dynamic resource allocation in multiple-UAV-enabled communication networks with the goal of maximizing long-term rewards. In particular, each UAV communicates with a ground user by automatically selecting its communicating user, power level, and subchannel without any information exchange among UAVs. To model the dynamics and uncertainty of the environment, we formulate the long-term resource allocation problem as a stochastic game for maximizing the expected rewards, where each UAV becomes a learning agent and each resource allocation solution corresponds to an action taken by the UAVs. We then develop a multi-agent reinforcement learning (MARL) framework in which each agent discovers its best strategy according to its local observations. More specifically, we propose an agent-independent method, in which all agents run the decision algorithm independently but share a common structure based on Q-learning. Finally, simulation results reveal that: 1) appropriate parameters for exploitation and exploration are capable of enhancing the performance of the proposed MARL-based resource allocation algorithm; 2) the proposed MARL algorithm provides acceptable performance compared to the case with complete information exchange among UAVs, thereby striking a good tradeoff between performance gains and information exchange overheads. |
Persistent Identifier | http://hdl.handle.net/10722/349404 |
ISSN | 1536-1276 (2023 Impact Factor: 8.9; 2023 SCImago Journal Rankings: 5.371) |
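The abstract describes an agent-independent method in which each UAV runs its own Q-learning loop over (user, power level, subchannel) actions, sharing only a common algorithmic structure and exchanging no information with other UAVs. The following minimal sketch illustrates that structure; it is not the paper's implementation, and the class name, hyperparameter values, and state/reward encodings are illustrative placeholders.

```python
import random
from collections import defaultdict

class IndependentQAgent:
    """One UAV agent: learns from its own local observations only,
    sharing a common Q-learning structure with the other agents
    (no inter-UAV information exchange). Hypothetical sketch."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1):
        self.actions = actions        # e.g. (user, power level, subchannel) tuples
        self.alpha = alpha            # learning rate (illustrative value)
        self.gamma = gamma            # discount factor for long-term reward
        self.epsilon = epsilon        # exploration probability
        self.q = defaultdict(float)   # Q[(state, action)], defaults to 0.0

    def act(self, state):
        # epsilon-greedy: explore with probability epsilon, else exploit
        # the best action known so far (exploitation/exploration tradeoff)
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # standard Q-learning update driven by the agent's local observation
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```

In a multi-UAV simulation, one such agent would be instantiated per UAV and each would call `act` and `update` independently every time step, which is what keeps the information-exchange overhead at zero.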
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Cui, Jingjing | - |
dc.contributor.author | Liu, Yuanwei | - |
dc.contributor.author | Nallanathan, Arumugam | - |
dc.date.accessioned | 2024-10-17T06:58:18Z | - |
dc.date.available | 2024-10-17T06:58:18Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | IEEE Transactions on Wireless Communications, 2020, v. 19, n. 2, p. 729-743 | - |
dc.identifier.issn | 1536-1276 | - |
dc.identifier.uri | http://hdl.handle.net/10722/349404 | - |
dc.description.abstract | Unmanned aerial vehicles (UAVs) are capable of serving as aerial base stations (BSs) that provide cost-effective, on-demand wireless communications. This article investigates dynamic resource allocation in multiple-UAV-enabled communication networks with the goal of maximizing long-term rewards. In particular, each UAV communicates with a ground user by automatically selecting its communicating user, power level, and subchannel without any information exchange among UAVs. To model the dynamics and uncertainty of the environment, we formulate the long-term resource allocation problem as a stochastic game for maximizing the expected rewards, where each UAV becomes a learning agent and each resource allocation solution corresponds to an action taken by the UAVs. We then develop a multi-agent reinforcement learning (MARL) framework in which each agent discovers its best strategy according to its local observations. More specifically, we propose an agent-independent method, in which all agents run the decision algorithm independently but share a common structure based on Q-learning. Finally, simulation results reveal that: 1) appropriate parameters for exploitation and exploration are capable of enhancing the performance of the proposed MARL-based resource allocation algorithm; 2) the proposed MARL algorithm provides acceptable performance compared to the case with complete information exchange among UAVs, thereby striking a good tradeoff between performance gains and information exchange overheads. | -
dc.language | eng | - |
dc.relation.ispartof | IEEE Transactions on Wireless Communications | - |
dc.subject | Dynamic resource allocation | - |
dc.subject | multi-agent reinforcement learning (MARL) | - |
dc.subject | stochastic games | - |
dc.subject | UAV communications | - |
dc.title | Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/TWC.2019.2935201 | - |
dc.identifier.scopus | eid_2-s2.0-85079784199 | - |
dc.identifier.volume | 19 | - |
dc.identifier.issue | 2 | - |
dc.identifier.spage | 729 | - |
dc.identifier.epage | 743 | - |
dc.identifier.eissn | 1558-2248 | - |