Article: A multi-agent reinforcement learning (MARL) framework for designing an optimal state-specific hybrid maintenance policy for a series k-out-of-n load-sharing system

Title: A multi-agent reinforcement learning (MARL) framework for designing an optimal state-specific hybrid maintenance policy for a series k-out-of-n load-sharing system
Authors: Zhao, Sangqi; Wei, Yian; Li, Yang; Cheng, Yao
Keywords: Markov decision process; Multi-agent reinforcement learning; Series k-out-of-n load-sharing system; State-specific hybrid maintenance policy
Issue Date: 1-Jan-2026
Publisher: Elsevier
Citation: Reliability Engineering & System Safety, 2026, v. 265
Abstract: The series k-out-of-n:G load-sharing structure is widely adopted in engineering. During operation, system components are subject to deterioration that causes system failures and shutdowns. Although maintenance reduces system failure-associated costs, it also requires system shutdown and incurs considerable costs. This calls for a maintenance policy that minimizes the overall long-term cost rate. When the components have continuous and load-dependent deterioration processes and the maintenance duration is non-negligible, the task becomes especially challenging. In this paper, we propose a Markov decision process (MDP)-based multi-agent reinforcement learning (MARL) framework to obtain an optimal state-specific hybrid maintenance policy that determines the maintenance timing and levels for all components holistically. First, we define the policy that dictates whether each component undergoes imperfect repair or replacement at periodic decision epochs. Second, we establish an MDP-based multi-agent framework to quantify the system's cost rate by defining the state and action spaces, modeling the stochastic transitions of components' dependent deterioration processes, and formulating a well-calibrated penalty function. Third, we customize a MARL algorithm that leverages neural networks to handle the large state space and integrates the Branching Dueling Network structure to decompose the high-dimensional action space, thereby improving scalability. A heuristic-enhanced penalty function is designed to avoid suboptimal policies. A power plant case study demonstrates the effectiveness of the proposed policy and underscores the importance of accounting for maintenance duration in policy design.
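As background for the system structure named in the title, here is a minimal sketch of the structure function of a series k-out-of-n:G system, i.e. a series arrangement of subsystems each of which works iff at least k of its n components work. The function names are illustrative, not from the paper:

```python
def kofn_works(component_states, k):
    # A k-out-of-n:G subsystem works iff at least k of its n components work.
    # component_states: iterable of booleans, True = component is working.
    return sum(component_states) >= k

def series_kofn_works(subsystems):
    # A series arrangement works iff every subsystem works.
    # subsystems: list of (component_states, k) pairs.
    return all(kofn_works(states, k) for states, k in subsystems)

# Example: two 2-out-of-3 subsystems in series.
up = series_kofn_works([([True, True, False], 2), ([True, True, True], 2)])
down = series_kofn_works([([True, False, False], 2), ([True, True, True], 2)])
```

Note the scale of the decision problem: assuming each component's maintenance action is chosen from a small set (e.g. do nothing, imperfect repair, or replacement, the hybrid-policy options described in the abstract), the joint action space for n components grows as 3^n, which is the high-dimensional action space that the Branching Dueling Network structure in the abstract is used to decompose.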
Persistent Identifier: http://hdl.handle.net/10722/362732
ISSN: 0951-8320
2023 Impact Factor: 9.4
2023 SCImago Journal Rankings: 2.028

 

Dublin Core record

dc.contributor.author: Zhao, Sangqi
dc.contributor.author: Wei, Yian
dc.contributor.author: Li, Yang
dc.contributor.author: Cheng, Yao
dc.date.accessioned: 2025-09-27T00:35:28Z
dc.date.available: 2025-09-27T00:35:28Z
dc.date.issued: 2026-01-01
dc.identifier.citation: Reliability Engineering & System Safety, 2026, v. 265
dc.identifier.issn: 0951-8320
dc.identifier.uri: http://hdl.handle.net/10722/362732
dc.description.abstract: The series k-out-of-n:G load-sharing structure is widely adopted in engineering. During operation, system components are subject to deterioration that causes system failures and shutdowns. Although maintenance reduces system failure-associated costs, it also requires system shutdown and incurs considerable costs. This calls for a maintenance policy that minimizes the overall long-term cost rate. When the components have continuous and load-dependent deterioration processes and the maintenance duration is non-negligible, the task becomes especially challenging. In this paper, we propose a Markov decision process (MDP)-based multi-agent reinforcement learning (MARL) framework to obtain an optimal state-specific hybrid maintenance policy that determines the maintenance timing and levels for all components holistically. First, we define the policy that dictates whether each component undergoes imperfect repair or replacement at periodic decision epochs. Second, we establish an MDP-based multi-agent framework to quantify the system's cost rate by defining the state and action spaces, modeling the stochastic transitions of components' dependent deterioration processes, and formulating a well-calibrated penalty function. Third, we customize a MARL algorithm that leverages neural networks to handle the large state space and integrates the Branching Dueling Network structure to decompose the high-dimensional action space, thereby improving scalability. A heuristic-enhanced penalty function is designed to avoid suboptimal policies. A power plant case study demonstrates the effectiveness of the proposed policy and underscores the importance of accounting for maintenance duration in policy design.
dc.language: eng
dc.publisher: Elsevier
dc.relation.ispartof: Reliability Engineering & System Safety
dc.subject: Markov decision process
dc.subject: Multi-agent reinforcement learning
dc.subject: Series k-out-of-n load-sharing system
dc.subject: State-specific hybrid maintenance policy
dc.title: A multi-agent reinforcement learning (MARL) framework for designing an optimal state-specific hybrid maintenance policy for a series k-out-of-n load-sharing system
dc.type: Article
dc.identifier.doi: 10.1016/j.ress.2025.111587
dc.identifier.scopus: eid_2-s2.0-105013846326
dc.identifier.volume: 265
dc.identifier.issnl: 0951-8320
