Article: A multi-agent reinforcement learning (MARL) framework for designing an optimal state-specific hybrid maintenance policy for a series k-out-of-n load-sharing system
| Title | A multi-agent reinforcement learning (MARL) framework for designing an optimal state-specific hybrid maintenance policy for a series k-out-of-n load-sharing system |
|---|---|
| Authors | Zhao, Sangqi; Wei, Yian; Li, Yang; Cheng, Yao |
| Keywords | Markov decision process; Multi-agent reinforcement learning; Series k-out-of-n load-sharing system; State-specific hybrid maintenance policy |
| Issue Date | 1-Jan-2026 |
| Publisher | Elsevier |
| Citation | Reliability Engineering & System Safety, 2026, v. 265 |
| Abstract | The series k-out-of-n: G load-sharing structure is widely adopted in engineering. During operation, system components are subject to deterioration that causes system failures and shutdowns. Although maintenance reduces system failure-associated costs, it also requires system shutdown and incurs considerable costs. This calls for a maintenance policy that minimizes the overall long-term cost rate. When the components have continuous and load-dependent deterioration processes and the maintenance duration is non-negligible, the task becomes especially challenging. In this paper, we propose a Markov decision process (MDP)-based multi-agent reinforcement learning (MARL) framework to obtain an optimal state-specific hybrid maintenance policy that determines the maintenance timing and levels for all components holistically. First, we define the policy that dictates whether each component undergoes imperfect repair or replacement at periodic decision epochs. Second, we establish an MDP-based multi-agent framework to quantify the system's cost rate by defining the state and action spaces, modeling the stochastic transitions of components' dependent deterioration processes, and formulating a well-calibrated penalty function. Third, we customize a MARL algorithm that leverages neural networks to handle the large state space and integrates the Branching Dueling Network structure to decompose the high-dimensional action space, thereby improving scalability. A heuristic-enhanced penalty function is designed to avoid suboptimal policies. A power plant case study demonstrates the effectiveness of the proposed policy and underscores the importance of accounting for maintenance duration in policy design. |
| Persistent Identifier | http://hdl.handle.net/10722/362732 |
| ISSN | 0951-8320 |
| Journal Metrics | 2023 Impact Factor: 9.4; 2023 SCImago Journal Rankings: 2.028 |
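The objective named in the abstract, minimizing the overall long-term cost rate, is commonly formalized as an average-cost MDP criterion. A minimal sketch follows, with the notation (C_t, s_0, pi) assumed for illustration rather than taken from the paper:

```latex
% Long-run average cost rate under a policy \pi (standard
% average-cost MDP criterion; C_t, s_0, and \pi are assumed
% notation for illustration, not taken from the paper).
g(\pi) \;=\; \lim_{T \to \infty} \frac{1}{T}\,
  \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{T-1} C_t \,\middle|\, s_0 \right],
\qquad
\pi^{\ast} \in \arg\min_{\pi} g(\pi)
```

Here C_t would aggregate the per-epoch maintenance, shutdown, and penalty costs; the well-calibrated penalty function the abstract describes would enter the objective through this term.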
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Zhao, Sangqi | - |
| dc.contributor.author | Wei, Yian | - |
| dc.contributor.author | Li, Yang | - |
| dc.contributor.author | Cheng, Yao | - |
| dc.date.accessioned | 2025-09-27T00:35:28Z | - |
| dc.date.available | 2025-09-27T00:35:28Z | - |
| dc.date.issued | 2026-01-01 | - |
| dc.identifier.citation | Reliability Engineering & System Safety, 2026, v. 265 | - |
| dc.identifier.issn | 0951-8320 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/362732 | - |
| dc.description.abstract | The series k-out-of-n: G load-sharing structure is widely adopted in engineering. During operation, system components are subject to deterioration that causes system failures and shutdowns. Although maintenance reduces system failure-associated costs, it also requires system shutdown and incurs considerable costs. This calls for a maintenance policy that minimizes the overall long-term cost rate. When the components have continuous and load-dependent deterioration processes and the maintenance duration is non-negligible, the task becomes especially challenging. In this paper, we propose a Markov decision process (MDP)-based multi-agent reinforcement learning (MARL) framework to obtain an optimal state-specific hybrid maintenance policy that determines the maintenance timing and levels for all components holistically. First, we define the policy that dictates whether each component undergoes imperfect repair or replacement at periodic decision epochs. Second, we establish an MDP-based multi-agent framework to quantify the system's cost rate by defining the state and action spaces, modeling the stochastic transitions of components' dependent deterioration processes, and formulating a well-calibrated penalty function. Third, we customize a MARL algorithm that leverages neural networks to handle the large state space and integrates the Branching Dueling Network structure to decompose the high-dimensional action space, thereby improving scalability. A heuristic-enhanced penalty function is designed to avoid suboptimal policies. A power plant case study demonstrates the effectiveness of the proposed policy and underscores the importance of accounting for maintenance duration in policy design. | - |
| dc.language | eng | - |
| dc.publisher | Elsevier | - |
| dc.relation.ispartof | Reliability Engineering & System Safety | - |
| dc.subject | Markov decision process | - |
| dc.subject | Multi-agent reinforcement learning | - |
| dc.subject | Series k-out-of-n load-sharing system | - |
| dc.subject | State-specific hybrid maintenance policy | - |
| dc.title | A multi-agent reinforcement learning (MARL) framework for designing an optimal state-specific hybrid maintenance policy for a series k-out-of-n load-sharing system | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.1016/j.ress.2025.111587 | - |
| dc.identifier.scopus | eid_2-s2.0-105013846326 | - |
| dc.identifier.volume | 265 | - |
| dc.identifier.issnl | 0951-8320 | - |
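The abstract states that the customized MARL algorithm integrates the Branching Dueling Network structure to decompose the high-dimensional joint action space. Below is a minimal PyTorch sketch of such a branching dueling Q-network, with one action branch per component and a three-option hybrid-maintenance action set (do nothing, imperfect repair, replace) implied by the policy description. All dimensions, names, and the action-set size are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of a Branching Dueling Q-Network (BDQ) in PyTorch.
# STATE_DIM, N_COMPONENTS, and N_ACTIONS are assumed for illustration.
import torch
import torch.nn as nn

STATE_DIM = 8        # e.g., one deterioration level per component (assumed)
N_COMPONENTS = 8     # one action branch per component (assumed)
N_ACTIONS = 3        # {do nothing, imperfect repair, replace} (assumed)

class BranchingDuelingQNet(nn.Module):
    """Shared trunk, one shared state-value head, and one advantage head
    per action branch; Q_d(s, a) = V(s) + A_d(s, a) - mean_a' A_d(s, a')."""
    def __init__(self, state_dim=STATE_DIM, n_branches=N_COMPONENTS,
                 n_actions=N_ACTIONS, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.value = nn.Linear(hidden, 1)  # shared state value V(s)
        self.advantages = nn.ModuleList(
            nn.Linear(hidden, n_actions) for _ in range(n_branches)
        )

    def forward(self, state):
        h = self.trunk(state)
        v = self.value(h)                          # (batch, 1)
        qs = []
        for adv_head in self.advantages:
            a = adv_head(h)                        # (batch, n_actions)
            qs.append(v + a - a.mean(dim=1, keepdim=True))
        return torch.stack(qs, dim=1)              # (batch, branches, actions)

# Greedy joint maintenance action: argmax independently per branch.
net = BranchingDuelingQNet()
state = torch.randn(1, STATE_DIM)                  # dummy system state
action = net(state).argmax(dim=-1)                 # shape (1, N_COMPONENTS)
```

The design point of the branching structure is scalability: the joint action space grows as 3^n for n components, while the branched Q-heads keep the network output linear in n, and the greedy joint action reduces to a per-branch argmax.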
