Reinforcement learning for quadruped robot locomotion control

Shu, Zhengjie; 舒政杰

Appears in Collections:
- HKU Theses Online
- Mechanical Engineering: Theses

postgraduate thesis: Reinforcement learning for quadruped robot locomotion control

Title	Reinforcement learning for quadruped robot locomotion control
Authors	Shu, Zhengjie 舒政杰
Issue Date	2024
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Shu, Z. [舒政杰]. (2024). Reinforcement learning for quadruped robot locomotion control. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract	Quadruped robots are receiving growing attention from academia and industry because of their excellent dynamic locomotion capabilities. Increasingly complex tasks are being explored for quadruped robots, which raises performance requirements and brings new challenges. Traditional model predictive control has proven effective on quadruped robots, and reinforcement learning has also shown strong potential for flexible and robust quadruped control. This thesis therefore mainly discusses the control of quadruped robots in complex terrain using reinforcement learning methods based on model predictive control. The key idea of quadruped locomotion control based on model predictive control is to transform the prediction of multiple future steps into an optimization problem. The author analyzes the forces acting on the robot using a single rigid body model and derives its state equation and prediction equation. Using the prediction equation, the control problem is transformed into an optimization problem that is solved for the plantar (foot) reaction forces. Finally, the control command is obtained by combining this result with whole-body control. This solution is verified on a purpose-built physical quadruped robot. In contrast to model predictive control methods, reinforcement learning methods learn the optimal policy through interaction between the robot and the environment in simulation, maximizing the cumulative reward. This thesis uses the Isaac Gym platform, which exploits the parallel computing capability of the GPU to simulate thousands of robots at the same time. The proximal policy optimization algorithm is used to train the policy. Taking the robot model into account, the author adjusts the reinforcement learning hyperparameters and observations and defines new tasks with corresponding reward functions to obtain policies suited to complex and challenging terrains.
Because of the discrepancy between the physical robot and the simulation model, a multi-layer perceptron actuator network, trained with an unsupervised learning method, is used to reconstruct the motor model and address the sim-to-real problem. With the actuator network, the author further verifies the robustness and flexibility of the reinforcement learning controller on physical robots, in complex and challenging indoor and outdoor terrains and in an international competition.
Degree	Master of Philosophy
Subject	Robots - Control systems; Reinforcement learning
Dept/Program	Mechanical Engineering
Persistent Identifier	http://hdl.handle.net/10722/352661

DC Field	Value	Language
dc.contributor.author	Shu, Zhengjie	-
dc.contributor.author	舒政杰	-
dc.date.accessioned	2024-12-19T09:27:04Z	-
dc.date.available	2024-12-19T09:27:04Z	-
dc.date.issued	2024	-
dc.identifier.citation	Shu, Z. [舒政杰]. (2024). Reinforcement learning for quadruped robot locomotion control. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.	-
dc.identifier.uri	http://hdl.handle.net/10722/352661	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights, (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Robots - Control systems	-
dc.subject.lcsh	Reinforcement learning	-
dc.title	Reinforcement learning for quadruped robot locomotion control	-
dc.type	PG_Thesis	-
dc.description.thesisname	Master of Philosophy	-
dc.description.thesislevel	Master	-
dc.description.thesisdiscipline	Mechanical Engineering	-
dc.description.nature	published_or_final_version	-
dc.date.hkucongregation	2024	-
dc.identifier.mmsid	991044891408503414	-