Conference Paper: Double action Q-learning for obstacle avoidance in a dynamically changing environment

Title: Double action Q-learning for obstacle avoidance in a dynamically changing environment
Authors: Ngai, DCK; Yung, NHC
Keywords: Obstacle avoidance; Q-learning; Reinforcement learning; Temporal differences
Issue Date: 2005
Publisher: IEEE
Citation: IEEE Intelligent Vehicles Symposium, Proceedings, 2005, v. 2005, p. 211-216
Abstract: In this paper, we propose a new method for solving the reinforcement learning problem in a dynamically changing environment, such as vehicle navigation, in which the Markov Decision Process used in traditional reinforcement learning is modified so that the response of the environment is taken into consideration when determining the agent's next state. This is achieved by changing the action-value function to handle three parameters at a time, namely the current state, the action taken by the agent, and the action taken by the environment. As it considers the actions of both the agent and the environment, the method is termed "Double Action". The proposed method is implemented on top of Q-learning, with the update rule modified to handle all three parameters. Preliminary results show that the proposed method reduces the sum of (negative) rewards by 89.5% compared with the traditional method. Apart from that, it also lowers the total number of collisions and the mean number of steps per episode by 89.5% and 15.5% respectively. © 2005 IEEE.
Persistent Identifier: http://hdl.handle.net/10722/45776
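The abstract describes an action-value function indexed by three parameters: the current state, the agent's action, and the environment's action. A minimal tabular sketch of what such a "double action" Q-update could look like is given below; the exact update rule, the way the environment's next action is handled (a pessimistic min is assumed here), and the constants are illustrative assumptions, not taken from the paper:

```python
from collections import defaultdict

# Tabular "double action" Q-values, indexed by (state, agent_action,
# env_action) rather than the usual (state, action). Sketch only.
Q = defaultdict(float)

ALPHA = 0.1   # learning rate (assumed value)
GAMMA = 0.9   # discount factor (assumed value)

def double_action_update(Q, s, a_agent, a_env, reward, s_next,
                         agent_actions, env_actions,
                         alpha=ALPHA, gamma=GAMMA):
    """One possible form of the modified Q-learning update.

    Bootstraps on the agent's best action in s_next while taking the
    worst case over the environment's next action -- a pessimistic
    choice assumed here for illustration; the paper's rule may differ.
    """
    target = reward + gamma * max(
        min(Q[(s_next, a, e)] for e in env_actions)
        for a in agent_actions
    )
    Q[(s, a_agent, a_env)] += alpha * (target - Q[(s, a_agent, a_env)])
```

With an all-zero table, a transition with reward -1 moves the visited entry by `alpha * (-1)`, i.e. to -0.1, while all other entries stay at zero.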
DC Field: Value [Language]
dc.contributor.author: Ngai, DCK [en_HK]
dc.contributor.author: Yung, NHC [en_HK]
dc.date.accessioned: 2007-10-30T06:35:13Z
dc.date.available: 2007-10-30T06:35:13Z
dc.date.issued: 2005 [en_HK]
dc.identifier.citation: IEEE Intelligent Vehicles Symposium, Proceedings, 2005, v. 2005, p. 211-216 [en_HK]
dc.identifier.uri: http://hdl.handle.net/10722/45776
dc.description.abstract: In this paper, we propose a new method for solving the reinforcement learning problem in a dynamically changing environment, such as vehicle navigation, in which the Markov Decision Process used in traditional reinforcement learning is modified so that the response of the environment is taken into consideration when determining the agent's next state. This is achieved by changing the action-value function to handle three parameters at a time, namely the current state, the action taken by the agent, and the action taken by the environment. As it considers the actions of both the agent and the environment, the method is termed "Double Action". The proposed method is implemented on top of Q-learning, with the update rule modified to handle all three parameters. Preliminary results show that the proposed method reduces the sum of (negative) rewards by 89.5% compared with the traditional method. Apart from that, it also lowers the total number of collisions and the mean number of steps per episode by 89.5% and 15.5% respectively. © 2005 IEEE. [en_HK]
dc.format.extent: 1037213 bytes
dc.format.extent: 10863 bytes
dc.format.mimetype: application/pdf
dc.format.mimetype: text/plain
dc.language: eng [en_HK]
dc.publisher: IEEE [en_HK]
dc.relation.ispartof: IEEE Intelligent Vehicles Symposium, Proceedings [en_HK]
dc.rights: Creative Commons: Attribution 3.0 Hong Kong License
dc.rights: ©2005 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. [en_HK]
dc.subject: Obstacle avoidance [en_HK]
dc.subject: Q-learning [en_HK]
dc.subject: Reinforcement learning [en_HK]
dc.subject: Temporal differences [en_HK]
dc.title: Double action Q-learning for obstacle avoidance in a dynamically changing environment [en_HK]
dc.type: Conference_Paper [en_HK]
dc.identifier.email: Yung, NHC: nyung@eee.hku.hk [en_HK]
dc.identifier.authority: Yung, NHC=rp00226 [en_HK]
dc.description.nature: published_or_final_version [en_HK]
dc.identifier.doi: 10.1109/IVS.2005.1505104 [en_HK]
dc.identifier.scopus: eid_2-s2.0-27944435493 [en_HK]
dc.identifier.hkuros: 102239
dc.relation.references: http://www.scopus.com/mlt/select.url?eid=2-s2.0-27944435493&selection=ref&src=s&origin=recordpage [en_HK]
dc.identifier.volume: 2005 [en_HK]
dc.identifier.spage: 211 [en_HK]
dc.identifier.epage: 216 [en_HK]
dc.identifier.scopusauthorid: Ngai, DCK=9332358900 [en_HK]
dc.identifier.scopusauthorid: Yung, NHC=7003473369 [en_HK]
