Time-sequence Action-Decision and Navigation Through Stage Deep Reinforcement Learning in Complex Dynamic Environments
Huimu, Wang1,3; Tenghai, Qiu1; Zhen, Liu1; Zhiqiang, Pu1,3; Jianqiang, Yi1,3; Zhaoyang, Liu2
2019
Conference date: 2019.12
Conference location: Xiamen
Abstract

Navigation in complex dynamic environments is an attractive research task. Although most existing navigation algorithms complete navigation tasks effectively, they ignore the need for mission planning during navigation. To address this, this paper proposes a novel end-to-end two-stage deep reinforcement learning architecture for time-sequence navigation and action-decision in a dynamic environment with obstacles that move rapidly and randomly. During the first-stage training, a network that uses spatial and temporal information is designed to handle the navigation task, while a conventional recurrent fully connected network is adopted for the action-decision task. During the second-stage training, the two networks are integrated and trained online with dynamic entropy to obtain a stable policy for dynamic missions. Simulations demonstrate that navigation and action-decision in different environments can be completed effectively under this architecture.
 

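Language: English
Content type: Conference paper
Source URL: [http://ir.ia.ac.cn/handle/173211/44953]
Collection: Integrated Information System Research Center_Aircraft Intelligent Technology
Corresponding author: Tenghai, Qiu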
Author affiliations:
1. Institute of Automation, Chinese Academy of Sciences
2. Department of Automation, Tsinghua University
3. School of Artificial Intelligence, University of Chinese Academy of Sciences
Recommended citation
GB/T 7714
Huimu, Wang,Tenghai, Qiu,Zhen, Liu,et al. Time-sequence Action-Decision and Navigation Through Stage Deep Reinforcement Learning in Complex Dynamic Environments[C]. In: . Xiamen. 2019.12.