Intrinsic Reward with Peer Incentives for Cooperative Multi-Agent Reinforcement Learning
Zhang TL(张天乐)1,2; Liu Z(刘振)1,2; Wu SG(吴士广)1,2; Pu ZQ(蒲志强)1,2; Yi JQ(易建强)1,2
2022
Conference date: 18-23 July 2022
Conference venue: Online
Abstract

In this paper, we propose a novel Intrinsic Reward method with Peer Incentives (IRPI) to promote direct inter-agent interactions and implicitly address the credit assignment problem in cooperative multi-agent reinforcement learning (MARL). IRPI builds mutual incentives between agents based on their causal effects on one another, enabling more advanced cooperation. Specifically, a new intrinsic reward mechanism is constructed that equips each agent with the ability to reward other agents according to the causal effect between them. The mechanism is realized as a neural network and learned from these inter-agent causal effects. Furthermore, counterfactual reasoning over the joint action-state value function is used to infer the causal effect between agents, and the quality of that effect is then assessed with the individual state value function. Simulation results on StarCraft II micromanagement tasks demonstrate that IRPI enhances cooperation among the RL agents and achieves better performance than several state-of-the-art MARL methods across various cooperative multi-agent tasks.
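The counterfactual reasoning step mentioned in the abstract can be illustrated with a minimal sketch. The function below, assuming a COMA-style counterfactual baseline (the paper's exact estimator is not given here), measures one agent's causal effect on the joint action-state value by comparing the value of the taken joint action against the average value when that agent's action is swapped for each alternative; the `joint_q` function, agent indexing, and uniform averaging are illustrative assumptions.

```python
def causal_effect(joint_q, state, actions, agent_i, n_actions):
    """Estimate agent_i's causal effect on the joint value via
    counterfactual reasoning: Q under the taken joint action minus
    the mean Q with agent_i's action replaced by each alternative.
    (Illustrative sketch; not the authors' exact formulation.)"""
    q_actual = joint_q(state, actions)
    counterfactuals = []
    for a in range(n_actions):
        cf_actions = list(actions)
        cf_actions[agent_i] = a  # replace only agent_i's action
        counterfactuals.append(joint_q(state, tuple(cf_actions)))
    baseline = sum(counterfactuals) / len(counterfactuals)
    return q_actual - baseline

# Toy example: two agents, a hypothetical Q that rewards coordination.
def toy_joint_q(state, actions):
    return 1.0 if actions[0] == actions[1] else 0.0

effect = causal_effect(toy_joint_q, state=None, actions=(1, 1),
                       agent_i=0, n_actions=2)
# A positive effect indicates agent 0's chosen action raised the
# joint value relative to its counterfactual alternatives.
```

In a peer-incentive scheme of this kind, such an effect estimate could serve as the signal from which one agent's intrinsic reward to another is learned.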

Proceedings publisher: IEEE
Language: English
Content type: Conference paper
Source URL: http://ir.ia.ac.cn/handle/173211/51961
Collection: Integrated Information System Research Center, Aircraft Intelligence Technology
Corresponding author: Liu Z(刘振)
Author affiliations:
1. Institute of Automation, Chinese Academy of Sciences
2. School of Artificial Intelligence, University of Chinese Academy of Sciences
Recommended citation (GB/T 7714):
Zhang TL, Liu Z, Wu SG, et al. Intrinsic Reward with Peer Incentives for Cooperative Multi-Agent Reinforcement Learning[C]. Online, 18-23 July 2022.
 
