Perspective-Adaptive Convolutions for Scene Parsing

doi:10.1109/TPAMI.2018.2890637


Authors	Zhang, Rui 1,2; Tang, Sheng 1,2; Zhang, Yongdong 1,2; Li, Jintao 1,2; Yan, Shuicheng 3
Journal	IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Publication date	2020-04-01
Volume	42 Issue: 4 Pages: 909-924
Keywords	Shape; Standards; Strain; Proposals; Convolutional neural networks; Training; Task analysis; Scene parsing; convolutional neural networks; perspective-adaptive convolutions; context adaptive biases
ISSN	0162-8828
DOI	10.1109/TPAMI.2018.2890637
Abstract	Many existing scene parsing methods adopt Convolutional Neural Networks with receptive fields of fixed sizes and shapes, which frequently results in inconsistent predictions of large objects and invisibility of small objects. To tackle this issue, we propose perspective-adaptive convolutions to acquire receptive fields of flexible sizes and shapes during scene parsing. Through adding a new perspective regression layer, we can dynamically infer the position-adaptive perspective coefficient vectors utilized to reshape the convolutional patches. Consequently, the receptive fields can be adjusted automatically according to the various sizes and perspective deformations of the objects in scene images. Our proposed convolutions are differentiable to learn the convolutional parameters and perspective coefficients in an end-to-end way without any extra training supervision of object sizes. Furthermore, considering that the standard convolutions lack contextual information and spatial dependencies, we propose a context adaptive bias to capture both local and global contextual information through average pooling on the local feature patches and global feature maps, followed by flexible attentive summing to the convolutional results. The attentive weights are position-adaptive and context-aware, and can be learned through adding an additional context regression layer. Experiments on Cityscapes and ADE20K datasets well demonstrate the effectiveness of the proposed methods.
Funding	National Natural Science Foundation of China[61525206] ; National Natural Science Foundation of China[61572472]
WOS research areas	Computer Science ; Engineering
Language	English
Publisher	IEEE COMPUTER SOC
WOS accession number	WOS:000526541100009
Content type	Journal article
Source URL	[http://119.78.100.204/handle/2XEOYT63/14202]
Collection	Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding authors	Tang, Sheng; Zhang, Yongdong
Author affiliations	1. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 2. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China; 3. AI Inst Qihoo 360, Beijing 100025, Peoples R China
Recommended citation (GB/T 7714)	Zhang, Rui,Tang, Sheng,Zhang, Yongdong,et al. Perspective-Adaptive Convolutions for Scene Parsing[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,2020,42(4):909-924.
APA	Zhang, Rui,Tang, Sheng,Zhang, Yongdong,Li, Jintao,&Yan, Shuicheng.(2020).Perspective-Adaptive Convolutions for Scene Parsing.IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,42(4),909-924.
MLA	Zhang, Rui,et al."Perspective-Adaptive Convolutions for Scene Parsing".IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 42.4(2020):909-924.
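
Illustrative sketch (not part of the record): the abstract describes the context adaptive bias as average pooling over local feature patches and over the global feature map, followed by position-adaptive attentive summing onto the convolutional result. The NumPy code below is one possible reading of that sentence, not the authors' implementation. The function names, the 3 x 3 local window, the softmax over the two context sources, and the 1x1 "context regression" weights w_ctx are all assumptions made for illustration.

```python
import numpy as np


def local_average(features, k=3):
    """Average-pool each channel over a k x k window (stride 1, edge padding, odd k)."""
    _, h, w = features.shape
    pad = k // 2
    padded = np.pad(features, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.zeros_like(features, dtype=float)
    # Sum the k*k shifted copies of the map, then divide to get the window mean.
    for dy in range(k):
        for dx in range(k):
            out += padded[:, dy:dy + h, dx:dx + w]
    return out / (k * k)


def context_adaptive_bias(features, conv_out, w_ctx, k=3):
    """Add attention-weighted local and global context to conv_out.

    features: (C, H, W) input feature map
    conv_out: (C, H, W) output of an ordinary convolution (same shape assumed)
    w_ctx:    (2, C) weights of a hypothetical 1x1 context regression layer
    """
    local_ctx = local_average(features, k)                  # (C, H, W)
    global_ctx = features.mean(axis=(1, 2), keepdims=True)  # (C, 1, 1)

    # Hypothetical context regression: one logit per context source per position.
    logits = np.einsum("sc,chw->shw", w_ctx, features)      # (2, H, W)
    logits = logits - logits.max(axis=0, keepdims=True)     # numerical stability
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=0, keepdims=True)           # softmax over sources (assumed)

    # Position-adaptive weighted sum of the two contexts, broadcast over channels.
    bias = attn[0] * local_ctx + attn[1] * global_ctx
    return conv_out + bias
```

On a constant feature map the local and global averages coincide, so under these assumptions the added bias equals that constant at every position regardless of w_ctx, which gives a quick sanity check.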