Perspective-Adaptive Convolutions for Scene Parsing | |
Zhang, Rui1,2; Tang, Sheng1,2; Zhang, Yongdong1,2; Li, Jintao1,2; Yan, Shuicheng3 | |
Journal | IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE |
2020-04-01 | |
Volume | 42; Issue: 4; Pages: 909-924 |
Keywords | Shape; Standards; Strain; Proposals; Convolutional neural networks; Training; Task analysis; Scene parsing; convolutional neural networks; perspective-adaptive convolutions; context adaptive biases |
ISSN | 0162-8828 |
DOI | 10.1109/TPAMI.2018.2890637 |
Abstract | Many existing scene parsing methods adopt Convolutional Neural Networks with receptive fields of fixed sizes and shapes, which frequently results in inconsistent predictions of large objects and invisibility of small objects. To tackle this issue, we propose perspective-adaptive convolutions to acquire receptive fields of flexible sizes and shapes during scene parsing. Through adding a new perspective regression layer, we can dynamically infer the position-adaptive perspective coefficient vectors utilized to reshape the convolutional patches. Consequently, the receptive fields can be adjusted automatically according to the various sizes and perspective deformations of the objects in scene images. Our proposed convolutions are differentiable to learn the convolutional parameters and perspective coefficients in an end-to-end way without any extra training supervision of object sizes. Furthermore, considering that the standard convolutions lack contextual information and spatial dependencies, we propose a context adaptive bias to capture both local and global contextual information through average pooling on the local feature patches and global feature maps, followed by flexible attentive summing to the convolutional results. The attentive weights are position-adaptive and context-aware, and can be learned through adding an additional context regression layer. Experiments on Cityscapes and ADE20K datasets well demonstrate the effectiveness of the proposed methods. |
Funding | National Natural Science Foundation of China[61525206] ; National Natural Science Foundation of China[61572472] |
WOS Research Areas | Computer Science ; Engineering |
Language | English |
Publisher | IEEE COMPUTER SOC |
WOS Accession Number | WOS:000526541100009 |
Content Type | Journal article |
Source URL | [http://119.78.100.204/handle/2XEOYT63/14202] |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding Authors | Tang, Sheng; Zhang, Yongdong |
Author Affiliations | 1. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 2. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China; 3. AI Inst Qihoo 360, Beijing 100025, Peoples R China |
Recommended Citation (GB/T 7714) | Zhang, Rui,Tang, Sheng,Zhang, Yongdong,et al. Perspective-Adaptive Convolutions for Scene Parsing[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,2020,42(4):909-924. |
APA | Zhang, Rui,Tang, Sheng,Zhang, Yongdong,Li, Jintao,&Yan, Shuicheng.(2020).Perspective-Adaptive Convolutions for Scene Parsing.IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,42(4),909-924. |
MLA | Zhang, Rui,et al."Perspective-Adaptive Convolutions for Scene Parsing".IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 42.4(2020):909-924. |