Improving Inconspicuous Attributes Modeling for Person Search by Language
Niu, Kai (2,3); Huang, Tao (2,3); Huang, Linjiang (4); Wang, Liang (1); Zhang, Yanning (3)
Journal: IEEE TRANSACTIONS ON IMAGE PROCESSING
Year: 2023
Volume: 32, Pages: 3429-3441
Keywords: Person search by language ; cross-modal retrieval ; smart video surveillance
ISSN: 1057-7149
DOI: 10.1109/TIP.2023.3285426
Corresponding authors: Niu, Kai (kai.niu@nwpu.edu.cn) ; Huang, Linjiang (ljhuang524@gmail.com)
Abstract: Person search by language aims to retrieve the interested pedestrian images based on natural language sentences. Although great efforts have been made to address the cross-modal heterogeneity, most of the current solutions suffer from only capturing salient attributes while ignoring inconspicuous ones, being weak in distinguishing very similar pedestrians. In this work, we propose the Adaptive Salient Attribute Mask Network (ASAMN) to adaptively mask the salient attributes for cross-modal alignments, and therefore induce the model to simultaneously focus on inconspicuous attributes. Specifically, we consider the uni-modal and cross-modal relations for masking salient attributes in the Uni-modal Salient Attribute Mask (USAM) and Cross-modal Salient Attribute Mask (CSAM) modules, respectively. Then the Attribute Modeling Balance (AMB) module is presented to randomly select a proportion of masked features for cross-modal alignments, ensuring the balance of modeling capacity of both salient attributes and inconspicuous ones. Extensive experiments and analyses have been carried out to validate the effectiveness and generalization capacity of our proposed ASAMN method, and we have obtained the state-of-the-art retrieval performance on the widely-used CUHK-PEDES and ICFG-PEDES benchmarks.
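The two ideas named in the abstract, masking the most salient features and then using the masked version for only a random proportion of samples, can be illustrated with a toy sketch. This is not the authors' ASAMN implementation: the function names `mask_salient` and `select_masked`, the plain-list feature representation, the externally supplied saliency scores, and the zero-fill masking are all assumptions made purely for illustration.

```python
import random


def mask_salient(features, saliency, k):
    """Return a copy of `features` with the k highest-saliency rows zeroed out.

    Hypothetical stand-in for a salient attribute mask; the saliency scores
    are assumed to be given, not computed as in the paper.
    """
    ranked = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)
    salient = set(ranked[:k])
    return [[0.0] * len(row) if i in salient else list(row)
            for i, row in enumerate(features)]


def select_masked(originals, maskeds, proportion, rng):
    """Use the masked version for a random `proportion` of samples.

    Hypothetical stand-in for the balancing step: the remaining samples
    keep their original (unmasked) features.
    """
    n = len(originals)
    chosen = set(rng.sample(range(n), round(proportion * n)))
    mixed = [maskeds[i] if i in chosen else originals[i] for i in range(n)]
    return mixed, chosen


# Toy usage: three local features, the second one being the most salient.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
saliency = [0.1, 0.9, 0.5]
masked = mask_salient(features, saliency, k=1)
# masked == [[1.0, 2.0], [0.0, 0.0], [5.0, 6.0]]
```

Keeping part of the samples unmasked mirrors the abstract's stated goal of balancing the modeling of salient and inconspicuous attributes; the actual selection rule and proportion used in the paper are not given in this record.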
Funding projects: National Natural Science Foundation of China[62101451] ; National Natural Science Foundation of China[U19B2037] ; Guangdong Basic and Applied Basic Research Foundation[2023A1515011427] ; Fundamental Research Funds for the Central Universities[D5000210733]
WOS keywords: VIDEO ; IMAGE
WOS research areas: Computer Science ; Engineering
Language: English
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
WOS accession number: WOS:001017283200006
Funding organizations: National Natural Science Foundation of China ; Guangdong Basic and Applied Basic Research Foundation ; Fundamental Research Funds for the Central Universities
Content type: Journal article
Source URL: http://ir.ia.ac.cn/handle/173211/53678
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems
Author affiliations:
1. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
2. Northwestern Polytech Univ Shenzhen, Res & Dev Inst, Shenzhen 518063, Peoples R China
3. Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated Aerosp Ground Ocean Big D, Xian 710129, Peoples R China
4. Chinese Univ Hong Kong, Multimedia Lab, Hong Kong, Peoples R China
Recommended citation
GB/T 7714
Niu, Kai,Huang, Tao,Huang, Linjiang,et al. Improving Inconspicuous Attributes Modeling for Person Search by Language[J]. IEEE TRANSACTIONS ON IMAGE PROCESSING,2023,32:3429-3441.
APA Niu, Kai,Huang, Tao,Huang, Linjiang,Wang, Liang,&Zhang, Yanning.(2023).Improving Inconspicuous Attributes Modeling for Person Search by Language.IEEE TRANSACTIONS ON IMAGE PROCESSING,32,3429-3441.
MLA Niu, Kai,et al."Improving Inconspicuous Attributes Modeling for Person Search by Language".IEEE TRANSACTIONS ON IMAGE PROCESSING 32(2023):3429-3441.