PSAQ-ViT V2: Toward Accurate and General Data-Free Quantization for Vision Transformers
Li, Zhikai (1,2); Chen, Mengjuan (2); Xiao, Junrui (1,2); Gu, Qingyi (2)
Journal: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
2023-08-14
Pages: 12
Keywords: Data-free quantization; model compression; patch similarity; quantized vision transformers (ViTs)
ISSN: 2162-237X
DOI: 10.1109/TNNLS.2023.3301007
Corresponding Author: Gu, Qingyi (qingyi.gu@ia.ac.cn)
Abstract: Data-free quantization can potentially address data privacy and security concerns in model compression and thus has been widely investigated. Recently, patch similarity aware data-free quantization for vision transformers (PSAQ-ViT) designs a relative value metric, patch similarity, to generate data from pretrained vision transformers (ViTs), achieving the first attempt at data-free quantization for ViTs. In this article, we propose PSAQ-ViT V2, a more accurate and general data-free quantization framework for ViTs, built on top of PSAQ-ViT. More specifically, following the patch similarity metric in PSAQ-ViT, we introduce an adaptive teacher-student strategy, which facilitates the constant cyclic evolution of the generated samples and the quantized model in a competitive and interactive fashion under the supervision of the full-precision (FP) model (teacher), thus significantly improving the accuracy of the quantized model. Moreover, without auxiliary category guidance, we employ task- and model-independent prior information, making the general-purpose scheme compatible with a broad range of vision tasks and models. Extensive experiments are conducted on various models for image classification, object detection, and semantic segmentation tasks, and PSAQ-ViT V2, with the naive quantization strategy and without access to real-world data, consistently achieves competitive results, showing potential as a powerful baseline for data-free quantization of ViTs. For instance, with Swin-S as the (backbone) model, 8-bit quantization reaches 82.13% top-1 accuracy on ImageNet, 50.9 box AP and 44.1 mask AP on COCO, and 47.2 mean Intersection over Union (mIoU) on ADE20K. We hope that the accurate and general PSAQ-ViT V2 can serve as a potential and practical solution in real-world applications involving sensitive data. Code is released and merged at: https://github.com/zkkli/PSAQ-ViT.
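The adaptive teacher-student cycle summarized in the abstract can be illustrated with a minimal PyTorch-style sketch. Everything below is an assumption for illustration, not the authors' released implementation: `fp_teacher` and `quant_student` are assumed to return `(logits, patch_features)`, and `patch_similarity_entropy` is a rough stand-in for the paper's patch-similarity metric. The repository linked in the abstract is the authoritative implementation.

```python
import torch
import torch.nn.functional as F

def patch_similarity_entropy(patch_feats):
    # Rough stand-in (assumption) for the PSAQ-ViT patch-similarity metric:
    # entropy of the pairwise cosine-similarity distribution over patch tokens.
    f = F.normalize(patch_feats, dim=-1)                # (B, N, D)
    p = F.softmax(f @ f.transpose(-2, -1), dim=-1)      # (B, N, N)
    return -(p * p.clamp_min(1e-12).log()).sum(dim=-1).mean()

def cyclic_dfq_round(images, fp_teacher, quant_student, img_opt, student_opt):
    # `images` is a leaf tensor with requires_grad=True registered in img_opt;
    # the FP teacher's weights are frozen, the quantized student's are trainable.

    # Phase 1: evolve the synthetic samples. Keep the teacher's patch-similarity
    # structure realistic while maximizing teacher/student disagreement, so the
    # batch stays challenging for the current quantized model.
    t_logits, t_patches = fp_teacher(images)
    s_logits, _ = quant_student(images)
    disagreement = F.kl_div(F.log_softmax(s_logits, dim=-1),
                            F.softmax(t_logits, dim=-1), reduction="batchmean")
    img_loss = -patch_similarity_entropy(t_patches) - disagreement
    img_opt.zero_grad()
    img_loss.backward()
    img_opt.step()

    # Phase 2: update the quantized student by distilling the FP teacher on the
    # freshly evolved samples (images detached so only weights are updated).
    with torch.no_grad():
        t_logits, _ = fp_teacher(images.detach())
    s_logits, _ = quant_student(images.detach())
    kd_loss = F.kl_div(F.log_softmax(s_logits, dim=-1),
                       F.softmax(t_logits, dim=-1), reduction="batchmean")
    student_opt.zero_grad()
    kd_loss.backward()
    student_opt.step()
    return img_loss.item(), kd_loss.item()
```

Each call first makes the synthetic batch harder for the current quantized model and then distills the FP teacher into the student on that batch, mirroring the "constant cyclic evolution ... in a competitive and interactive fashion" described in the abstract.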
Funding Projects: National Key Research and Development Program of China [2022ZD0119402]; National Natural Science Foundation of China [62276255]
WOS Research Areas: Computer Science; Engineering
Language: English
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
WOS Record Number: WOS:001051284800001
Funding Agencies: National Key Research and Development Program of China; National Natural Science Foundation of China
Content Type: Journal Article
Source URL: http://ir.ia.ac.cn/handle/173211/53951
Collection: Engineering Laboratory for Industrial Vision Intelligent Equipment, Chinese Academy of Sciences
Author Affiliations:
1. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
2. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Recommended Citation:
GB/T 7714: Li, Zhikai, Chen, Mengjuan, Xiao, Junrui, et al. PSAQ-ViT V2: Toward Accurate and General Data-Free Quantization for Vision Transformers[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023: 12.
APA: Li, Zhikai, Chen, Mengjuan, Xiao, Junrui, & Gu, Qingyi. (2023). PSAQ-ViT V2: Toward Accurate and General Data-Free Quantization for Vision Transformers. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 12.
MLA: Li, Zhikai, et al. "PSAQ-ViT V2: Toward Accurate and General Data-Free Quantization for Vision Transformers." IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023): 12.