PSAQ-ViT V2: Toward Accurate and General Data-Free Quantization for Vision Transformers
Authors | Li, Zhikai1,2; Chen, Mengjuan2; Xiao, Junrui1,2; Gu, Qingyi2
Journal | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
Publication Date | 2023-08-14
Pages | 12
Keywords | Data-free quantization; model compression; patch similarity; quantized vision transformers (ViTs)
ISSN | 2162-237X
DOI | 10.1109/TNNLS.2023.3301007 |
Corresponding Author | Gu, Qingyi (qingyi.gu@ia.ac.cn)
Abstract | Data-free quantization can potentially address data privacy and security concerns in model compression and thus has been widely investigated. Recently, patch similarity aware data-free quantization for vision transformers (PSAQ-ViT) designed a relative value metric, patch similarity, to generate data from pretrained vision transformers (ViTs), achieving the first attempt at data-free quantization for ViTs. In this article, we propose PSAQ-ViT V2, a more accurate and general data-free quantization framework for ViTs, built on top of PSAQ-ViT. More specifically, following the patch similarity metric in PSAQ-ViT, we introduce an adaptive teacher-student strategy, which facilitates the constant cyclic evolution of the generated samples and the quantized model in a competitive and interactive fashion under the supervision of the full-precision (FP) model (teacher), thus significantly improving the accuracy of the quantized model. Moreover, without the auxiliary category guidance, we employ task- and model-independent prior information, making the general-purpose scheme compatible with a broad range of vision tasks and models. Extensive experiments are conducted on various models for image classification, object detection, and semantic segmentation tasks, and PSAQ-ViT V2, with a naive quantization strategy and without access to real-world data, consistently achieves competitive results, showing potential as a powerful baseline for data-free quantization of ViTs. For instance, with Swin-S as the (backbone) model, 8-bit quantization reaches 82.13% top-1 accuracy on ImageNet, 50.9 box AP and 44.1 mask AP on COCO, and 47.2 mean Intersection over Union (mIoU) on ADE20K. We hope that the accurate and general PSAQ-ViT V2 can serve as a practical solution in real-world applications involving sensitive data. Code is available at: https://github.com/zkkli/PSAQ-ViT.
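The abstract notes that results are obtained "with a naive quantization strategy," i.e., plain uniform quantization without learned rounding or mixed precision. The sketch below is not the paper's implementation; it is a minimal, hedged illustration of asymmetric min-max uniform quantization (the function name and NumPy-based setup are assumptions for demonstration only):

```python
import numpy as np

def uniform_quantize(x, num_bits=8):
    """Naive asymmetric min-max uniform quantization (fake-quantize):
    map float values to integers in [0, 2^b - 1], then dequantize."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)        # step size from the value range
    zero_point = round(qmin - x.min() / scale)         # integer offset so x.min() maps near qmin
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale                    # dequantized ("fake-quantized") tensor

# Example: quantizing a random weight tensor to 8 bits keeps values
# within about half a quantization step of the original.
w = np.random.randn(16, 16).astype(np.float32)
w_q = uniform_quantize(w, num_bits=8)
```

Post-training schemes like the one the paper targets calibrate `scale` and `zero_point` per tensor (or per channel) from sample statistics; the data-free setting replaces real calibration samples with images synthesized from the FP model itself.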
Funding Projects | National Key Research and Development Program of China [2022ZD0119402]; National Natural Science Foundation of China [62276255]
WOS Research Areas | Computer Science; Engineering
Language | English
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
WOS Accession Number | WOS:001051284800001
Funding Organizations | National Key Research and Development Program of China; National Natural Science Foundation of China
Content Type | Journal Article
Source URL | [http://ir.ia.ac.cn/handle/173211/53951]
Collection | Engineering Laboratory for Industrial Vision and Intelligent Equipment, Chinese Academy of Sciences
Author Affiliations | 1. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China; 2. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Recommended Citation (GB/T 7714) | Li, Zhikai, Chen, Mengjuan, Xiao, Junrui, et al. PSAQ-ViT V2: Toward Accurate and General Data-Free Quantization for Vision Transformers[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023: 12.
APA | Li, Zhikai, Chen, Mengjuan, Xiao, Junrui, & Gu, Qingyi. (2023). PSAQ-ViT V2: Toward Accurate and General Data-Free Quantization for Vision Transformers. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 12.
MLA | Li, Zhikai, et al. "PSAQ-ViT V2: Toward Accurate and General Data-Free Quantization for Vision Transformers". IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023): 12.