Stacking More Linear Operations with Orthogonal Regularization to Learn Better
Xu WX (许伟翔)1,2; Cheng J (程健)1,2
2022-03
Conference Date | 2022-07
Conference Venue | Online conference
Abstract | Improving the generalization of CNN models is a long-standing problem in the deep learning community. This paper presents a method that strengthens CNN models by stacking linear convolution operations during training while adding no parameters or FLOPs at inference. We show that overparameterization with appropriate regularization can lead to a smoother optimization landscape and better performance. Concretely, we propose to add a 1×1 convolutional layer before and another after the original k×k convolutional layer, with no non-linear activations between them. In addition, Quasi-Orthogonal Regularization is proposed to keep the added 1×1 filters close to orthogonal matrices. After training, the two 1×1 layers can be fused into the original k×k layer without changing the network architecture, leaving no extra computation at inference, i.e., the method is parameter/FLOPs-free.
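To make the mechanism concrete, below is a minimal PyTorch sketch of the training-time block and its inference-time fusion. This is an illustration under stated assumptions, not the authors' implementation: the class and method names (StackedLinearConv, orthogonal_penalty, fuse) are hypothetical, and the soft penalty ||WWᵀ − I||²_F is a standard orthogonality regularizer assumed here in place of the paper's exact Quasi-Orthogonal Regularization, which may differ.

```python
import torch
import torch.nn as nn

class StackedLinearConv(nn.Module):
    """Train-time block: 1x1 -> kxk -> 1x1 with no activations in between.
    Illustrative sketch only; names and details are not from the paper."""

    def __init__(self, in_ch, out_ch, k=3, padding=1):
        super().__init__()
        self.pre = nn.Conv2d(in_ch, in_ch, 1, bias=False)     # added 1x1 before
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=padding, bias=False)
        self.post = nn.Conv2d(out_ch, out_ch, 1, bias=False)  # added 1x1 after

    def forward(self, x):
        return self.post(self.conv(self.pre(x)))

    def orthogonal_penalty(self):
        # Soft orthogonality penalty ||W W^T - I||_F^2 on the two 1x1 filters;
        # an assumed stand-in for the paper's Quasi-Orthogonal Regularization.
        loss = 0.0
        for layer in (self.pre, self.post):
            w = layer.weight.flatten(1)                  # (C, C) matrix
            eye = torch.eye(w.size(0), device=w.device)
            loss = loss + ((w @ w.t() - eye) ** 2).sum()
        return loss

    @torch.no_grad()
    def fuse(self):
        # Collapse the three linear convs into one kxk conv for inference:
        # fused[o,i,:,:] = sum_{m,n} post[o,m] * conv[m,n,:,:] * pre[n,i]
        w_a = self.pre.weight.flatten(1)    # (C_in, C_in)
        w_k = self.conv.weight              # (C_out, C_in, k, k)
        w_b = self.post.weight.flatten(1)   # (C_out, C_out)
        fused_w = torch.einsum('om,mnhw,ni->oihw', w_b, w_k, w_a)
        fused = nn.Conv2d(self.pre.in_channels, self.post.out_channels,
                          self.conv.kernel_size, padding=self.conv.padding,
                          bias=False)
        fused.weight.copy_(fused_w)
        return fused

# Sanity check: the fused kxk conv computes the same function as the stack.
block = StackedLinearConv(16, 32)
x = torch.randn(1, 16, 8, 8)
print(torch.allclose(block(x), block.fuse()(x), atol=1e-5))
```

Because all three layers are linear, the fused k×k convolution is mathematically identical to the trained stack, which is why the method leaves no extra parameters or FLOPs at inference.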
Language | English
Content Type | Conference Paper
Source URL | http://ir.ia.ac.cn/handle/173211/52091
Research Topic | Brain-Inspired Chips and Systems Research
Author Affiliations | 1. University of Chinese Academy of Sciences; 2. Institute of Automation, Chinese Academy of Sciences
Recommended Citation (GB/T 7714) | Xu WX, Cheng J. Stacking More Linear Operations with Orthogonal Regularization to Learn Better[C]. Online conference, 2022-07.