Prediction of the transcription factor binding sites with meta-learning
Jing, Fang1; Zhang, Shao-Wu1,5; Zhang, Shihua2,3,4,6
Journal: METHODS
Date: 2022-07-01
Volume: 203; Pages: 207-213
Keywords: Convolution neural network; Transcription factor binding sites; Meta learning; Noisy labels data
ISSN: 1046-2023
DOI: 10.1016/j.ymeth.2022.04.010
Abstract: With the accumulation of ChIP-seq data, convolutional neural network (CNN)-based methods have been proposed for predicting transcription factor binding sites (TFBSs). However, biological experimental data are noisy, yet they are often treated as ground truth for both training and testing. In particular, existing classification methods ignore the false positives and false negatives caused by errors in the peak-calling stage, and can therefore easily overfit to biased training data. This leads to inaccurate identification and an inability to reveal the rules governing protein-DNA binding. To address this issue, we propose a meta-learning-based CNN method (TFBS_MLCNN, or MLCNN for short) that suppresses the influence of noisy-label data and accurately recognizes TFBSs from ChIP-seq data. Guided by a small amount of unbiased meta-data, MLCNN adaptively learns an explicit weighting function from ChIP-seq data while simultaneously updating the classifier parameters. The weighting function counters the influence of biased training data on the classifier by assigning each sample a weight according to its training loss. Experimental results on 424 ChIP-seq datasets show that MLCNN not only outperforms existing state-of-the-art CNN methods but also detects noisy samples, which are given small weights to suppress them. This suppression of noisy samples can be seen by visualizing the sample weights. Several case studies demonstrate that MLCNN performs better than the other methods.
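The loss-based weighting idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: in MLCNN the weighting function is learned from the unbiased meta-data, whereas here it is a fixed, hand-picked sigmoid. The function names, the `slope` and `threshold` parameters, and the choice to give high-loss samples small weights are all assumptions made for illustration.

```python
import math

def bce_loss(p, y):
    """Binary cross-entropy for one predicted probability p and a label y in {0, 1}."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def sample_weight(loss, slope=4.0, threshold=1.0):
    """Hypothetical weighting function: maps a training loss to a weight in (0, 1).

    Assumption: samples with a large loss are more likely to be mislabeled,
    so the weight decreases as the loss grows. MLCNN learns this mapping;
    here it is a fixed sigmoid with hand-picked parameters.
    """
    return 1.0 / (1.0 + math.exp(slope * (loss - threshold)))

def weighted_objective(probs, labels):
    """Return the weight-normalized training loss and the per-sample weights."""
    losses = [bce_loss(p, y) for p, y in zip(probs, labels)]
    weights = [sample_weight(l) for l in losses]
    total = sum(w * l for w, l in zip(weights, losses))
    return total / sum(weights), weights

# Three sequences labeled as bound (1); the classifier strongly disagrees
# with the third label, which makes it a candidate noisy sample.
probs = [0.9, 0.8, 0.05]
labels = [1, 1, 1]
objective, weights = weighted_objective(probs, labels)
mean_loss = sum(bce_loss(p, y) for p, y in zip(probs, labels)) / len(probs)
```

In this toy example the third sample receives a weight close to zero, so it contributes little to the objective, which is the suppression effect the abstract describes for noisy samples.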
Funding: National Natural Science Foundation of China [61873202, 62173271]; National Natural Science Foundation of China [61621003]; Strategic Priority Research Program of the Chinese Academy of Sciences (CAS) [XDPB17]; National Ten Thousand Talent Program for Young Top-notch Talents [QYZDB-SSW-SYS008]; CAS Frontier Science Research Key Project for Top Young Scientist
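WOS research area: Biochemistry & Molecular Biology
Language: English
Publisher: ACADEMIC PRESS INC ELSEVIER SCIENCE
WOS accession number: WOS:000809934400008
Content type: Journal article
Source URL: http://ir.amss.ac.cn/handle/2S8OKBNM/61553
Collection: Institute of Applied Mathematics
Corresponding authors: Zhang, Shao-Wu; Zhang, Shihua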
Author affiliations:
1.Northwestern Polytech Univ, Sch Automat, MOE Key Lab Informat Fus Technol, Xi'an 710072, Peoples R China
2.Univ Chinese Acad Sci, Sch Math Sci, Beijing 100049, Peoples R China
3.Chinese Acad Sci, Acad Math & Syst Sci, NCMIS, CEMS,RCSDS, Beijing 100190, Peoples R China
4.Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China
5.Northwestern Polytech Univ, Sch Automat, Xi'an, Peoples R China
6.Chinese Acad Sci, Ctr Excellence Anim Evolut & Genet, Kunming 650223, Peoples R China
Recommended citation
GB/T 7714: Jing, Fang, Zhang, Shao-Wu, Zhang, Shihua. Prediction of the transcription factor binding sites with meta-learning[J]. METHODS, 2022, 203: 207-213.
APA: Jing, Fang, Zhang, Shao-Wu, & Zhang, Shihua. (2022). Prediction of the transcription factor binding sites with meta-learning. METHODS, 203, 207-213.
MLA: Jing, Fang, et al. "Prediction of the transcription factor binding sites with meta-learning". METHODS 203 (2022): 207-213.