DVD: A Model for Event Diversified Versions Discovery
Kong, Liang ; Yan, Rui ; He, Yijun ; Zhang, Yan ; Zhang, Zhenwei ; Fu, Li
2011
Keywords: Diversified Versions Discovery; popular words selection
Abstract: With the development of Event Detection and Tracking techniques, it is feasible to gather text information from many sources and structure it into events that are constructed online automatically and updated over time. An event is usually described by diversified versions, and users are often eager to know all of them. Given the huge quantity of documents, it is almost impossible for users to read them all. In this paper, we formally define the problem of event diversified versions discovery. We introduce a novel and principled model (called DVD) for discovering diversified versions of events. Unlike traditional clustering methods, we apply an iterative algorithm on a bipartite graph integrating co-occurrence and semantics to select the popular words and filter them out, reducing the tight correlation between documents in a specific event. Hybrid link structures between words are used to find hierarchical relationships. We employ a web communities discovery algorithm to construct virtual documents, each consisting of a bag of words indicating one of the diversified versions. Under the Rocchio classification framework, we classify the documents into diversified versions. With our novel evaluation method, empirical experiments on two real datasets show that DVD is effective and outperforms various related algorithms, including classic K-means and LDA.
Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods; EI; CPCI-S(ISTP); 0
Language: English
DOI: 10.1007/978-3-642-20291-9_18
Content type: Other
Source URL: http://ir.pku.edu.cn/handle/20.500.11897/406122
Collection: School of Electronics Engineering and Computer Science (信息科学技术学院)
Recommended citation
GB/T 7714
Kong, Liang,Yan, Rui,He, Yijun,et al. DVD: A Model for Event Diversified Versions Discovery. 2011-01-01.