Towards a Global Schema for Web Entities
Yao, Conglei; Yu, Yongjian; Shou, Sicong; Li, Xiaoming
2008
Abstract: Popular entities often have thousands of instances on the Web. In this paper, we focus on the case where they are presented in a table-like format, namely appearing with their attribute names. It is observed that, on one hand, for the same entity, different web pages often incorporate different attributes; on the other, for the same attribute, different web pages often use different attribute names (labels). Therefore, it is difficult to produce a global attribute schema for all the web entities of a given entity type based on their web instances, although such a schema is usually highly desired in web entity instance integration and web object extraction. To this end, we propose a novel framework for automatically learning a global attribute schema for all web entities of one specific entity type. Under this framework, an iterative instance extraction procedure is first employed to extract sufficient web entity instances to discover enough attribute labels. Next, based on the labels, entity instances, and related web pages, a maximum entropy-based schema discovery approach is adopted to learn the global attribute schema for the target entity type. Experimental results on the Chinese Web achieve weighted average F-scores of 0.7122 and 0.7123 on two global attribute schemas for person-type and movie-type web entities, respectively. These results show that our framework is general, efficient and effective.
Language: English
DOI: 10.1145/1367497.1367632
Content type: Other
Source URL: http://ir.pku.edu.cn/handle/20.500.11897/294943
Collection: 信息科学技术学院 (School of Electronics Engineering and Computer Science)
Recommended citation
GB/T 7714
Yao, Conglei, Yu, Yongjian, Shou, Sicong, et al. Towards a Global Schema for Web Entities. 2008-01-01.