Mining maximal correlated member clusters in high dimensional database
Jiang, LZ ; Yang, DQ ; Tang, SW ; Ma, XL ; Zhang, DH
2006
Abstract: Mining high-dimensional data is an urgent problem of great practical importance. Although data mining models such as frequent patterns and clusters have proven very successful for analyzing very large data sets, they have limitations. Frequent patterns are inadequate for describing the quantitative correlations among nominal members. Traditional cluster models ignore the distances between some pairs of members, so two members of one large cluster may be far apart. As a combination of and complement to both techniques, we propose the Maximal-Correlated-Member-Cluster (MCMC) model in this paper. The MCMC model is based on a statistical measure that reflects the relationship between nominal variables, and every pair of members in a cluster satisfies unified constraints. Moreover, to improve the algorithm's efficiency, we introduce pruning techniques that reduce the search space. In the first phase, a Tri-correlation inequation is used to eliminate unrelated member pairs; in the second phase, an Inverse-Order-Enumeration-Tree (IOET) method is designed to share common computations. Experiments on both synthetic and real-life datasets examine the algorithm's performance. The results show that our algorithm is much more efficient than the naive algorithm, and that the model can discover meaningful correlated patterns in high-dimensional databases.
Subject categories: Computer Science, Artificial Intelligence; Computer Science, Information Systems
Indexed in: SCI(E); CPCI-S(ISTP)
Language: English
Content type: Other
Source URL: http://ir.pku.edu.cn/handle/20.500.11897/292204
Collection: School of Electronics Engineering and Computer Science (信息科学技术学院)
Recommended citation
GB/T 7714
Jiang, LZ, Yang, DQ, Tang, SW, et al. Mining maximal correlated member clusters in high dimensional database. 2006-01-01.