Title: Optimization for Collecting and Scoring Documents for Complex Boolean Query (复杂布尔查询下的文档收集打分策略的优化)
Authors: Huang Da (黄达); Yan Hongfei (闫宏飞)
Journal: Journal of Frontiers of Computer Science and Technology (计算机科学与探索)
Year: 2017
Keywords: complex Boolean query; query optimization; performance regression
DOI: 10.3778/j.issn.1673-9418.1511044
Abstract: Although the Boolean query is one of the earliest concepts in information retrieval, most research on it has focused on homogeneous queries, in which all Boolean operators are the same. Complex Boolean queries have received little attention, even though they are used more and more frequently, e.g. in text-based recommendation. To execute such queries more efficiently, this paper proposes DCQ (DAAT for complex query), a document collecting and scoring strategy built on the DAAT (document-at-a-time) framework. In comparative experiments against the well-known open-source search engine Lucene, DCQ significantly improves query performance. The paper also proposes a regression-based mechanism for predicting query performance, which decides fairly accurately when DCQ should be used. Experiments show that the compound algorithm combining DCQ with the performance predictor far outperforms the document collecting and scoring algorithm currently used in Lucene.
Funding: National Basic Research Program of China (国家重点基础研究发展计划), Grant No. 2014CB340400; National Natural Science Foundation of China, Grant Nos. 61272340, 61272340
Indexed in: Chinese Science Citation Database (CSCD)
Volume: 11
Issue: 1
Pages: 106-113
Language: English
Content type: Journal article
Source URL: http://ir.pku.edu.cn/handle/20.500.11897/477919
Collection: School of Electronics Engineering and Computer Science, Peking University
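Recommended citation
GB/T 7714: Huang Da, Yan Hongfei. Optimization for Collecting and Scoring Documents for Complex Boolean Query[J]. Journal of Frontiers of Computer Science and Technology, 2017.
APA: Huang, D., & Yan, H. (2017). Optimization for Collecting and Scoring Documents for Complex Boolean Query. Journal of Frontiers of Computer Science and Technology.
MLA: Huang, Da, et al. "Optimization for Collecting and Scoring Documents for Complex Boolean Query." Journal of Frontiers of Computer Science and Technology (2017).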