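Title: Research on a General Task-Oriented Chinese Spoken Dialogue System (通用的面向任务的汉语口语对话系统研究)
Author: 于水源 (Yu Shuiyuan)
Degree type: Doctoral
Defense date: 2003
Degree-granting institution: Institute of Acoustics, Chinese Academy of Sciences (中国科学院声学研究所)
Place of conferral: Institute of Acoustics, Chinese Academy of Sciences
Alternative title: General Task-Oriented Chinese Spoken Dialogue System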
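Chinese abstract (translated): A task-oriented natural-language human-machine dialogue system is a computer system with which a person exchanges information in natural language to achieve a specific purpose. It is one of the most important and most active areas of current international research on human-machine dialogue technology. At the current international level of research, building a dedicated dialogue system for a concrete application is always technically feasible; the main problems are how to build the models and how to write the three main algorithms of dialogue understanding, dialogue management, and language generation. Such systems cannot be ported smoothly between different applications, for example circuit repair [Smith, 1994] and ticket booking [Xianfang Wang, 2002]. To address smooth porting, some existing studies [Jokinen et al., 2002] [Turunen et al., 2001] structure the dialogue system as an agent-based management system, aiming to separate the system program from the task data as far as possible and so reduce the cost of porting. However, because this approach has no unified data representation structure, it makes system implementation very difficult. The framework goal of our laboratory is to build a general dialogue system in which the system program and the task data are separated, so that the system can carry out a new task-oriented dialogue once a configuration file containing only task data is attached. This dissertation mainly studies a general formal description system for task-oriented spoken dialogue systems that can abstract at a high level, and express accurately, the speech acts of various application domains. On the basis of this description system, it develops a general language generator for task-oriented spoken dialogue systems.
On the representation of speech, the speech act theory proposed by Austin and later revised and refined by Searle holds that the purpose of speaking is not merely to utter words but to do things: people do things with words. The basic unit of human communication is not the sentence or any other means of expression but an act of a certain kind, such as stating, requesting, commanding, asking, apologizing, or congratulating. We consider this classification of speech acts to be a highly abstract generalization of the pragmatic functions of language. The classification is nevertheless insufficient to describe dialogue between a person and a computer accurately; in particular, it lacks a characterization of the semantic content of the utterances of both parties. As a kind of "system protocol" for human-machine natural-language dialogue, a description must contain both the pragmatic and the semantic elements of an utterance. We therefore propose the CSL formal description system for natural-language dialogue. First, following Searle's classification, speech acts are divided into five broad pragmatic classes, and each class is divided into subclasses according to the specific purpose, the pragmatic force, and so on. We call the former the C (Class) expression and the latter the S (Subclass) expression. Then, because various logical relations hold between the semantic contents of utterances, we introduce logic expressions to represent that content. They consist mainly of predicate-logic expressions. To further express the uncertainty, temporality, and other complex meanings found in natural language, we also introduce expressions from question logic, temporal logic, and command logic. We refer to this part collectively as the L (Logic) expression.
Taken as a whole, the pragmatic classification of speech acts and the logical representation of utterance semantics constitute the CSL formal description system for natural-language dialogue. The C and S expressions represent the pragmatic purpose of an utterance, the consequences brought about by the speech act, and the psychological state; the L expression represents the propositional content of the utterance. In a human-machine natural-language dialogue system, the CSL formal description system can serve both as the internal representation of user utterances and as the symbolic expression from which system utterances are generated, so that natural-language dialogue between a person and a computer is abstracted into a mapping between CSL expressions. To our knowledge, the use of speech acts in dialogue systems has already been reported, but this is the first time they have been used in the research and design of a general dialogue system, and in particular the first time that speech-act description and logical description of semantics have been integrated into one complete formal description system for dialogue. We studied the descriptive power of the CSL system experimentally on real data. The 1416 test utterances were taken from the online real-usage corpus of the BEST dialogue system of the Speech Interaction Technology Research Center, Institute of Acoustics, Chinese Academy of Sciences [Xianfang Wang, 2002]. After manual annotation, the CSL expressibility of the utterances was measured: the annotation rate was 96.10% for user utterances and 100% for system utterances, giving an overall expressibility of 98.32%.
On language generation, we use the CSL formal description system to convert a CSL expression into natural and efficient natural language, and we propose a hybrid-template language generation method. The method is based on Frege's principle of compositionality, on the ideas of Zhu Dexi's phrase-based grammar system, and on research results about phrases, word order, and sentence patterns in Modern Chinese. The hybrid-template method consists of two main parts, phrase templates and sentence templates. A phrase template is a fixed, unchanging structure that carries a certain semantic meaning. A sentence template is a neutral template abstracted from typical word order, and it can be transformed as needed. By transforming the sentence templates of a CSL expression, we generate three kinds of interrogative sentences, negative declarative sentences, short-answer sentences, and pronouns within sentences, as well as definite descriptions, whose generation we have not yet seen reported in the literature in China or abroad. In addition, because pronouns and short answers can be generated, the responses of the dialogue system become more colloquial. Because the CSL formal description system is used, the goal of separating the system's program code from the task data is achieved: a new language generation task requires only rewriting the configuration file. Finally, we implemented the hybrid-template language generation system and evaluated its output subjectively. The experiment covered two generation tasks, ticket and train information queries for the BEST system and simulated SARS epidemic bulletins. The sentences generated for the two tasks were mixed into a questionnaire for subjective evaluation. The satisfaction rate was 79.87% for single sentences judged without context and 77.01% for sentences judged in context, for a combined satisfaction rate of 78.6%.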
English abstract: A task-oriented natural-language human-machine dialogue system is a computer system with which a person exchanges information in natural language to achieve a specific purpose. It is one of the most important and active areas of current international research on human-machine dialogue technology. At the current international level of research, building a dedicated dialogue system for a concrete application is technically feasible; the major issues are how to set up the models and how to write the three algorithms of dialogue understanding, dialogue management, and language generation. Such systems cannot be ported smoothly between different applications, such as circuit repair [Smith, 1994] and ticketing [Xianfang Wang, 2002]. Some existing studies [Jokinen et al., 2002] [Turunen et al., 2001] address smooth porting by structuring the dialogue system as an agent-based management system, with the aim of separating the system program from the task data and reducing the cost of porting. However, because this method has no uniform data representation structure, the system is difficult to implement. The framework goal of our lab is to build a general dialogue system in which the system program and the task data are separated. Given a configuration file containing only the task data, such a system can carry out a new task-oriented dialogue. This dissertation mainly studies a general formal description system for task-oriented spoken dialogue systems that can abstract at a high level, and express accurately, the speech acts of various application fields, and it develops a general language generator for task-oriented dialogue systems on the basis of this description system.
On the representation of speech, the speech act theory put forward by Austin and later revised and refined by Searle holds that the purpose of speaking is not only to utter words but also to do things, i.e., people do things with words. The fundamental units of human communication are not sentences or any other means of expression but acts, such as stating, requesting, commanding, questioning, apologizing, and congratulating. We believe that this classification of speech acts is a highly abstract generalization of the pragmatic function of language, but it is not sufficient for accurately describing dialogue between human and machine; in particular, it lacks a characterization of the semantic content of the utterances of both parties. As a kind of "system protocol" for natural-language human-machine dialogue, a description must contain both the pragmatics and the semantics of an utterance. We therefore put forward the C_S_L formal description system for natural-language dialogue. First, in line with Searle's classification, we divide speech acts into five pragmatic classes, and within each class we define subclasses according to the concrete purpose and the illocutionary force. We call the former the C (Class) expression and the latter the S (Subclass) expression. Then, taking into consideration the various logical relations between the semantic contents of utterances, we introduce logic expressions to represent that content. They are composed mainly of predicate-logic expressions. To further express the uncertainty, temporality, and other complex meanings of natural language, we also introduce expressions from question logic, temporal logic, and command logic. We call this part the L (Logic) expression.
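The three-part structure described above can be pictured as a small data type. The following is only a minimal sketch, not the dissertation's actual notation: the abstract does not give the concrete C_S_L syntax, so the class names `SpeechActClass`, `Predicate`, and `CSLExpression`, the subclass label `"question"`, the predicate `departure_time`, and the train number are all hypothetical. The five class names are the usual names of Searle's categories.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class SpeechActClass(Enum):
    """C expression: Searle's five classes of illocutionary acts."""
    ASSERTIVE = "assertive"
    DIRECTIVE = "directive"
    COMMISSIVE = "commissive"
    EXPRESSIVE = "expressive"
    DECLARATION = "declaration"


@dataclass(frozen=True)
class Predicate:
    """L expression, reduced here to a single predicate-logic atom."""
    name: str
    args: Tuple[str, ...]

    def __str__(self) -> str:
        return f"{self.name}({', '.join(self.args)})"


@dataclass(frozen=True)
class CSLExpression:
    """One utterance: pragmatic class (C), subclass (S), logical content (L)."""
    c: SpeechActClass
    s: str  # hypothetical subclass label; the real inventory is not in the abstract
    l: Predicate

    def __str__(self) -> str:
        return f"<{self.c.value}, {self.s}, {self.l}>"


# A user asking when a (made-up) train leaves; "?x" marks the unknown being asked for.
utterance = CSLExpression(
    c=SpeechActClass.DIRECTIVE,  # Searle counts questions among directives
    s="question",
    l=Predicate("departure_time", ("train_T15", "?x")),
)
print(utterance)
```

Keeping the pragmatic part (C, S) apart from the propositional part (L) is what would let the same predicate appear in a question, a statement, or a command without being redefined.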
Taken as a whole, the pragmatic classification of speech acts and the logical representation of utterance semantics form the C_S_L formal description system for natural-language dialogue. In this system, the C and S expressions represent the illocutionary point of an utterance, together with the effect and the psychological state brought about by the speech act, while the L expression represents the propositional content of the utterance. In a human-machine natural-language dialogue system, the C_S_L formal description system can be used as the internal representation of user utterances or as the symbolic expression from which system utterances are generated, so that human-machine natural-language dialogue is abstracted into a mapping between C_S_L expressions. As far as we know, the application of speech acts in dialogue systems has already been reported; however, this is the first time they have been applied to the research and design of a general dialogue system, and in particular the first time that speech-act description and logical description of semantics have been integrated into one complete formal description system for dialogue. We evaluated the descriptive capability of the C_S_L system experimentally on real data. The 1416 test utterances were taken from the online real-usage corpus of the BEST Dialogue System of the Speech Interaction Technology Research Center, Institute of Acoustics, Chinese Academy of Sciences. After manual annotation, the C_S_L expressiveness of the utterances was measured, with the following results: user utterance expressiveness 96.10%, system utterance expressiveness 100%, and total expressiveness 98.32%. On language generation, we use the C_S_L formal description system to map a C_S_L expression into natural and efficient natural language, and we put forward a hybrid-template language generation method.
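The Chinese abstract describes the hybrid-template method as combining phrase templates (fixed structures with a semantic meaning) with sentence templates (neutral word-order patterns that can be transformed, for example into questions or negative statements). The sketch below illustrates that general idea only. It is not the dissertation's generator: the system described there produces Chinese, whereas this toy uses English with simple do-support, and the template tables, function names, and the train example are all assumptions made for illustration.

```python
# Hypothetical phrase templates: fixed structures with slots to fill.
PHRASE_TEMPLATES = {
    "train": "train {number}",
    "time": "at {time}",
}


def fill_phrase(kind: str, **slots: str) -> str:
    """Instantiate one fixed phrase template with concrete slot values."""
    return PHRASE_TEMPLATES[kind].format(**slots)


def generate(subject: str, verb: str, complement: str, form: str = "statement") -> str:
    """Transform a neutral subject-verb-complement pattern into a surface form.

    `verb` is the base form of a regular verb; only three transformations
    are sketched here (the dissertation lists more).
    """
    if form == "statement":
        words, end = [subject, verb + "s", complement], "."
    elif form == "negative":
        words, end = [subject, "does not", verb, complement], "."
    elif form == "yes_no_question":
        words, end = ["does", subject, verb, complement], "?"
    else:
        raise ValueError(f"unknown form: {form}")
    sentence = " ".join(words) + end
    return sentence[0].upper() + sentence[1:]


subject = fill_phrase("train", number="T15")   # made-up train number
complement = fill_phrase("time", time="08:30")

for form in ("statement", "negative", "yes_no_question"):
    print(generate(subject, "depart", complement, form))
```

Because the task-specific wording lives entirely in the template table, moving to a new task in this sketch means replacing the table rather than the code, which mirrors the separation of program and task data that the abstract emphasizes.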
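Language: Chinese
Date made public: 2011-05-07
Pages: 102
Content type: Dissertation
Source URL: http://159.226.59.140/handle/311008/1034
Collection: Institute of Acoustics_IOA Doctoral and Master's Theses_1981-2009 Doctoral and Master's Theses
Recommended citation
GB/T 7714
于水源 (Yu Shuiyuan). 通用的面向任务的汉语口语对话系统研究 [Research on a General Task-Oriented Chinese Spoken Dialogue System][D]. Institute of Acoustics, Chinese Academy of Sciences. Institute of Acoustics, Chinese Academy of Sciences. 2003.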