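Title: Research on Blind Speech Separation Algorithms Based on Microphone Arrays (基于麦克风阵列的盲语音分离算法研究)
Author: Shi Jian (施剑)
Degree type: Doctoral
Defense date: 2005
Degree-granting institution: Institute of Acoustics, Chinese Academy of Sciences
Place of conferral: Institute of Acoustics, Chinese Academy of Sciences
Keywords: blind speech separation; information maximization (Infomax); independent component analysis; delay-and-sum beamforming; microphone array
Alternative title: Research on Algorithm of Blind Speech Separation Based on Microphone Array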
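Abstract (translated from Chinese): Blind source separation has become an active topic in signal processing in recent years. Its main goal is to extract or represent the independent source signals using only the received mixtures, with neither the sources nor the transmission characteristics known. Recent studies show that blind speech separation in real environments is very difficult: ambient noise and complex room impulse responses severely degrade blind separation algorithms that work well in simulation. Against this background, this thesis studies blind speech separation algorithms based on microphone arrays. The main contributions are as follows. 1. For the convolutive-mixture model, a fast frequency-domain blind speech separation method is proposed. It chains preprocessing (such as decorrelation), independent component analysis (ICA) filtering based on information maximization, and postprocessing (resolving the permutation and scaling ambiguities) to separate convolutively mixed speech effectively. In particular, an improved data whitening algorithm removes the second-order correlation between components and speeds up the convergence of the ICA algorithm; for real speech mixtures recorded under low reverberation (reverberation time below 100 ms), the separation results are very good. 2. To address the performance degradation of frequency-domain blind speech separation in a real conference hall (reverberation time around 800 ms), an improved real-time blind speech separation method is proposed. Its preprocessing combines delay-and-sum multi-beamformers with power spectral subtraction, which suppresses reverberation and noise to some extent, and it adopts an improved batch algorithm, so it is well suited to real-time implementation in a practical microphone-array blind speech separation system. 3. A real-time microphone array acquisition system based on the USB 2.0 interface was designed and implemented. An FPGA acquires and packs the multichannel speech data in real time, a DMA channel hands the data to the USB 2.0 controller chip, and a USB 2.0 isochronous endpoint transfers the data to the host PC. The system acquires synchronized, high-SNR microphone array speech data in real time and was used for the performance evaluation experiments of the real-time blind speech separation algorithm. The USB 2.0 bandwidth of the system reaches the theoretical maximum of 192 Mbit/s (in isochronous transfer mode), which leaves good room for system expansion.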
Abstract (English): Blind Source Separation (BSS) is a recently developed signal processing technique whose goal is to estimate a set of statistically independent source components from their mixtures. BSS has many promising applications in science and technology, especially in audio processing, wireless communication, medical diagnosis, image enhancement and radar signal processing. Recent studies show that blind speech separation is very difficult in real-world conditions: background noise and reverberation degrade the performance of BSS algorithms that perform well in simulated environments. Against this background, this thesis studies blind speech separation algorithms based on microphone arrays. The main contributions of this thesis are: 1. A fast frequency-domain blind speech separation algorithm is presented for convolutive speech mixtures. It is a sequence of preprocessing (including whitening and decorrelation), a frequency-domain ICA filter based on Infomax, and postprocessing (resolving the permutation and scaling ambiguities), and it separates convolutive speech mixtures effectively. The improved whitening algorithm reduces second-order correlation and speeds up the convergence of frequency-domain ICA. The algorithm achieves good separation performance in a low-reverberation room (reverberation time below 100 ms). 2. An improved real-time blind speech separation method is presented to address the performance degradation of frequency-domain BSS in auditoria (reverberation time about 800 ms). It uses a delay-and-sum beamformer and spectral subtraction as preprocessing, and an improved batch algorithm to satisfy the real-time requirement. Owing to its low computational complexity and its robustness to reverberation and noise, this approach is well suited to implementation in a microphone-array blind speech separation system. 3. A USB 2.0-based microphone array speech data acquisition system is implemented. In this system, an FPGA collects and packs the multichannel speech data in real time and sends it to the USB 2.0 chipset through DMA (in the USB chipset). The USB 2.0 chipset then sends the data to the PC in isochronous mode. The system acquires synchronized, real-time, high-SNR microphone array speech data and meets the needs of blind speech separation research. The USB 2.0 bandwidth of the system is 192 Mbit/s, which offers good system expandability.
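As a rough illustration of the whitening step mentioned in contribution 1, the following is a minimal sketch of standard eigenvalue-based whitening, not the improved whitening algorithm of the thesis, whose details the abstract does not give. The function name `whiten`, the toy mixing matrix, and the Laplacian test sources are all assumptions made for this example. In a frequency-domain method the same operation would presumably be applied per frequency bin to complex STFT data, which is why the code uses conjugate transposes.

```python
import numpy as np

def whiten(x):
    """Whiten x (channels x samples) so its sample covariance is the identity.

    Returns the whitened data and the whitening matrix. This is plain
    symmetric (eigenvalue-based) whitening, used here only as a stand-in.
    """
    # Remove the mean of each channel.
    x = x - x.mean(axis=1, keepdims=True)
    # Sample covariance matrix (Hermitian, so eigh applies).
    cov = x @ x.conj().T / x.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Whitening matrix V = E diag(1/sqrt(lambda)) E^H.
    v = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.conj().T
    return v @ x, v

# Toy data (assumed): two super-Gaussian sources, instantaneous 2x2 mixture.
rng = np.random.default_rng(0)
sources = rng.laplace(size=(2, 5000))
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])
mixed = mixing @ sources

z, v = whiten(mixed)
# After whitening, the channels are uncorrelated with unit variance.
cov_z = z @ z.conj().T / z.shape[1]
```

Whitening removes only second-order correlation; an ICA stage such as Infomax is still needed afterwards to resolve the remaining rotation and make the outputs statistically independent.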
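Language: Chinese
Date available: 2011-05-07
Pages: 102
Content type: Thesis
Source URL: [http://159.226.59.140/handle/311008/1056]
Collection: Institute of Acoustics_IOA Doctoral and Master's Theses_Doctoral and Master's Theses 1981-2009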
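Recommended citation
GB/T 7714
Shi Jian. Research on Blind Speech Separation Algorithms Based on Microphone Arrays [D]. Institute of Acoustics, Chinese Academy of Sciences. Institute of Acoustics, Chinese Academy of Sciences. 2005.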