
Please use this identifier to cite or link to this item: http://hdl.handle.net/10119/9982

Title: Efficient modeling of temporal structure of speech for applications in voice transformation
Authors: Nguyen, Binh Phu
Akagi, Masato
Keywords: spectral modification
voice transformation
temporal decomposition
Issue Date: 2009-09-09
Publisher: International Speech Communication Association
Publication Title: Proceedings of INTERSPEECH 2009
Start Page: 1631
End Page: 1634
Abstract: Aims of voice transformation are to change styles of given utterances. Most voice transformation methods process speech signals in a time-frequency domain. In the time domain, when processing spectral information, conventional methods do not consider relations between neighboring frames. If unexpected modifications happen, there are discontinuities between frames, which lead to the degradation of the transformed speech quality. This paper proposes a new modeling of temporal structure of speech to ensure the smoothness of the transformed speech for improving the quality of transformed speech in the voice transformation. In our work, we propose an improvement of the temporal decomposition (TD) technique, which decomposes a speech signal into event targets and event functions, to model the temporal structure of speech. The TD is used to control the spectral dynamics and to ensure the smoothness of transformed speech. We investigate the TD in two applications, concatenative speech synthesis and spectral voice conversion. Experimental results confirm the effectiveness of TD in terms of improving the quality of the transformed speech.
Rights: Copyright (C) 2009 International Speech Communication Association. Binh Phu Nguyen, Masato Akagi, Proceedings of INTERSPEECH 2009, pp.1631-1634.
URI: http://hdl.handle.net/10119/9982
Material Type: publisher
Appears in Collections: b11-1. Conference Papers


File: IS2009_Binh.pdf
Size: 263Kb
Format: Adobe PDF


