JAIST Repository: Limited error based event localizing temporal decomposition and its application to variable-rate speech coding

Please use this identifier to cite this item: http://hdl.handle.net/10119/4903

Title:	Limited error based event localizing temporal decomposition and its application to variable-rate speech coding
Authors:	Nguyen, Phu Chien; Akagi, Masato; Nguyen, Binh Phu
Keywords:	Temporal decomposition; Event vector; Event function; STRAIGHT; Speech coding; Line spectral frequency
Issue Date:	2007-04
Publisher:	Elsevier
Journal:	Speech Communication
Volume:	49
Issue:	4
Start page:	292
End page:	304
DOI:	10.1016/j.specom.2007.02.007
Abstract:	This paper proposes a novel algorithm for temporal decomposition (TD) of speech, called 'Limited Error Based Event Localizing Temporal Decomposition' (LEBEL-TD), and its application to variable-rate speech coding. In previous work, TD analysis was usually performed on speech segments of about 200-300 ms or more, making it impractical for online applications. In the present work, events are localized using a limited error criterion and a local optimization strategy, which results in an average algorithmic delay of 65 ms. Simulation results show that an average log spectral distortion of about 1.5 dB is achievable at an event rate of 20 events/sec. Also, LEBEL-TD uses neither the computationally costly singular value decomposition routine nor the event refinement process, which significantly reduces the computational cost of TD. Further, a method for variable-rate speech coding at an average rate of around 1.8 kbps is realized using LEBEL-TD and STRAIGHT (Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum), a high-quality speech analysis-synthesis framework. Subjective test results indicate that the performance of the proposed speech coding method is comparable to that of the 4.8 kbps FS-1016 CELP coder.
Rights:	NOTICE: This is the author’s version of a work accepted for publication by Elsevier. Changes resulting from the publishing process, including peer review, editing, corrections, structural formatting and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Phu Chien Nguyen, Masato Akagi and Binh Phu Nguyen, Speech Communication, 49(4), 2007, 292-304, http://dx.doi.org/10.1016/j.specom.2007.02.007
URI:	http://hdl.handle.net/10119/4903
Resource type:	author
Appears in Collections:	b10-1. Journal Articles

Files in this item:

File	Description	Size	Format
Nguyen_SPECOM-D-05-00140_Ver2.pdf		547Kb	Adobe PDF

All items stored in this system are protected by copyright.

Contact: Library Information Section, Research Promotion Division, Japan Advanced Institute of Science and Technology (ir-sys[at]ml.jaist.ac.jp)