Research Report - School of Information Science (ISSN 0918-7553)

Please use the following identifier to cite this item: http://hdl.handle.net/10119/8416

Title: LP-based method of blind restoration to improve intelligibility of bone-conducted speech
Authors: Thang, Tat Vu
Unoki, Masashi
Akagi, Masato
Issue date: 2007-10-05
Publisher: School of Information Science, Japan Advanced Institute of Science and Technology
Journal title: Research report (School of Information Science, Japan Advanced Institute of Science and Technology)
Volume: IS-RR-2007-011
Start page: 1
End page: 10
Abstract: Bone-conducted (BC) speech can be used instead of air-conducted (AC) speech in extremely noisy environments. However, its intelligibility is degraded by transmission through bone conduction. The voice quality and intelligibility of BC speech therefore need to be improved blindly in actual speech communication, which is a challenging new topic in the field of speech signal processing. In a previous study, we proposed a linear prediction (LP) based model to restore BC speech and improve its voice quality. While other methods, such as the long-term Fourier transform, need numerous AC speech parameters to restore BC speech, our model demonstrated the ability to restore BC speech blindly by predicting AC-LP coefficients from BC-LP coefficients. We improved the previous model by (1) extending long-term processing to frame-basis processing, (2) using line spectral frequency (LSF) coefficients as the LP representation, and (3) using a recurrent neural network to predict parameters. We evaluated the improved model in comparison with other models to find out whether it could adequately improve the voice quality and intelligibility of BC speech, using objective measures (LSD, MCD, and LCD) and a subjective measure, a Japanese-word intelligibility test (JWIT). The experimental results showed significant improvements with our newly proposed models (LSF and LSF-SRN). The LSF model significantly improved both the voice quality and the intelligibility of BC speech. The LSF-SRN model improved the intelligibility of BC speech even when using blind restoration.
URI: http://hdl.handle.net/10119/8416
Resource type: publisher
Appears in collections: IS-RR-2007
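
The abstract above mentions frame-basis processing and the use of LSF coefficients to represent an LP model. As a rough illustration only, and not the authors' implementation, the following Python sketch computes LP coefficients for each frame with the autocorrelation method and converts them to LSFs. The function names, frame length, hop size, window, and LP order are all assumptions chosen for the example; none of them come from the report.

```python
import numpy as np


def lp_coefficients(frame, order):
    """Autocorrelation-method LP analysis via the Levinson-Durbin recursion.

    Returns a = [1, a_1, ..., a_p] for A(z) = 1 + sum_k a_k z^-k.
    (Hypothetical helper, not taken from the report.)
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error
    return a


def lp_to_lsf(a, eps=1e-6):
    """Convert LP coefficients [1, a_1, ..., a_p] to LSFs in radians.

    Forms P(z) = A(z) + z^-(p+1) A(1/z) and Q(z) = A(z) - z^-(p+1) A(1/z),
    then keeps the root angles strictly inside (0, pi), sorted ascending.
    """
    a_ext = np.concatenate([np.asarray(a, dtype=float), [0.0]])
    p_poly = a_ext + a_ext[::-1]
    q_poly = a_ext - a_ext[::-1]
    angles = np.angle(np.concatenate([np.roots(p_poly), np.roots(q_poly)]))
    return np.sort(angles[(angles > eps) & (angles < np.pi - eps)])


def framewise_lsf(signal, frame_len=240, hop=120, order=10):
    """Hamming-windowed, frame-by-frame LSF analysis (assumed settings)."""
    signal = np.asarray(signal, dtype=float)
    window = np.hamming(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array([
        lp_to_lsf(lp_coefficients(signal[s:s + frame_len] * window, order))
        for s in starts
    ])
```

Calling `framewise_lsf(x)` on a one-dimensional signal `x` returns one row of `order` LSFs per frame. A predictor that maps BC-speech LSFs to AC-speech LSFs, such as the recurrent neural network mentioned in the abstract, would operate on rows like these; that mapping is not sketched here.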

Files in this item:

File: IS-RR-2007-011.pdf
Size: 211 Kb
Format: Adobe PDF

All items stored in this system are protected by copyright.

 

