JAIST Repository: Adaptive β-order generalized spectral subtraction for speech enhancement

Please use the following identifier to cite this item: http://hdl.handle.net/10119/4902

Title:	Adaptive β-order generalized spectral subtraction for speech enhancement
Authors:	Li, Junfeng; Sakamoto, Shuichi; Hongo, Satoshi; Akagi, Masato; Suzuki, Yoiti
Issue date:	2008-11
Publisher:	Elsevier
Journal:	Signal Processing
Volume:	88
Issue:	11
Start page:	2764
End page:	2776
DOI:	10.1016/j.sigpro.2008.06.005
Abstract:	The performance degradation of speech communication systems in noisy environments has inspired increasing research on speech enhancement and noise reduction. As a well-known single-channel noise reduction technique, spectral subtraction (SS) has been widely used for speech enhancement. However, the spectral order β in SS is always fixed to a constant, which limits performance to a certain degree. In this paper, we first analyze the performance of β-order generalized spectral subtraction (GSS) in terms of the gain function to highlight its dependence on the value of the spectral order β. A data-driven optimization scheme is then introduced to quantitatively determine how β changes with the input signal-to-noise ratio (SNR). Based on the analysis results and considering the non-uniform effect of real-world noise on the speech signal, we propose an adaptive β-order GSS in which the spectral order β is adaptively updated, frame by frame, according to the local SNR in each critical band, following a sigmoid function. The performance of the proposed adaptive β-order GSS is finally evaluated objectively by segmental SNR (SEGSNR) and log-spectral distance (LSD), and subjectively by spectrograms and mean opinion score (MOS), using comprehensive experiments in various noise conditions. Experimental results show that the proposed algorithm yields an average SEGSNR increase of 2.99 dB and an average LSD reduction of 2.71 dB, which are much larger improvements than those obtained with the competing SS algorithms. The superiority of the proposed algorithm is also demonstrated by the highest MOS ratings obtained in the listening tests.
Rights:	NOTICE: This is the author's version of a work accepted for publication by Elsevier. Junfeng Li, Shuichi Sakamoto, Satoshi Hongo, Masato Akagi and Yoiti Suzuki, Signal Processing, 88(11), 2008, 2764-2776, http://dx.doi.org/10.1016/j.sigpro.2008.06.005
URI:	http://hdl.handle.net/10119/4902
Resource type:	author
Appears in collections:	b10-1. Journal Articles
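
The abstract describes a gain function that depends on the spectral order β, and a sigmoid rule that sets β from the local SNR in each critical band. The record itself gives no formulas, so the following is only a minimal sketch under stated assumptions: it uses the commonly cited GSS form |Ŝ|^β = |Y|^β − α|N|^β, and the function names, the over-subtraction factor `alpha`, the spectral floor, the sigmoid parameters (`beta_min`, `beta_max`, `slope`, `center_db`), and the direction of the SNR-to-β mapping are all hypothetical placeholders, not values taken from the paper.

```python
import numpy as np


def gss_gain(noisy_mag, noise_mag, beta, alpha=1.0, floor=0.05):
    """Gain of beta-order generalized spectral subtraction (common textbook form).

    Assumes |S_hat|^beta = |Y|^beta - alpha * |N|^beta, which gives
    G = (1 - alpha * (|N| / |Y|)^beta)^(1 / beta), limited below by `floor`.
    beta = 1 is magnitude subtraction, beta = 2 is power subtraction.
    """
    noisy_mag = np.maximum(np.asarray(noisy_mag, dtype=float), 1e-12)
    noise_mag = np.asarray(noise_mag, dtype=float)
    ratio = (noise_mag / noisy_mag) ** beta
    # Clip negative residuals to zero before taking the 1/beta root.
    residual = np.maximum(1.0 - alpha * ratio, 0.0)
    return np.maximum(residual ** (1.0 / beta), floor)


def sigmoid_beta(local_snr_db, beta_min=0.5, beta_max=2.0,
                 slope=0.3, center_db=5.0):
    """Map a local SNR in dB to a spectral order with a sigmoid.

    All parameter values, and the choice that beta grows with SNR,
    are illustrative assumptions rather than the paper's settings.
    """
    snr = np.asarray(local_snr_db, dtype=float)
    s = 1.0 / (1.0 + np.exp(-slope * (snr - center_db)))
    return beta_min + (beta_max - beta_min) * s


if __name__ == "__main__":
    # Three hypothetical critical bands in one frame.
    noisy = np.array([1.0, 2.0, 4.0])      # noisy magnitudes |Y|
    noise = np.array([0.9, 1.0, 0.5])      # noise estimates |N|
    local_snr_db = np.array([-5.0, 5.0, 18.0])
    beta = sigmoid_beta(local_snr_db)      # one spectral order per band
    enhanced = gss_gain(noisy, noise, beta) * noisy
    print(beta, enhanced)
```

Because `beta` may be an array, the same call handles one spectral order per critical band, which matches the band-wise, frame-by-frame update the abstract describes.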

Files in this item:

File	Description	Size	Format
SIGPRO_2008.pdf		3579Kb	Adobe PDF	View/Open

All items stored in this system are protected by copyright.

Contact: Japan Advanced Institute of Science and Technology, Research Promotion Division, Library Information Section (ir-sys[at]ml.jaist.ac.jp)