
Please use the following identifier to cite this item: http://hdl.handle.net/10119/18719

Title: Increasing speech intelligibility and naturalness in noise based on concepts of modulation spectrum and modulation transfer function
Authors: Ngo, Thuanvan
Kubo, Rieko
Akagi, Masato
Keywords: Modulation transfer function
modulation spectrum
Issue date: 2021-10-01
Publisher: Elsevier
Journal: Speech Communication
Volume: 135
Start page: 11
End page: 24
DOI: 10.1016/j.specom.2021.09.004
Abstract: This study focuses on identifying effective features for controlling speech to increase speech intelligibility under adverse conditions. Previous approaches either cancel noise throughout speech presentation or preprocess speech by controlling its intensity and/or spectra. Among them, a method based on modulation transfer function theory, inverting the environmental effects to anticipate attenuation of speech modulation spectrum, shows excellent potential due to its systematic and explicit derivation of intelligibility enhancement against environmental smears. However, strictly following the inverse modulation transfer function is dangerous and inefficient as important speech features can be damaged, and it costs lots of energy to boost all smeared regions. This study takes a different approach: analyzing the relations of smeared modulation spectra by the environments for intelligibility to extract effective modifying features. First, we conduct listening tests for intelligibility in noise with different types of enhanced speech. Next, we extract acoustic and modulation frequency components in the smeared modulation spectra by noise showing high correlation with intelligibility scores. Finally, we examine the intelligibility benefits of modifying these components by performing listening tests. The results show that these components effectively increase intelligibility by at most 18%, which demonstrates that our concept is valid.
Rights: Copyright (C)2021, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0). [http://creativecommons.org/licenses/by-nc-nd/4.0/] NOTICE: This is the author's version of a work accepted for publication by Elsevier. Thuanvan Ngo, Rieko Kubo, Masato Akagi, Speech Communication 135, 2021, 11-24, https://doi.org/10.1016/j.specom.2021.09.004
URI: http://hdl.handle.net/10119/18719
Material type: author
Appears in collections: b10-1. Journal Articles


File: M-AKAGI-I-1115.pdf
Size: 3082 KB
Format: Adobe PDF


