JAIST Repository: Effect of articulatory and acoustic features on the intelligibility of speech in noise: an articulatory synthesis study

Please use this identifier to cite or link to this item: https://hdl.handle.net/10119/18020

Title:	Effect of articulatory and acoustic features on the intelligibility of speech in noise: an articulatory synthesis study
Authors:	Ngo, Thuanvan; Akagi, Masato; Birkholz, Peter
Keywords:	Lombard speech; Speech intelligibility; Articulatory study
Issue date:	2020-01-22
Publisher:	Elsevier
Journal:	Speech Communication
Volume:	117
Start page:	13
End page:	20
DOI:	10.1016/j.specom.2020.01.004
Abstract:	In noisy conditions, speakers involuntarily change their manner of speaking to enhance the intelligibility of their voices. The increased intelligibility of this so-called Lombard speech is enabled by the change of multiple articulatory and acoustic features. While the major features of Lombard speech are well known from previous studies, little is known about their relative contributions to the intelligibility of speech in noise. This study used an analysis-by-synthesis strategy to explore the contributions of multiple of these features. To this end, an articulatory speech synthesizer was used to synthesize the ten German digit words “Null” to “Neun”, for all 16 combinations of four binary features, i.e., modal vs. pressed phonation, normal vs. increased F_1 and F_2 formant frequencies, normal vs. increased f_0 mean and range, and normal vs. increased duration of vowels. Subjects were asked to try to recognize the synthesized words in the presence of strong pink noise and babble noise. Compared to “plain” speech, the word recognition rate was most improved by pressed phonation, followed by an increased f_0 mean and f_0 range, and increased formant frequencies. Increased duration of vowels slightly reduced the recognition rate for pink noise but had no effect for babble noise.
Rights:	Copyright (C)2020, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0). [http://creativecommons.org/licenses/by-nc-nd/4.0/] NOTICE: This is the author’s version of a work accepted for publication by Elsevier. Changes resulting from the publishing process, including peer review, editing, corrections, structural formatting and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Thuanvan Ngo, Masato Akagi, and Peter Birkholz, Speech Communication, 117, 2020, 13-20, http://dx.doi.org/10.1016/j.specom.2020.01.004
URI:	https://hdl.handle.net/10119/18020
Item type:	author
Appears in collections:	b10-1. Journal Articles
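
The abstract describes a full factorial design: four binary features crossed into 16 combinations, each applied to ten digit words. The following is a minimal sketch of how such a condition grid could be enumerated; it is not the authors' code. The feature keys, the `feature_combinations` helper, and the spelled-out digit list between "Null" and "Neun" are illustrative assumptions, and the resulting word-by-condition pairs do not account for noise types or repetitions, which the abstract does not specify.

```python
from itertools import product

# Four binary features and their two levels, as named in the abstract.
# The dictionary keys are illustrative labels, not identifiers from the paper.
FEATURES = {
    "phonation": ("modal", "pressed"),
    "formants_F1_F2": ("normal", "increased"),
    "f0_mean_and_range": ("normal", "increased"),
    "vowel_duration": ("normal", "increased"),
}

# The ten German digit words "Null" to "Neun" (zero to nine).
DIGITS = ["Null", "Eins", "Zwei", "Drei", "Vier",
          "Fünf", "Sechs", "Sieben", "Acht", "Neun"]


def feature_combinations():
    """Return every combination of feature levels as a list of dicts."""
    names = list(FEATURES)
    return [dict(zip(names, levels)) for levels in product(*FEATURES.values())]


conditions = feature_combinations()

# Pair every word with every feature combination.
stimuli = [(word, cond) for word in DIGITS for cond in conditions]

# The first combination has every feature at its first level, which
# corresponds to the "plain" speech baseline mentioned in the abstract.
plain = conditions[0]

print(len(conditions), len(stimuli))
```

Enumerating the grid this way makes it straightforward to compare each single-feature change against the plain baseline, which is the kind of comparison the abstract reports.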

Files in this item:

File	Description	Size	Format
3066.pdf		332Kb	Adobe PDF	View/Open

All items stored in this system are protected by copyright.
