Please use the following identifier to cite this item:
http://hdl.handle.net/10119/18769
|
Title: | Increasing Speech Intelligibility by Mimicking Professional Announcers’ Voices and Its Physical Correlates |
Authors: | Tran, Dung Kim; Akagi, Masato; Unoki, Masashi |
Keywords: | spectral tilt; spectral plateau; cepstral peak prominence; PCA; STOI; voice conversion |
Issue date: | 2023-10-31 |
Publisher: | Institute of Electrical and Electronics Engineers (IEEE) |
Publication title: | 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) |
Start page: | 1187 |
End page: | 1192 |
DOI: | 10.1109/APSIPAASC58517.2023.10317261 |
Abstract: | Previous studies found that speech uttered by professional announcers is more intelligible in noisy environments than speech uttered by nonexperts. On the basis of this finding, we developed a voice-conversion (VC) system that mimics professional announcers’ voices by modifying the speaker embedding of nonexpert speech. Our evaluation experiments indicated that this system significantly increased intelligibility. In this paper, to discuss which physical features correlate with intelligibility, we investigate two issues by analyzing this system: (1) whether speech intelligibility can be changed gradually by shifting just one principal component analysis (PCA) component of the speaker embedding of the VC system and (2) which physical features change when the PCA component is shifted. We retrained the VC system with a larger amount of training data. By comparing speech intelligibility with the candidate features that changed as one PCA axis was shifted, we found that spectral tilt, spectral plateau, and cepstral peak prominence are strongly correlated with intelligibility. |
Rights: | This is the author's version of the work. Copyright (C) 2023 IEEE. 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Taipei, Taiwan, 2023, pp. 1187-1192, doi: 10.1109/APSIPAASC58517.2023.10317261. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
URI: | http://hdl.handle.net/10119/18769 |
Resource type: | author |
Appears in collections: | b11-1. Conference Papers and Presentation Materials (Conference Papers)
|
Files in this item:
File | Description | Size | Format |
M-AKAGI-I-1109.pdf | | 252Kb | Adobe PDF | View/Open |
|
All items stored in this system are protected by copyright.
|