Please use this identifier to cite or link to this item:
http://hdl.handle.net/10119/18074
|
Title: | Acoustic and articulatory analysis and synthesis of shouted vowels |
Authors: | Xue, Yawen; Marxen, Michael; Akagi, Masato; Birkholz, Peter |
Keywords: | Shouted speech; articulatory analysis; articulatory synthesis; Magnetic Resonance Imaging |
Issue Date: | 2020-10-09 |
Publisher: | Elsevier |
Journal name: | Computer Speech & Language |
Volume: | 66 |
Start page: | 101156 |
DOI: | 10.1016/j.csl.2020.101156 |
Abstract: | Acoustic and articulatory differences between spoken and shouted vowels were analyzed for two male and two female subjects by means of acoustic recordings and midsagittal magnetic resonance images of the vocal tract. In accordance with previous acoustic findings, the fundamental frequencies, intensities, and formant frequencies were all generally higher for shouted than for spoken vowels. The harmonics-to-noise ratios and H1-H2 measures were generally lower for shouted vowels than for spoken vowels. With regard to articulation, all subjects used an increased lip opening, an increased jaw opening, and a lower tongue position for shouted vowels. However, the changes of vertical larynx position, uvula elevation, and jaw protrusion between spoken and shouted vowels were inconsistent among subjects. Based on the analysis results, a perception experiment was conducted to examine how changes of fundamental frequency, subglottal pressure, vocal tract shape, and phonation type contribute to the perception of stimuli created by articulatory synthesis as being shouted. Here, fundamental frequency had the greatest effect, followed by vocal tract shape and lung pressure, with no measurable effect of phonation type. |
Rights: | Copyright (C)2020, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0). [http://creativecommons.org/licenses/by-nc-nd/4.0/] NOTICE: This is the author’s version of a work accepted for publication by Elsevier. Changes resulting from the publishing process, including peer review, editing, corrections, structural formatting and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Yawen Xue, Michael Marxen, Masato Akagi, Peter Birkholz, Computer Speech & Language, 66, 2020, 101156, https://doi.org/10.1016/j.csl.2020.101156 |
URI: | http://hdl.handle.net/10119/18074 |
Material Type: | author |
Appears in Collections: | b10-1. 雑誌掲載論文 (Journal Articles)
|
Files in This Item:
File | Description | Size | Format |
3335.pdf | | 1913Kb | Adobe PDF |
|