Please use this identifier to cite or link to this item:
http://hdl.handle.net/10119/18464
Title: | Music Theory-inspired Acoustic Representation for Speech Emotion Recognition |
Authors: | Li, Xingfeng; Shi, Xiaohan; Hu, Desheng; Li, Yongwei; Zhang, Qingchen; Wang, Zhengxia; Unoki, Masashi; Akagi, Masato |
Keywords: | Affective computing; speech emotion recognition; acoustic representation; music theory and speech analysis |
Issue Date: | 2023-06-26 |
Publisher: | Institute of Electrical and Electronics Engineers (IEEE) |
Journal name: | IEEE/ACM Transactions on Audio, Speech, and Language Processing |
Volume: | 31 |
Start page: | 2534 |
End page: | 2547 |
DOI: | 10.1109/TASLP.2023.3289312 |
Abstract: | This research presents a music theory-inspired acoustic representation (hereafter, MTAR) to address improved speech emotion recognition. The recognition of emotion in speech and music is developed in parallel, yet a relatively limited understanding of MTAR for interpreting speech emotions is involved. In the present study, we use music theory to study representative acoustics associated with emotion in speech from vocal emotion expressions and auditory emotion perception domains. In experiments assessing the role and effectiveness of the proposed representation in classifying discrete emotion categories and predicting continuous emotion dimensions, it shows promising performance compared with extensively used features for emotion recognition based on the spectrogram, Melspectrogram, Mel-frequency cepstral coefficients, VGGish, and the large baseline feature sets of the INTERSPEECH challenges. This proposal opens up a novel research avenue in developing a computational acoustic representation of speech emotion via music theory. |
Rights: | This is the author's version of the work. Copyright (C) 2023 IEEE. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 31, 2023, pp. 2534-2547. DOI: 10.1109/TASLP.2023.3289312. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
URI: | http://hdl.handle.net/10119/18464 |
Material Type: | author |
Appears in Collections: | b10-1. 雑誌掲載論文 (Journal Articles) |
Files in This Item:
File | Description | Size | Format |
M-AKAGI-I-0710.pdf | | 6684Kb | Adobe PDF |
All items in DSpace are protected by copyright, with all rights reserved.