Please use this identifier to cite or link to this item:
http://hdl.handle.net/10119/18194
Title: | Hierarchical Prosody Analysis Improves Categorical and Dimensional Emotion Recognition |
Authors: | Li, Xingfeng; Guo, Taiyang; Hu, Xinhui; Xu, Xinkang; Dang, Jianwu; Akagi, Masato |
Issue Date: | 2021-12 |
Publisher: | APSIPA |
Publication name: | Proceedings, APSIPA Annual Summit and Conference 2021 |
Start page: | 700 |
End page: | 704 |
Abstract: | Extracting reliable speech features is one of the most fundamental difficulties in emotion recognition systems. The extraction of spectral features has drawn much research attention, but prosody features, which are used to study emotional cues, have often been extracted by calculating statistics at the utterance level. However, the detailed prosody of different linguistic units can contain a large amount of emotion-related information. In this paper, we propose a novel hierarchical prosody analysis strategy based on wavelet decomposition that models multi-level emotion transition phenomena. Our approach was evaluated on the IEMOCAP corpus and outperformed state-of-the-art alternatives on both categorical and dimensional emotion recognition tasks, advancing the capture of dynamics in emotional expression. |
Rights: | Copyright (C) 2021 APSIPA. This material is posted here with permission of APSIPA (Asia-Pacific Signal and Information Processing Association). Xingfeng Li, Taiyang Guo, Xinhui Hu, Xinkang Xu, Jianwu Dang, Masato Akagi, Proceedings of APSIPA Annual Summit and Conference 2021, pp. 700-704 |
URI: | http://hdl.handle.net/10119/18194 |
Material Type: | publisher |
Appears in Collections: | b11-1. Conference Papers and Presentation Materials (Conference Papers) |
Files in This Item:
File | Description | Size | Format |
APSIPA0000700.pdf | | 597Kb | Adobe PDF |