
Please use this identifier to cite or link to this item: http://hdl.handle.net/10119/18194

Title: Hierarchical Prosody Analysis Improves Categorical and Dimensional Emotion Recognition
Authors: Li, Xingfeng
Guo, Taiyang
Hu, Xinhui
Xu, Xinkang
Dang, Jianwu
Akagi, Masato
Issue Date: 2021-12
Publisher: APSIPA
Magazine name: Proceedings, APSIPA Annual Summit and Conference 2021
Start page: 700
End page: 704
Abstract: Extracting reliable speech features is one of the most fundamental difficulties in emotion recognition systems. The extraction of spectral features has drawn much research attention, but prosody features, which carry emotional cues, have often been extracted by calculating statistics at the utterance level. However, the detailed prosody of different linguistic units can contain a large amount of emotion-related information. In this paper, we propose a novel hierarchical prosody analysis strategy based on wavelet decomposition that models multi-level emotion transition phenomena. Our approach was evaluated on the IEMOCAP corpus and outperformed state-of-the-art alternatives on both categorical and dimensional emotion recognition tasks, advancing the capture of dynamics in emotional expression.
Rights: Copyright (C) 2021 APSIPA. This material is posted here with permission of APSIPA (Asia-Pacific Signal and Information Processing Association). Xingfeng Li, Taiyang Guo, Xinhui Hu, Xinkang Xu, Jianwu Dang, Masato Akagi, Proceedings of APSIPA Annual Summit and Conference 2021, pp. 700-704
URI: http://hdl.handle.net/10119/18194
Material Type: publisher
Appears in Collections: b11-1. Conference Papers and Presentation Materials (Conference Papers)

Files in This Item:

File: APSIPA0000700.pdf
Size: 597 KB
Format: Adobe PDF
