
Please use this identifier to cite or link to this item: http://hdl.handle.net/10119/16660

Title: Dimensional Emotion Recognition from Speech Using Modulation Spectral Features and Recurrent Neural Network
Authors: Peng, Zhichao
Zhu, Zhi
Unoki, Masashi
Dang, Jianwu
Akagi, Masato
Issue Date: 2019-11-19
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Magazine name: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
Start page: 524
End page: 528
DOI: 10.1109/APSIPAASC47483.2019.9023067
Abstract: Dimensional emotion recognition (DER) from speech is used to track the dynamics of emotions so that robots can interact naturally with humans. A DER system needs to obtain frame-level feature sequences by selecting appropriate acoustic features and durations. Moreover, these sequences should reflect the dynamic characteristics of the utterance. Temporal modulation cues are good at capturing the dynamic characteristics relevant to speech perception and understanding. In this paper, we propose a DER system using modulation spectral features (MSFs) and recurrent neural networks (RNNs). The MSFs are obtained from temporal modulation cues, which are produced by an auditory front-end that applies auditory filtering to the speech signal and then modulation filtering to the temporal envelope, in cascade. The MSFs are then fed into RNNs to capture the dynamic change of emotions from the sequences. Experiments on predicting valence and arousal with the RECOLA database demonstrated that the proposed system significantly outperforms the baseline systems, improving arousal predictions by 17% and valence predictions by 29.5%.
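Illustration: the cascade described in the abstract (auditory filtering of the speech signal, then modulation analysis of each band's temporal envelope) can be sketched roughly as follows. This is a minimal standard-library sketch and not the authors' implementation. The band centers, modulation frequencies, biquad band-pass filters, rectify-and-average envelope, and single-bin DFT are all assumptions standing in for the paper's auditory and modulation filterbanks, which the abstract does not specify. The sketch also computes one feature matrix for a whole segment, whereas the paper uses frame-level sequences, and it omits the RNN stage.

```python
import math

MOD_FREQS = [2, 4, 8, 16, 32]      # modulation frequencies in Hz (assumed)
BAND_CENTERS = [500, 1000, 2000]   # acoustic band centers in Hz (assumed)


def bandpass(x, fs, f0, q=4.0):
    """Second-order band-pass biquad (unit peak gain), a stand-in for an auditory filter."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b0, b2 = alpha, -alpha                       # b1 is zero for this filter type
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = (b0 * xn + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def temporal_envelope(x, fs, env_rate=100):
    """Full-wave rectify, then average over non-overlapping frames (env_rate frames/s)."""
    hop = fs // env_rate
    return [sum(abs(v) for v in x[i:i + hop]) / hop
            for i in range(0, len(x) - hop + 1, hop)]


def modulation_energies(env, env_rate, mod_freqs):
    """Magnitude of the mean-removed envelope's DFT at each modulation frequency."""
    mean = sum(env) / len(env)
    out = []
    for f in mod_freqs:
        re = sum((e - mean) * math.cos(2 * math.pi * f * m / env_rate)
                 for m, e in enumerate(env))
        im = sum((e - mean) * math.sin(2 * math.pi * f * m / env_rate)
                 for m, e in enumerate(env))
        out.append(math.hypot(re, im) / len(env))
    return out


def modulation_spectral_features(x, fs, env_rate=100):
    """Return a matrix indexed as [acoustic band][modulation frequency]."""
    feats = []
    for f0 in BAND_CENTERS:
        env = temporal_envelope(bandpass(x, fs, f0), fs, env_rate)
        feats.append(modulation_energies(env, env_rate, MOD_FREQS))
    return feats


# Synthetic test input: a 1 kHz tone, amplitude-modulated at 8 Hz, 1 second long.
fs = 8000
signal = [(1 + 0.8 * math.cos(2 * math.pi * 8 * n / fs))
          * math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]
msf = modulation_spectral_features(signal, fs)
```

For this synthetic input, the 8 Hz entry should dominate the row for the 1000 Hz band, since that is the only modulation present. In a DER system of the kind the abstract describes, a matrix like this would be computed per frame and the resulting sequence passed to a recurrent model.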
Rights: This is the author's version of the work. Copyright (C) 2019 IEEE. 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2019, pp.524-528. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
URI: http://hdl.handle.net/10119/16660
Material Type: author
Appears in Collections: b11-1. Conference Papers and Presentation Materials (Conference Papers)

Files in This Item:

File                  Description  Size   Format
APSIPA_2019_524.pdf                450Kb  Adobe PDF
