Title: Toward Improving Estimation Accuracy of Emotion Dimensions in Bilingual Scenario Based on Three-layered Model
Authors: Li, Xingfeng
Akagi, Masato
Keywords: Emotion dimensions
Fuzzy inference system (FIS)
Three-layered model
Emotion recognition in speech
Issue date: 2015-10-28
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication title: 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE)
Start page: 21
End page: 26
DOI: 10.1109/ICSDA.2015.7357858
Abstract: This paper proposes a newly revised three-layered model to improve the estimation of emotion dimensions (valence, activation) in a bilingual scenario, using knowledge of the commonalities and differences in human perception across multiple languages. Most previous speech emotion recognition systems worked only within a single language. However, for constructing a generalized emotion recognition system that is able to detect emotions in multiple languages, acoustic feature selection and feature normalization across languages remain open issues. In this study, features correlated with the emotion dimensions are selected to construct the proposed model. To imitate emotion perception across languages, a novel normalization method is introduced that extracts the direction and distance from the neutral state to other emotions in the emotion dimensional space. Results show that the proposed system yields a mean absolute error reduction rate of 46% and 34% for Japanese and German, respectively, over the previous system. The proposed system attains estimation performance more comparable to human evaluation in the bilingual case.
Rights: This is the author's version of the work. Copyright (C) 2015 IEEE. 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015, pp.21-26. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
