Please use the following identifier to cite this item:
http://hdl.handle.net/10119/11563
Title: | Cross-lingual Speech Emotion Recognition System Based on a Three-Layer Model for Human Perception |
Authors: | Elbarougy, Reda; Akagi, Masato |
Issue Date: | 2013-10 |
Publisher: | Institute of Electrical and Electronics Engineers (IEEE) |
Journal Title: | 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) |
Start Page: | 1 |
End Page: | 10 |
DOI: | 10.1109/APSIPA.2013.6694137 |
Abstract: | The purpose of this study is to investigate whether the emotion dimensions valence, activation, and dominance can be estimated cross-lingually. Most previous studies on automatic speech emotion recognition detected the emotional state within a single language. However, to develop a generalized emotion recognition system, performance must be analyzed in both mono-lingual and cross-lingual settings. The ultimate goal of this study is to build a bilingual emotion recognition system that can estimate emotion dimensions for one language using a system trained on another language. In this study, we first propose a novel acoustic feature selection method based on a human perception model. The proposed model consists of three layers: emotion dimensions in the top layer, semantic primitives in the middle layer, and acoustic features in the bottom layer. The experimental results reveal that the proposed method is effective for selecting acoustic features representing emotion dimensions, working with two different databases, one in Japanese and the other in German. Finally, the acoustic features common to the two databases are used as the input to the cross-lingual emotion recognition system. The proposed cross-lingual system based on the three-layer model performs just as well as the two separate mono-lingual systems for estimating emotion dimension values. |
Rights: | This is the author's version of the work. Copyright (C) 2013 IEEE. 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013, 1-10. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
URI: | http://hdl.handle.net/10119/11563 |
Resource Type: | author |
Appears in Collections: | b11-1. Conference Papers |
Files in This Item:
File | Description | Size | Format |
Cross-lingual_Elbarougy_final.pdf | | 403Kb | Adobe PDF |
All items stored in this system are protected by copyright.