JAIST Repository: Segment-level Effects of Gender, Nationality and Emotion Information on Text-independent Speaker Verification

Please use this identifier to cite or link to this item: http://hdl.handle.net/10119/16962

Title:	Segment-level Effects of Gender, Nationality and Emotion Information on Text-independent Speaker Verification
Authors:	Li, Kai; Akagi, Masato; Wu, Yibo; Dang, Jianwu
Keywords:	Multitask learning; Domain adversarial training; Speaker embedding; Text-independent speaker verification
Issue date:	2020-10
Publisher:	International Speech Communication Association
Journal title:	Proc. InterSpeech2020
Start page:	2987
End page:	2991
DOI:	10.21437/Interspeech.2020-1700
Abstract:	Speaker embeddings extracted from neural networks (NNs) achieve excellent performance on general speaker verification (SV) tasks. Most current SV systems use only speaker labels. As a result, the interaction between different types of domain information decreases the prediction accuracy of SV. To overcome this weakness and improve SV performance, four SV systems are proposed that use gender, nationality, and emotion information to add more constraints in the NN training stage. More specifically, multitask learning (MTL)-based systems, comprising multitask gender (MTG), multitask nationality (MTN), and multitask gender and nationality (MTGN), are used to enhance the learning of gender and nationality information. A domain adversarial training (DAT)-based system, emotion domain adversarial training (EDAT), is used to suppress the learning of emotion information. Experimental results indicate that encouraging the learning of gender and nationality information and suppressing the learning of emotion information improve SV performance. The proposed systems achieved relative improvements in equal error rate of 16.4% and 22.9% for the MTL- and DAT-based systems, respectively.
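Note on the metric: the abstract reports its results as relative improvements in equal error rate (EER). The following is a minimal sketch of how such a figure is commonly computed, not code from the paper. The function names, the toy scores, and the simple threshold sweep are all assumptions made for illustration; the paper's own scoring back end and baseline EER values are not given in this record.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    Returns the mean of the false acceptance rate (FAR) and false rejection
    rate (FRR) at the threshold where the two are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection: a genuine (target) trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: an impostor trial scored at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_improvement(baseline_eer, system_eer):
    """Relative EER reduction of a system with respect to a baseline."""
    return (baseline_eer - system_eer) / baseline_eer


# Toy, hypothetical similarity scores (not data from the paper).
toy_targets = [0.9, 0.8, 0.7, 0.4]
toy_nontargets = [0.6, 0.3, 0.2, 0.1]
toy_eer = equal_error_rate(toy_targets, toy_nontargets)
```

Under this definition, a reported 16.4% relative improvement would mean the system's EER is 16.4% lower than the baseline EER, as a fraction of the baseline, rather than 16.4 percentage points lower.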
Rights:	Copyright (C) 2020 International Speech Communication Association. Kai Li, Masato Akagi, Yibo Wu, and Jianwu Dang, Proc. InterSpeech2020, 2020, pp.2987-2991. http://dx.doi.org/10.21437/Interspeech.2020-1700
URI:	http://hdl.handle.net/10119/16962
Resource type:	publisher
Appears in collections:	b11-1. Conference Papers and Presentation Materials (Conference Papers)

Files in this item:

File	Description	Size	Format
3357.pdf		439Kb	Adobe PDF

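All items stored in this system are protected by copyright.

Contact: Japan Advanced Institute of Science and Technology, Research Promotion Section, Library Information Unit (ir-sys[at]ml.jaist.ac.jp)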