Title: Study on hearing impression of speaker identification focusing on dynamic features
Authors: Izumida, Tsuyoshi; Akagi, Masato
Issue date: 2012-03-05
Publisher: 2012 International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP'12)
Journal title: 2012 International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP'12)
Start page: 401
End page: 404
Abstract: In this study, the relationship between speaker identification and the amount of dynamic features was investigated, focusing on hearing impression. A three-layered model was adopted to model the hearing impression. First, the relationships between speaker identification (first layer) and hearing impression (second layer), and those between hearing impression and acoustic features (third layer), were constructed with a top-down strategy. The results show that “brisk” is a major factor in the hearing impression of speaker identification, and that the slope of the fundamental frequency (F_0) and the dynamic range of the spectral slope are correlated with the degree of “brisk.” Because the slope of F_0 and the dynamic range of the spectral slope both measure the amount of dynamic features, “brisk” is a hearing impression of speaker identification that is correlated with dynamic features. Next, the influence of varied acoustic features in the third layer on speaker identification in the first layer was investigated with a bottom-up strategy. The results show that varying the acoustic features associated with “brisk” affected speaker identification. Thus, the amount of dynamic features was found to affect speaker identification.
Rights: This material is posted here with permission of the Research Institute of Signal Processing Japan. Tsuyoshi Izumida and Masato Akagi, 2012 International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP'12), 2012, pp.401-404.
URI: http://hdl.handle.net/10119/10820
