Please use this identifier to cite or link to this item: http://hdl.handle.net/10119/10825

Title: A study on restoration of bone-conducted speech in noisy environments with LP-based model and Gaussian mixture model
Authors: Phung, Nghia Trung; Unoki, Masashi; Akagi, Masato
Keywords: bone-conducted speech; Gaussian mixture model; linear prediction; speech intelligibility
Issue Date: 2012-09
Publisher: 信号処理学会
Journal name: Journal of Signal Processing
Volume: 16
Number: 5
Start page: 409
End page: 417
Abstract: The restoration of bone-conducted speech is a very important issue that enables robust speech communication in extremely noisy environments. We proposed a method of blind restoration in our previous studies based on a scheme of linear prediction with a method of training and prediction based on the simple recurrent neural network. However, prediction based on neural networks is not suitable for training with large corpora, which is necessary for real applications. The over-training problem with simple recurrent neural networks makes it difficult to train various kinds of bone-conducted speech in one session. In addition, it is difficult to adapt the neural network model to bone-conducted speech in unknown noisy environments to build an open dataset restoration of bone-conducted speech. Thus, a method of training and prediction based on the Gaussian mixture model was used in this research, instead of a neural network. A method of re-estimating the residual ratio in the scheme of linear prediction is also proposed. We also investigated how the proposed method works to restore bone-conducted speech in extremely noisy environments. Objective and subjective evaluations were carried out to evaluate the improvements in sound quality and the intelligibility of restored speech. The results revealed that our proposed method outperformed previous methods in both human hearing and automatic speech recognition systems even in extremely noisy environments.
Rights: Copyright (C) 2012 信号処理学会. Phung Nghia Trung, Masashi Unoki and Masato Akagi, Journal of Signal Processing, 16(5), 2012, 409-417.
URI: http://hdl.handle.net/10119/10825
Material Type: publisher
Appears in Collections: b10-1. 雑誌掲載論文 (Journal Articles)

Files in This Item:

File: 1237.pdf
Size: 2231Kb
Format: Adobe PDF

All items in DSpace are protected by copyright, with all rights reserved.

 

