Please use this identifier to cite or link to this item: http://hdl.handle.net/10119/15083

Title: Non-parallel training dictionary-based voice conversion with Variational Autoencoder
Authors: Vu, Ho-Tuan
Akagi, Masato
Issue Date: 2018-03-07
Publisher: Research Institute of Signal Processing, Japan
Magazine name: 2018 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP2018)
Start page: 695
End page: 698
Abstract: In this paper, we present a dictionary-based voice conversion (VC) approach that does not require parallel data or linguistic labeling for the training process. Dictionary-based voice conversion is a class of methods that aim to decompose speech into separate factors for manipulation. Non-negative matrix factorization (NMF) is the most common method for decomposing an input spectrum into a weighted linear combination of a set of bases (the dictionary). However, the requirement for parallel training data in this method causes several problems: 1) limited practical usability when parallel data are not available, and 2) additional error from the alignment process, which degrades output speech quality. To alleviate these problems, this paper presents a dictionary-based VC approach that incorporates a Variational Autoencoder (VAE) to decompose the input speech spectrum into a speaker dictionary and weights without parallel training data. According to the evaluation results, the proposed method achieved better speech naturalness while retaining the same speaker similarity as NMF-based VC, even though unaligned data are used.
Rights: Copyright (C) 2018 Research Institute of Signal Processing, Japan. Ho-Tuan Vu and Masato Akagi, 2018 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP2018), 2018, 695-698.
URI: http://hdl.handle.net/10119/15083
Material Type: publisher
Appears in Collections: b11-1. Conference Papers and Presentation Materials (Conference Papers)

Files in This Item:

File: 2754.pdf
Size: 999 Kb
Format: Adobe PDF

All items in DSpace are protected by copyright, with all rights reserved.


Contact : Library Information Section, Japan Advanced Institute of Science and Technology