Voice Conversion
EAD-VC: Enhancing Speech Auto-Disentanglement for Voice Conversion with IFUB Estimator and Joint Text-Guided Consistent Learning
Using unsupervised learning to disentangle speech into content, rhythm, pitch, and timbre for voice conversion has become a hot …
Ziqi Liang, Jianzong Wang, Xulong Zhang, Yong Zhang, Ning Cheng, Jing Xiao
arXiv
DEMO
IEEE
Learning Expressive Disentangled Speech Representations with Soft Speech Units and Adversarial Style Augmentation
Voice conversion is the task of transforming the voice characteristics of source speech while preserving content information. Nowadays, …
Yimin Deng, Jianzong Wang, Xulong Zhang, Ning Cheng, Jing Xiao
arXiv
IEEE
MAIN-VC: Lightweight Speech Representation Disentanglement for One-Shot Voice Conversion
One-shot voice conversion aims to change the timbre of any source speech to match that of the unseen target speaker with only one …
Pengcheng Li, Jianzong Wang, Xulong Zhang, Yong Zhang, Jing Xiao, Ning Cheng
arXiv
DEMO
IEEE
Learning Disentangled Speech Representations with Contrastive Learning and Time-Invariant Retrieval
Voice conversion refers to transferring speaker identity with well-preserved content. Better disentanglement of speech representations …
Yimin Deng, Huaizhen Tang, Xulong Zhang, Ning Cheng, Jing Xiao, Jianzong Wang
arXiv
IEEE
PMVC: Data Augmentation-Based Prosody Modeling for Expressive Voice Conversion
Voice conversion, as a style transfer task applied to speech, refers to converting one person’s speech into new speech that …
Yimin Deng, Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
PDF
arXiv
DEMO
ACM
CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation
Better disentanglement of speech representations is essential to improving the quality of voice conversion. Recently, contrastive learning …
Yimin Deng, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
Voice Conversion with Denoising Diffusion Probabilistic GAN Models
Voice conversion is a method that allows for the transformation of speaking style while maintaining the integrity of linguistic …
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
Springer
Learning Speech Representations with Flexible Hidden Feature Dimensions
Non-parallel many-to-many voice conversion is a kind of style transfer task in speech. Recently, AutoVC has been applied in this field …
Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
PDF
IEEE
VQ-CL: Learning Disentangled Speech Representations with Contrastive Learning and Vector Quantization
Voice conversion (VC) refers to converting the voice characteristics of an utterance to those of another speaker, as if it were spoken by that person. Recently, …
Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
PDF
IEEE
Boosting Star-GANs for Voice Conversion with Contrastive Discriminator
Nonparallel multi-domain voice conversion methods such as the StarGAN-VCs have been widely applied in many scenarios. However, the …
Shijing Si, Jianzong Wang, Xulong Zhang, Xiaoyang Qu, Ning Cheng, Jing Xiao
arXiv
Springer