Audio
nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-Shot Multi-speaker text-to-speech
Multi-speaker text-to-speech (TTS) with only a small amount of adaptation data is a challenge in practical applications. To address that, we propose a …
Botao Zhao, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
Cite
arXiv
IEEE
Adaptive Sparse and Monotonic Attention for Transformer-based Automatic Speech Recognition
The Transformer architecture model, based on self-attention and multi-head attention, has achieved remarkable success in offline …
Chendong Zhao, Jianzong Wang, Wenqi Wei, Xiaoyang Qu, Haoqian Wang, Jing Xiao
Cite
arXiv
IEEE
DT-SV: A Transformer-based Time-domain Approach for Speaker Verification
Speaker verification (SV) aims to determine whether the speaker’s identity of a test utterance is the same as the reference …
Nan Zhang, Jianzong Wang, Zhenhou Hong, Chendong Zhao, Xiaoyang Qu, Jing Xiao
Cite
arXiv
IEEE
Learning Invariant Representation and Risk Minimized for Unsupervised Accent Domain Adaptation
Unsupervised representation learning for speech audio has attained impressive performance on speech recognition tasks, particularly when …
Chendong Zhao, Jianzong Wang, Xiaoyang Qu, Haoqian Wang, Jing Xiao
Cite
QSpeech: Low-Qubit Quantum Speech Application Toolkit
Quantum devices with few qubits are common in the Noisy Intermediate-Scale Quantum (NISQ) era. However, Quantum Neural Network (QNN) …
Zhenhou Hong, Jianzong Wang, Xiaoyang Qu, Chendong Zhao, Wei Tao, Jing Xiao
Cite
arXiv
IEEE
r-G2P: Evaluating and Enhancing Robustness of Grapheme to Phoneme Conversion by Controlled Noise Introducing and Contextual Information Incorporation
Chendong Zhao, Jianzong Wang, Xiaoyang Qu, Haoqian Wang, Jing Xiao
Cite
Towards Speaker Age Estimation With Label Distribution Learning
Existing methods for speaker age estimation usually treat it as a multi-class classification or a regression problem. However, precise …
Shijing Si, Jianzong Wang, Junqing Peng, Jing Xiao
Cite
arXiv
IEEE
CycleGEAN: Cycle Generative Enhanced Adversarial Network for Voice Conversion
Cycle Generative Adversarial Network (CycleGAN) for the voice conversion (VC) task only uses discriminators to identify whether the input …
Xulong Zhang, Jianzong Wang, Ning Cheng, Edward Xiao, Jing Xiao
PDF
Cite
IEEE
Reconstructing Dual Learning for Neural Voice Conversion Using Relatively Few Samples
This paper introduces a dual learning system for neural voice conversion (DualVC) using relatively few samples based on the symmetry of …
Aolan Sun, Jianzong Wang, Ning Cheng, Methawee Tantrawenith, Zhiyong Wu, Helen Meng, Edward Xiao, Jing Xiao
Cite
IEEE
TGAVC: Improving Autoencoder Voice Conversion with Text-Guided and Adversarial Training
Non-parallel many-to-many voice conversion remains an interesting but challenging speech processing task. Recently, AutoVC, a …
Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Zhen Zeng, Edward Xiao, Jing Xiao
Cite
arXiv
IEEE