CycleGEAN: Cycle Generative Enhanced Adversarial Network for Voice Conversion
Cycle Generative Adversarial Network (CycleGAN) for voice conversion (VC) task only used discriminators to identify whether the input …
Xulong Zhang, Jianzong Wang, Ning Cheng, Edward Xiao, Jing Xiao
PDF
IEEE
Reconstructing Dual Learning for Neural Voice Conversion Using Relatively Few Samples
This paper introduces a dual learning system for neural voice conversion (DualVC) using relatively few samples based on the symmetry of …
Aolan Sun, Jianzong Wang, Ning Cheng, Methawee Tantrawenith, Zhiyong Wu, Helen Meng, Edward Xiao, Jing Xiao
IEEE
TGAVC: Improving Autoencoder Voice Conversion with Text-Guided and Adversarial Training
Non-parallel many-to-many voice conversion remains an interesting but challenging speech processing task. Recently, AutoVC, a …
Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Zhen Zeng, Edward Xiao, Jing Xiao
arXiv
IEEE
Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation
Predicting the altered acoustic frames is an effective way of self-supervised learning for speech representation. However, it is …
Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao
PDF
arXiv
ISCA
Speech2Video: Cross-Modal Distillation for Speech to Video Generation
This paper investigates a novel task of talking face video generation solely from speeches. The speech-to-video generation technique …
Shijing Si, Jianzong Wang, Xiaoyang Qu, Ning Cheng, Wenqi Wei, Xinghua Zhu, Jing Xiao
PDF
arXiv
ISCA
Variational Information Bottleneck for Effective Low-Resource Audio Classification
Large-scale deep neural networks (DNNs) such as convolutional neural networks (CNNs) have achieved impressive performance in audio …
Shijing Si, Jianzong Wang, Huiming Sun, Jianhan Wu, Chuanyao Zhang, Xiaoyang Qu, Ning Cheng, Lei Chen, Jing Xiao
PDF
arXiv
ISCA
A Language Model Based Pseudo-Sample Deliberation for Semi-supervised Speech Recognition
End-to-end modeling requires tremendous amounts of transcribed speech to achieve an automatic speech recognition (ASR) model with high …
Cheng Yi, Jianzong Wang, Ning Cheng, Shiyu Zhou, Bo Xu
IEEE
CACnet: Cube Attentional CNN for Automatic Speech Recognition
End-to-end models have been widely used in Automatic Speech Recognition (ASR). Convolutional Neural Networks (CNNs) can effectively use …
Nan Zhang, Jianzong Wang, Wenqi Wei, Xiaoyang Qu, Ning Cheng, Jing Xiao
IEEE
Loss Prediction: End-to-End Active Learning Approach For Speech Recognition
End-to-end speech recognition systems usually require huge amounts of labeling resource, while annotating the speech data is …
Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
Transfer Ability of Monolingual Wav2vec2.0 for Low-resource Speech Recognition
Recently, there are several domains that have their own feature extractors, such as ResNet, BERT, and GPT-x, which are widely used for …
Cheng Yi, Jianzong Wang, Ning Cheng, Shiyu Zhou, Bo Xu
IEEE