Reconstructing Dual Learning for Neural Voice Conversion Using Relatively Few Samples
This paper introduces a dual learning system for neural voice conversion (DualVC) using relatively few samples based on the symmetry of …
Aolan Sun, Jianzong Wang, Ning Cheng, Methawee Tantrawenith, Zhiyong Wu, Helen Meng, Edward Xiao, Jing Xiao
IEEE
TGAVC: Improving Autoencoder Voice Conversion with Text-Guided and Adversarial Training
Non-parallel many-to-many voice conversion remains an interesting but challenging speech processing task. Recently, AutoVC, a …
Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Zhen Zeng, Edward Xiao, Jing Xiao
arXiv
IEEE
Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation
Predicting the altered acoustic frames is an effective way of self-supervised learning for speech representation. However, it is …
Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao
PDF
arXiv
ISCA
Speech2Video: Cross-Modal Distillation for Speech to Video Generation
This paper investigates a novel task of talking face video generation solely from speeches. The speech-to-video generation technique …
Shijing Si, Jianzong Wang, Xiaoyang Qu, Ning Cheng, Wenqi Wei, Xinghua Zhu, Jing Xiao
PDF
arXiv
ISCA
Variational Information Bottleneck for Effective Low-Resource Audio Classification
Large-scale deep neural networks (DNNs) such as convolutional neural networks (CNNs) have achieved impressive performance in audio …
Shijing Si, Jianzong Wang, Huiming Sun, Jianhan Wu, Chuanyao Zhang, Xiaoyang Qu, Ning Cheng, Lei Chen, Jing Xiao
PDF
arXiv
ISCA
A Language Model Based Pseudo-Sample Deliberation for Semi-supervised Speech Recognition
End-to-end modeling requires tremendous amounts of transcribed speech to achieve an automatic speech recognition (ASR) model with high …
Cheng Yi, Jianzong Wang, Ning Cheng, Shiyu Zhou, Bo Xu
IEEE
CACnet: Cube Attentional CNN for Automatic Speech Recognition
End-to-end models have been widely used in Automatic Speech Recognition (ASR). Convolutional Neural Networks (CNNs) can effectively use …
Nan Zhang, Jianzong Wang, Wenqi Wei, Xiaoyang Qu, Ning Cheng, Jing Xiao
IEEE
Loss Prediction: End-to-End Active Learning Approach For Speech Recognition
End-to-end speech recognition systems usually require huge amounts of labeling resource, while annotating the speech data is …
Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
Transfer Ability of Monolingual Wav2vec2.0 for Low-resource Speech Recognition
Recently, there are several domains that have their own feature extractors, such as ResNet, BERT, and GPT-x, which are widely used for …
Cheng Yi, Jianzong Wang, Ning Cheng, Shiyu Zhou, Bo Xu
IEEE
Cross-Language Transfer Learning and Domain Adaptation for End-to-End Automatic Speech Recognition
In this paper, we demonstrate the efficacy of transfer learning and continuous learning for various automatic speech recognition (ASR) …
Jian Luo, Jianzong Wang, Ning Cheng, Edward Xiao, Jing Xiao, Georg Kucsko, Patrick O’Neill, Jagadeesh Balam, Slyne Deng, Adriana Flores, Boris Ginsburg, Jocelyn Huang, Oleksii Kuchaiev, Vitaly Lavrukhin, Jason Li
IEEE