Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion
One-shot voice conversion (VC) with only a single target-speaker speech for reference has become a new research direction. Existing …
SiCheng Yang, Methawee Tantrawenith, Haolin Zhuang, Zhiyong Wu, Aolan Sun, Jianzong Wang, Ning Cheng, Huaizhen Tang, Xintao Zhao, Jie Wang, Helen Meng
arXiv
ISCA
SpeechEQ: Speech Emotion Recognition based on Multi-scale Unified Datasets and Multitask Learning
Speech emotion recognition (SER) has many challenges, but one of the main challenges is that each framework does not have a unified …
Zuheng Kang, Junqing Peng, Jianzong Wang, Jing Xiao
arXiv
ISCA
Tiny-Sepformer: A Tiny Time-Domain Transformer Network For Speech Separation
Time-domain Transformer neural networks have proven their superiority in speech separation tasks. However, these models usually have a …
Jian Luo, Jianzong Wang, Ning Cheng, Edward Xiao, Xulong Zhang, Jing Xiao
arXiv
ISCA
Uncertainty Calibration for Deep Audio Classifiers
Although deep neural networks (DNNs) have achieved tremendous success in audio classification tasks, their uncertainty calibration is …
Tong Ye, Shijing Si, Jianzong Wang, Ning Cheng, Jing Xiao
ISCA
Investigation of Singing Voice Separation for Singing Voice Detection in Polyphonic Music
Singing voice detection (SVD), to recognize vocal parts in the song, is an essential task in music information retrieval (MIR). The …
Yifu Sun, Xulong Zhang, Xi Chen, Yi Yu, Wei Li
arXiv
Springer
Adaptive Activation Network for Low Resource Multilingual Speech Recognition
Low resource automatic speech recognition (ASR) is a useful but thorny task, since deep learning ASR models usually need huge amounts …
Jian Luo, Jianzong Wang, Ning Cheng, Zhenpeng Zheng, Jing Xiao
arXiv
IEEE
MDCNN-SID: Multi-scale Dilated Convolution Network for Singer Identification
Most singer identification methods are processed in the frequency domain, which potentially leads to information loss during the …
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
MetaSID: Singer Identification with Domain Adaptation for Metaverse
Metaverse has stretched the real world into unlimited space. There will be more live concerts in Metaverse. The task of singer …
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
Singer Identification for Metaverse with Timbral and Middle-Level Perceptual Features
Metaverse is an interactive world that combines reality and virtuality, where participants can be virtual avatars. Anyone can hold a …
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
Speech Augmentation Based Unsupervised Learning for Keyword Spotting
In this paper, we investigated a speech augmentation based unsupervised learning approach for the keyword spotting (KWS) task. KWS is a …
Jian Luo, Jianzong Wang, Ning Cheng, Haobin Tang, Jing Xiao
arXiv
IEEE