Speech
Tiny-Sepformer: A Tiny Time-Domain Transformer Network For Speech Separation
Time-domain Transformer neural networks have proven their superiority in speech separation tasks. However, these models usually have a …
Jian Luo, Jianzong Wang, Ning Cheng, Edward Xiao, Xulong Zhang, Jing Xiao
PDF
Cite
arXiv
ISCA
Uncertainty Calibration for Deep Audio Classifiers
Although deep neural networks (DNNs) have achieved tremendous success in audio classification tasks, their uncertainty calibration is …
Tong Ye, Shijing Si, Jianzong Wang, Ning Cheng, Jing Xiao
PDF
Cite
ISCA
Speech Augmentation Based Unsupervised Learning for Keyword Spotting
In this paper, we investigated a speech-augmentation-based unsupervised learning approach for the keyword spotting (KWS) task. KWS is a …
Jian Luo, Jianzong Wang, Ning Cheng, Haobin Tang, Jing Xiao
Cite
arXiv
IEEE
DT-SV: A Transformer-based Time-domain Approach for Speaker Verification
Speaker verification (SV) aims to determine whether the speaker’s identity of a test utterance is the same as the reference …
Nan Zhang, Jianzong Wang, Zhenhou Hong, Chendong Zhao, Xiaoyang Qu, Jing Xiao
Cite
arXiv
IEEE
QSpeech: Low-Qubit Quantum Speech Application Toolkit
Quantum devices with low qubits are common in the Noisy Intermediate-Scale Quantum (NISQ) era. However, Quantum Neural Network (QNN) …
Zhenhou Hong, Jianzong Wang, Xiaoyang Qu, Chendong Zhao, Wei Tao, Jing Xiao
Cite
arXiv
IEEE
Towards Speaker Age Estimation With Label Distribution Learning
Existing methods for speaker age estimation usually treat it as a multi-class classification or a regression problem. However, precise …
Shijing Si, Jianzong Wang, Junqing Peng, Jing Xiao
Cite
arXiv
IEEE
Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation
Predicting the altered acoustic frames is an effective way of self-supervised learning for speech representation. However, it is …
Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao
PDF
Cite
arXiv
ISCA
Variational Information Bottleneck for Effective Low-Resource Audio Classification
Large-scale deep neural networks (DNNs) such as convolutional neural networks (CNNs) have achieved impressive performance in audio …
Shijing Si, Jianzong Wang, Huiming Sun, Jianhan Wu, Chuanyao Zhang, Xiaoyang Qu, Ning Cheng, Lei Chen, Jing Xiao
PDF
Cite
arXiv
ISCA
End-To-End Silent Speech Recognition with Acoustic Sensing
Silent speech interfaces (SSI) have been an exciting area of recent interest. In this paper, we present a non-invasive silent speech …
Jian Luo, Jianzong Wang, Ning Cheng, Guilin Jiang, Jing Xiao
Cite
arXiv
IEEE
Communication-Memory-Efficient Decentralized Learning For Audio Representation
Smartphones and wearable devices produce a wealth of audio data, which cannot be accumulated in a centralized repository for learning …
Leilai Li, Jianzong Wang, Xiaoyang Qu, Jing Xiao
Cite
IEEE