Transfer Ability of Monolingual Wav2vec2.0 for Low-resource Speech Recognition
Recently, there are several domains that have their own feature extractors, such as ResNet, BERT, and GPT-x, which are widely used for …
Cheng Yi, Jianzong Wang, Ning Cheng, Shiyu Zhou, Bo Xu
Cite
IEEE
Cross-Language Transfer Learning and Domain Adaptation for End-to-End Automatic Speech Recognition
In this paper, we demonstrate the efficacy of transfer learning and continuous learning for various automatic speech recognition (ASR) …
Jian Luo, Jianzong Wang, Ning Cheng, Edward Xiao, Jing Xiao, Georg Kucsko, Patrick O’Neill, Jagadeesh Balam, Slyne Deng, Adriana Flores, Boris Ginsburg, Jocelyn Huang, Oleksii Kuchaiev, Vitaly Lavrukhin, Jason Li
Cite
IEEE
A Survey of Graph Neural Networks (图神经网络综述)
Jianzong Wang, Lingwei Kong, Zhangcheng Huang, Jing Xiao
Last updated on Aug 21, 2025
Cite
Link
LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation
In this paper, we propose a novel conditional convolution network, named location-variable convolution, to model the dependencies of …
Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao
Cite
arXiv
IEEE
Singer Identification Using Deep Timbre Feature Learning with KNN-NET
In this paper, we study the issue of automatic singer identification (SID) in popular music recordings, which aims to recognize who …
Xulong Zhang, Jiale Qian, Yi Yu, Yifu Sun, Wei Li
Cite
Code
Dataset
arXiv
IEEE
Unidirectional Memory-Self-Attention Transducer for Online Speech Recognition
Self-attention models have been successfully applied in end-to-end speech recognition systems, which greatly improve the performance of …
Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao
Cite
arXiv
IEEE
End-To-End Silent Speech Recognition with Acoustic Sensing
Silent speech interfaces (SSI) have been an exciting area of recent interest. In this paper, we present a non-invasive silent speech …
Jian Luo, Jianzong Wang, Ning Cheng, Guilin Jiang, Jing Xiao
Cite
arXiv
IEEE
GraphPB: Graphical Representations of Prosody Boundary in Speech Synthesis
This paper introduces a graphical representation approach of prosody boundary (GraphPB) in the task of Chinese speech synthesis, …
Aolan Sun, Jianzong Wang, Ning Cheng, Huayi Peng, Zhen Zeng, Lingwei Kong, Jing Xiao
Cite
arXiv
IEEE
MelGlow: Efficient Waveform Generative Network Based On Location-Variable Convolution
Recent neural vocoders usually use a WaveNet-like network to capture the long-term dependencies of the waveform, but a large number of …
Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao
Cite
arXiv
IEEE
Multi-Quartznet: Multi-Resolution Convolution for Speech Recognition with Multi-Layer Feature Fusion
In this paper, we propose an end-to-end speech recognition network based on Nvidia’s previous QuartzNet [1] model. We try to …
Jian Luo, Jianzong Wang, Ning Cheng, Guilin Jiang, Jing Xiao
Cite
arXiv
IEEE