CACnet: Cube Attentional CNN for Automatic Speech Recognition

End-to-end models have been widely used in Automatic Speech Recognition (ASR). Convolutional Neural Networks (CNNs) can effectively use …

Nan Zhang, Jianzong Wang, Wenqi Wei, Xiaoyang Qu, Ning Cheng, Jing Xiao

Loss Prediction: End-to-End Active Learning Approach For Speech Recognition

End-to-end speech recognition systems usually require huge amounts of labeling resources, while annotating the speech data is …

Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao

Transfer Ability of Monolingual Wav2vec2.0 for Low-resource Speech Recognition

Recently, there are several domains that have their own feature extractors, such as ResNet, BERT, and GPT-x, which are widely used for …

Cheng Yi, Jianzong Wang, Ning Cheng, Shiyu Zhou, Bo Xu

A Survey of Graph Neural Networks (图神经网络综述)

Jianzong Wang, Lingwei Kong, Zhangcheng Huang, Jing Xiao

Last updated on Aug 21, 2025

LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation

In this paper, we propose a novel conditional convolution network, named location-variable convolution, to model the dependencies of …

Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao

Singer Identification Using Deep Timbre Feature Learning with KNN-NET

In this paper, we study the issue of automatic singer identification (SID) in popular music recordings, which aims to recognize who …

Xulong Zhang, Jiale Qian, Yi Yu, Yifu Sun, Wei Li

Unidirectional Memory-Self-Attention Transducer for Online Speech Recognition

Self-attention models have been successfully applied in end-to-end speech recognition systems, which greatly improve the performance of …

Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao

End-To-End Silent Speech Recognition with Acoustic Sensing

Silent speech interfaces (SSI) have been an exciting area of recent interest. In this paper, we present a non-invasive silent speech …

Jian Luo, Jianzong Wang, Ning Cheng, Guilin Jiang, Jing Xiao

GraphPB: Graphical Representations of Prosody Boundary in Speech Synthesis

This paper introduces a graphical representation approach of prosody boundary (GraphPB) in the task of Chinese speech synthesis, …

Aolan Sun, Jianzong Wang, Ning Cheng, Huayi Peng, Zhen Zeng, Lingwei Kong, Jing Xiao