ASR
EfficientASR: Speech Recognition Network Compression via Attention Redundancy and Chunk-Level FFN Optimization
In recent years, Transformer networks have shown remarkable performance in speech recognition tasks. However, their deployment poses …
Jianzong Wang, Ziqi Liang, Xulong Zhang, Ning Cheng, Jing Xiao
Cite
arXiv
Boosting Chinese ASR Error Correction with Dynamic Error Scaling Mechanism
Chinese Automatic Speech Recognition (ASR) error correction presents significant challenges due to the Chinese language’s unique …
Jiaxin Fan, Yong Zhang, Hanzhang Li, Jianzong Wang, Zhitao Li, Sheng Ouyang, Ning Cheng, Jing Xiao
PDF
Cite
arXiv
ISCA
Dynamic Alignment Mask CTC: Improved Mask-CTC with Aligned Cross Entropy
Because of predicting all the target tokens in parallel, the non-autoregressive models greatly improve the decoding efficiency of …
Xulong Zhang, Haobin Tang, Jianzong Wang, Ning Cheng, Jian Luo, Jing Xiao
Cite
arXiv
IEEE
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach
Recovering the masked speech frames is widely applied in speech representation learning. However, most of these models use random …
Xulong Zhang, Jianzong Wang, Ning Cheng, Kexin Zhu, Jing Xiao
Cite
arXiv
IEEE
Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition
The recent emergence of joint CTC-Attention model shows significant improvement in automatic speech recognition (ASR). The improvement …
Xulong Zhang, Jianzong Wang, Ning Cheng, Mengyuan Zhao, Zhiyong Zhang, Jing Xiao
Cite
arXiv
IEEE
Adaptive Activation Network for Low Resource Multilingual Speech Recognition
Low resource automatic speech recognition (ASR) is a useful but thorny task, since deep learning ASR models usually need huge amounts …
Jian Luo, Jianzong Wang, Ning Cheng, Zhenpeng Zheng, Jing Xiao
Cite
arXiv
IEEE
Adaptive Sparse and Monotonic Attention for Transformer-based Automatic Speech Recognition
The Transformer architecture model, based on self-attention and multi-head attention, has achieved remarkable success in offline …
Chendong Zhao, Jianzong Wang, Wenqi Wei, Xiaoyang Qu, Haoqian Wang, Jing Xiao
Cite
arXiv
IEEE
A Language Model Based Pseudo-Sample Deliberation for Semi-supervised Speech Recognition
End-to-end modeling requires tremendous amounts of transcribed speech to achieve an automatic speech recognition (ASR) model with high …
Cheng Yi, Jianzong Wang, Ning Cheng, Shiyu Zhou, Bo Xu
Cite
IEEE
CACnet: Cube Attentional CNN for Automatic Speech Recognition
End-to-end models have been widely used in Automatic Speech Recognition (ASR). Convolutional Neural Networks (CNNs) can effectively use …
Nan Zhang, Jianzong Wang, Wenqi Wei, Xiaoyang Qu, Ning Cheng, Jing Xiao
Cite
IEEE
Loss Prediction: End-to-End Active Learning Approach For Speech Recognition
End-to-end speech recognition systems usually require huge amounts of labeling resource, while annotating the speech data is …
Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao
Cite
arXiv
IEEE