Audio
Large-Scale Transfer Learning for Low-Resource Spoken Language Understanding
End-to-end Spoken Language Understanding (SLU) models are becoming increasingly large and complex to achieve state-of-the-art accuracy. …
Xueli Jia, Jianzong Wang, Zhiyong Zhang, Ning Cheng, Jing Xiao
PDF
Cite
arXiv
ISCA
MLNET: An Adaptive Multiple Receptive-Field Attention Neural Network for Voice Activity Detection
Voice activity detection (VAD) distinguishes speech from non-speech, and its performance is of crucial importance for …
Zhenpeng Zheng, Jianzong Wang, Ning Cheng, Jian Luo, Jing Xiao
PDF
Cite
arXiv
ISCA
Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit
Recent neural speech synthesis systems have gradually focused on the control of prosody to improve the quality of synthesized speech, …
Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao
PDF
Cite
arXiv
ISCA
Research on Singing Voice Detection Based on a Long-Term Recurrent Convolutional Network with Vocal Separation and Temporal Smoothing
Singing voice detection or vocal detection is a classification task that determines whether a given audio segment contains singing …
Xulong Zhang, Yi Yu, Yongwei Gao, Xi Chen, Wei Li
Cite
Electronics
AlignTTS: Efficient Feed-Forward Text-to-Speech System Without Explicit Alignment
Targeting both high efficiency and performance, we propose AlignTTS to predict the mel-spectrum in parallel. AlignTTS is based on a …
Zhen Zeng, Jianzong Wang, Ning Cheng, Tian Xia, Jing Xiao
Cite
arXiv
IEEE
GraphTTS: Graph-to-Sequence Modelling in Neural Text-to-Speech
This paper leverages the graph-to-sequence method in neural text-to-speech (GraphTTS), which maps the graph embedding of the input …
Aolan Sun, Jianzong Wang, Ning Cheng, Huayi Peng, Zhen Zeng, Jing Xiao
Cite
arXiv
IEEE
A Robust Speaker Clustering Method Based on Discrete Tied Variational Autoencoder
Chen Feng, Jianzong Wang, Tongxu Li, Junqing Peng, Jing Xiao
Cite
IEEE
Evolutionary Algorithm Enhanced Neural Architecture Search for Text-Independent Speaker Verification
State-of-the-art speaker verification models are based on deep learning techniques, which heavily depend on the hand-designed neural …
Xiaoyang Qu, Jianzong Wang, Jing Xiao
PDF
Cite
arXiv
ISCA
Singing Voice Detection Using Multi-Feature Deep Fusion with CNN
The problem of singing voice detection is to segment a song into vocal and non-vocal parts. Commonly used methods usually train a model …
Xulong Zhang, Shengchen Li, Zijin Li, Shizhe Chen, Yongwei Gao, Wei Li
PDF
Cite
Springer
Transfer Learning for Music Classification and Regression Tasks Using Artist Tags
In this paper, a transfer learning method that exploits artist tags for general-purpose music feature vector extraction is presented. …
Lei Wang, Hongning Zhu, Xulong Zhang, Shengchen Li, Wei Li
Cite
Springer