Audio
GraphPB: Graphical Representations of Prosody Boundary in Speech Synthesis
This paper introduces a graphical representation approach of prosody boundary (GraphPB) in the task of Chinese speech synthesis, …
Aolan Sun, Jianzong Wang, Ning Cheng, Huayi Peng, Zhen Zeng, Lingwei Kong, Jing Xiao
arXiv
IEEE
MelGlow: Efficient Waveform Generative Network Based On Location-Variable Convolution
Recent neural vocoders usually use a WaveNet-like network to capture the long-term dependencies of the waveform, but a large number of …
Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
Multi-Quartznet: Multi-Resolution Convolution for Speech Recognition with Multi-Layer Feature Fusion
In this paper, we propose an end-to-end speech recognition network based on Nvidia’s previous QuartzNet [1] model. We try to …
Jian Luo, Jianzong Wang, Ning Cheng, Guilin Jiang, Jing Xiao
arXiv
IEEE
Communication-Memory-Efficient Decentralized Learning For Audio Representation
Smartphones and wearable devices produce a wealth of audio data, which cannot be accumulated in a centralized repository for learning …
Leilai Li, Jianzong Wang, Xiaoyang Qu, Jing Xiao
IEEE
Contrastive Learning for improving End-to-end Speaker Verification
Speaker verification involves examining the speech signal to authenticate the claim of a speaker as true or false. Deep neural networks …
Yanxi Tang, Jianzong Wang, Xiaoyang Qu, Jing Xiao
IEEE
Effective Phase Encoding for End-To-End Speaker Verification
Junyi Peng, Xiaoyang Qu, Rongzhi Gu, Jianzong Wang, Jing Xiao, Lukás Burget, Jan Cernocký
ISCA
Best Student Paper Award
Federated Learning with Dynamic Transformer for Text to Speech
Zhenhou Hong, Jianzong Wang, Xiaoyang Qu, Jie Liu, Chendong Zhao, Jing Xiao
arXiv
ISCA
ICSpk: Interpretable Complex Speaker Embedding Extractor from Raw Waveform
Junyi Peng, Xiaoyang Qu, Jianzong Wang, Rongzhi Gu, Jing Xiao, Lukás Burget, Jan Cernocký
ISCA
When Hearing the Voice, Who Will Come to Your Mind
Speech is a carrier containing rich biological information, such as speaker identity information including age, gender, race. In this …
Zhenhou Hong, Jianzong Wang, Wenqi Wei, Jie Liu, Xiaoyang Qu, Bo Chen, Zihang Wei, Jing Xiao
IEEE
A Real-Time Robot-Based Auxiliary System for Risk Evaluation of COVID-19 Infection
In this paper, we propose a real-time robot-based auxiliary system for risk evaluation of COVID-19 infection. It combines real-time …
Wenqi Wei, Jianzong Wang, Jiteng Ma, Ning Cheng, Jing Xiao
ISCA