SAR: Self-Supervised Anti-Distortion Representation for End-To-End Speech Model
In recent Text-to-Speech (TTS) systems, a neural vocoder often generates speech samples by solely conditioning on acoustic features …
Jianzong Wang, Xulong Zhang, Haobin Tang, Aolan Sun, Ning Cheng, Jing Xiao
Cite
arXiv
IEEE
Dynamic Alignment Mask CTC: Improved Mask-CTC with Aligned Cross Entropy
Because of predicting all the target tokens in parallel, the non-autoregressive models greatly improve the decoding efficiency of …
Xulong Zhang, Haobin Tang, Jianzong Wang, Ning Cheng, Jian Luo, Jing Xiao
Cite
arXiv
IEEE
Improving EEG-based Emotion Recognition by Fusing Time-frequency And Spatial Representations
Using deep learning methods to classify EEG signals can accurately identify people’s emotions. However, existing studies have …
Kexin Zhu, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
Cite
Poster
arXiv
IEEE
Improving Music Genre Classification from Multi-modal Properties of Music and Genre Correlations Perspective
Music genre classification has been widely studied in the past few years for its various applications in music information retrieval. …
Ganghui Ru, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
Cite
Poster
arXiv
IEEE
Learning Speech Representations with Flexible Hidden Feature Dimensions
Non-parallel many-to-many voice conversion is a kind of style transfer task in speech. Recently, AutoVC has been applied in this field …
Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
PDF
Cite
IEEE
QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis
Recent expressive text-to-speech (TTS) models focus on synthesizing emotional speech, but some fine-grained styles such as intonation …
Haobin Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
Cite
arXiv
IEEE
VQ-CL: Learning Disentangled Speech Representations with Contrastive Learning and Vector Quantization
Voice Conversion (VC) refers to converting the voice characteristics of audio to another one as it is said by other people. Recently, …
Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
PDF
Cite
IEEE
Feature-Rich Audio Model Inversion for Data-Free Knowledge Distillation Towards General Sound Classification
Data-Free Knowledge Distillation (DFKD) has recently attracted growing attention in the academic community, especially with major …
Zuheng Kang, Yayun He, Jianzong Wang, Junqing Peng, Xiaoyang Qu, Jing Xiao
Cite
arXiv
IEEE
Cross-grained Contrastive Representation for Unsupervised Lesion Segmentation in Medical Images
Ziqi Yu, Botao Zhao, Yipin Zhang, Shengjie Zhang, Xiang Chen, Haibo Yang, Tingying Peng, Xiao-Yong Zhang
Cite
Personalized Federated Learning via Gradient Modulation for Heterogeneous Text Summarization
Text summarization is essential for information aggregation and demands large amounts of training data. However, concerns about data …
Rongfeng Pan, Jianzong Wang, Lingwei Kong, Zhangcheng Huang, Jing Xiao
Cite
arXiv
IEEE