Feature-Rich Audio Model Inversion for Data-Free Knowledge Distillation Towards General Sound Classification
Data-Free Knowledge Distillation (DFKD) has recently attracted growing attention in the academic community, especially with major …
Zuheng Kang, Yayun He, Jianzong Wang, Junqing Peng, Xiaoyang Qu, Jing Xiao
arXiv
IEEE
Cross-grained Contrastive Representation for Unsupervised Lesion Segmentation in Medical Images
Ziqi Yu, Botao Zhao, Yipin Zhang, Shengjie Zhang, Xiang Chen, Haibo Yang, Tingying Peng, Xiao-Yong Zhang
Personalized Federated Learning via Gradient Modulation for Heterogeneous Text Summarization
Text summarization is essential for information aggregation and demands large amounts of training data. However, concerns about data …
Rongfeng Pan, Jianzong Wang, Lingwei Kong, Zhangcheng Huang, Jing Xiao
arXiv
IEEE
SVLDL: Improved Speaker Age Estimation Using Selective Variance Label Distribution Learning
Estimating age from a single speech is a classic and challenging topic. Although Label Distribution Learning (LDL) can represent …
Zuheng Kang, Jianzong Wang, Junqing Peng, Jing Xiao
arXiv
IEEE
Adapitch: Adaption Multi-Speaker Text-to-Speech Conditioned on Pitch Disentangling with Untranscribed Data
In this paper, we proposed Adapitch, a multi-speaker TTS method that makes adaptation of the supervised module with untranscribed data. …
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
Improving Imbalanced Text Classification with Dynamic Curriculum Learning
Recent advances in pre-trained language models have improved the performance for text classification tasks. However, little attention …
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach
Recovering the masked speech frames is widely applied in speech representation learning. However, most of these models use random …
Xulong Zhang, Jianzong Wang, Ning Cheng, Kexin Zhu, Jing Xiao
arXiv
IEEE
Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition
The recent emergence of joint CTC-Attention model shows significant improvement in automatic speech recognition (ASR). The improvement …
Xulong Zhang, Jianzong Wang, Ning Cheng, Mengyuan Zhao, Zhiyong Zhang, Jing Xiao
arXiv
IEEE
MetaSpeech: Speech Effects Switch Along with Environment for Metaverse
Metaverse expands the physical world to a new dimension, and the physical environment and Metaverse environment can be directly …
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
Semi-Supervised Learning Based on Reference Model for Low-resource TTS
Most previous neural text-to-speech (TTS) methods are mainly based on supervised learning methods, which means they depend on a large …
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE