MetaSID: Singer Identification with Domain Adaptation for Metaverse
The Metaverse has stretched the real world into unlimited space. There will be more live concerts in the Metaverse. The task of singer …
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
Singer Identification for Metaverse with Timbral and Middle-Level Perceptual Features
The Metaverse is an interactive world that combines reality and virtuality, where participants can be virtual avatars. Anyone can hold a …
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
Speech Augmentation Based Unsupervised Learning for Keyword Spotting
In this paper, we investigate a speech-augmentation-based unsupervised learning approach for the keyword spotting (KWS) task. KWS is a …
Jian Luo, Jianzong Wang, Ning Cheng, Haobin Tang, Jing Xiao
arXiv
IEEE
SUSing: SU-net for Singing Voice Synthesis
Singing voice synthesis is a generative task that involves multi-dimensional control of the singing model, including lyrics, pitch, and …
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
TDASS: Target Domain Adaptation Speech Synthesis Framework for Multi-speaker Low-Resource TTS
Recently, synthesizing personalized speech with text-to-speech (TTS) applications has been in high demand. But the previous TTS models require …
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
AVQVC: One-Shot Voice Conversion By Vector Quantization With Applying Contrastive Learning
Voice conversion (VC) refers to changing the timbre of speech while retaining the discourse content. Recently, many works have focused …
Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning
The any-to-any voice conversion problem aims to convert voices between source and target speakers that are not in the training data. Previous …
Qiqi Wang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
Slides
arXiv
IEEE
nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-Shot Multi-speaker text-to-speech
Multi-speaker text-to-speech (TTS) using only a small amount of adaptation data is a challenge in practical applications. To address that, we propose a …
Botao Zhao, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
Self-Attention for Incomplete Utterance Rewriting
Incomplete utterance rewriting (IUR) has recently become an essential task in NLP, aiming to complement the incomplete utterance with …
Yong Zhang, Zhitao Li, Jianzong Wang, Ning Cheng, Jing Xiao
arXiv
IEEE
A Fair Federated Learning Framework With Reinforcement Learning
Federated learning (FL) is a paradigm where many clients collaboratively train a model under the coordination of a central server, …
Yaqi Sun, Shijing Si, Jianzong Wang, Yuhan Dong, Zhitao Zhu, Jing Xiao
arXiv
IEEE