CONTUNER: Singing Voice Beautifying with Pitch and Expressiveness Condition
Singing voice beautifying is a novel task that has application value in people’s daily life, aiming to correct the pitch of the …
Jianzong Wang, Pengcheng Li, Xulong Zhang, Ning Cheng, Jing Xiao
arXiv
DEMO
IEEE
EAD-VC: Enhancing Speech Auto-Disentanglement for Voice Conversion with IFUB Estimator and Joint Text-Guided Consistent Learning
Using unsupervised learning to disentangle speech into content, rhythm, pitch, and timbre for voice conversion has become a hot …
Ziqi Liang, Jianzong Wang, Xulong Zhang, Yong Zhang, Ning Cheng, Jing Xiao
arXiv
DEMO
IEEE
Efficient Multi-Model Fusion with Adversarial Complementary Representation Learning
Single-model systems often suffer from deficiencies in tasks such as speaker verification (SV) and image classification, relying …
Zuheng Kang, Yayun He, Jianzong Wang, Junqing Peng, Jing Xiao
arXiv
IEEE
EfficientASR: Speech Recognition Network Compression via Attention Redundancy and Chunk-Level FFN Optimization
In recent years, Transformer networks have shown remarkable performance in speech recognition tasks. However, their deployment poses …
Jianzong Wang, Ziqi Liang, Xulong Zhang, Ning Cheng, Jing Xiao
arXiv
IEEE
Enhancing Anomalous Sound Detection with Multi-Level Memory Bank
Abnormal sound detection (ASD) is crucial for the timely detection of machine faults in industrial scenarios and has emerged as a …
Baoping Deng, Jinggang Chen, Zhenhou Hong, Xiaoyang Qu, Guokuan Li, Jiguang Wan, Changsheng Xie, Jianzong Wang
IEEE
Learning Expressive Disentangled Speech Representations with Soft Speech Units and Adversarial Style Augmentation
Voice conversion is the task to transform voice characteristics of source speech while preserving content information. Nowadays, …
Yimin Deng, Jianzong Wang, Xulong Zhang, Ning Cheng, Jing Xiao
arXiv
IEEE
MAIN-VC: Lightweight Speech Representation Disentanglement for One-Shot Voice Conversion
One-shot voice conversion aims to change the timbre of any source speech to match that of the unseen target speaker with only one …
Pengcheng Li, Jianzong Wang, Xulong Zhang, Yong Zhang, Jing Xiao, Ning Cheng
arXiv
DEMO
IEEE
PRENet: A Plane-Fit Redundancy Encoding Point Cloud Sequence Network for Real-Time 3D Action Recognition
Recognizing human actions from point cloud sequence has attracted tremendous attention from both academia and industry due to its wide …
Shenglin He, Xiaoyang Qu, Jiguang Wan, Guokuan Li, Changsheng Xie, Jianzong Wang
arXiv
IEEE
QLSC: A Query Latent Semantic Calibrator for Robust Extractive Question Answering
Extractive Question Answering (EQA) in Machine Reading Comprehension (MRC) often faces the challenge of dealing with semantically …
Sheng Ouyang, Jianzong Wang, Yong Zhang, Zhitao Li, Ziqi Liang, Xulong Zhang, Ning Cheng, Jing Xiao
arXiv
IEEE
Task-Agnostic Decision Transformer for Multi-Type Agent Control with Federated Split Training
With the rapid advancements in artificial intelligence, the development of knowledgeable and personalized agents has become …
Zhiyuan Wang, Bokui Chen, Xiaoyang Qu, Zhenhou Hong, Jing Xiao, Jianzong Wang
arXiv
IEEE