Ning Cheng
Publications
- Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning (2024), In ACL2024 (CCF-A)
- Enhancing Emotion Prediction and Recognition in Conversation through Fine-Grained Emotional Cue Analysis and Cross-Modal Fusion (2024), In ICIC2024 (CCF-C)
- RREH: Reconstruction Relations Embedded Hashing for Semi-Paired Cross-Modal Retrieval (2024), In ICIC2024 (CCF-C)
- RSET: Remapping-based Sorting Method for Emotion Transfer Speech Synthesis (2024), In APWeb2024 (CCF-C)
- CONTUNER: Singing Voice Beautifying with Pitch and Expressiveness Condition (2024), In IJCNN2024 (CCF-C)
- EAD-VC: Enhancing Speech Auto-Disentanglement for Voice Conversion with IFUB Estimator and Joint Text-Guided Consistent Learning (2024), In IJCNN2024 (CCF-C)
- EfficientASR: Speech Recognition Network Compression via Attention Redundancy and Chunk-Level FFN Optimization (2024), In IJCNN2024 (CCF-C)
- Learning Expressive Disentangled Speech Representations with Soft Speech Units and Adversarial Style Augmentation (2024), In IJCNN2024 (CCF-C)
- MAIN-VC: Lightweight Speech Representation Disentanglement for One-Shot Voice Conversion (2024), In IJCNN2024 (CCF-C)
- QLSC: A Query Latent Semantic Calibrator for Robust Extractive Question Answering (2024), In IJCNN2024 (CCF-C)
- From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning (2024), In NAACL2024 (CCF-B)
- Medical Speech Symptoms Classification via Disentangled Representation (2024), In CSCWD2024 (CCF-C)
- EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model (2024), In ICASSP2024 (CCF-B)
- ED-TTS: Multi-Scale Emotion Modeling Using Cross-Domain Emotion Diarization for Emotional Speech Synthesis (2024), In ICASSP2024 (CCF-B)
- Learning Disentangled Speech Representations with Contrastive Learning and Time-Invariant Retrieval (2024), In ICASSP2024 (CCF-B)
- Leveraging Biases in Large Language Models: bias-kNN for Effective Few-Shot Learning (2024), In ICASSP2024 (CCF-B)
- Research on Audio Model Generation Technology Based on Hierarchical Federated Framework (2024), In CAAI TIT
- On the Calibration and Uncertainty with Pólya-Gamma Augmentation for Dialog Retrieval Models (2023), In AAAI2023 (CCF-A)
- PMVC: Data Augmentation-Based Prosody Modeling for Expressive Voice Conversion (2023), In MM2023 (CCF-A)
- CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation (2023), In SpaCCS2023
- CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding (2023), In ISPA2023 (CCF-C)
- DQR-TTS: Semi-supervised Text-to-speech Synthesis with Dynamic Quantized Representation (2023), In BDCloud2023
- PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter (2023), In EMNLP2023 (CCF-B)
- AOSR-Net: All-in-One Sandstorm Removal Network (2023), In ICTAI2023 (CCF-C)
- Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval (2023), In ICTAI2023 (CCF-C)
- FastGraphTTS: An Ultrafast Syntax-Aware Speech Synthesis Framework (2023), In ICTAI2023 (CCF-C)
- DiffTalker: Co-driven audio-image diffusion for talking faces via intermediate landmarks (2023), In arXiv (work in progress)
- Machine Unlearning Methodology Based on Stochastic Teacher Network (2023), In ADMA2023 (CCF-C)
- Symbolic and Acoustic: Multi-domain Music Emotion Modeling for Instrumental Music (2023), In ADMA2023 (CCF-C)
- Voice Conversion with Denoising Diffusion Probabilistic GAN Models (2023), In ADMA2023 (CCF-C)
- Boosting Chinese ASR Error Correction with Dynamic Error Scaling Mechanism (2023), In INTERSPEECH2023 (CCF-C)
- EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis (2023), In INTERSPEECH2023 (CCF-C)
- Investigation of Music Emotion Recognition Based on Segmented Semi-Supervised Learning (2023), In INTERSPEECH2023 (CCF-C)
- Prompt Guided Copy Mechanism for Conversational Question Answering (2023), In INTERSPEECH2023 (CCF-C)
- SAR: Self-Supervised Anti-Distortion Representation for End-To-End Speech Model (2023), In IJCNN2023 (CCF-C)
- Dynamic Alignment Mask CTC: Improved Mask-CTC with Aligned Cross Entropy (2023), In ICASSP2023 (CCF-B)
- Improving EEG-based Emotion Recognition by Fusing Time-frequency And Spatial Representations (2023), In ICASSP2023 (CCF-B)
- Improving Music Genre Classification from Multi-modal Properties of Music and Genre Correlations Perspective (2023), In ICASSP2023 (CCF-B)
- Learning Speech Representations with Flexible Hidden Feature Dimensions (2023), In ICASSP2023 (CCF-B)
- QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis (2023), In ICASSP2023 (CCF-B)
- VQ-CL: Learning Disentangled Speech Representations with Contrastive Learning and Vector Quantization (2023), In ICASSP2023 (CCF-B)
- Adapitch: Adaption Multi-Speaker Text-to-Speech Conditioned on Pitch Disentangling with Untranscribed Data (2022), In MSN2022 (CCF-C)
- Improving Imbalanced Text Classification with Dynamic Curriculum Learning (2022), In MSN2022 (CCF-C)
- Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach (2022), In MSN2022 (CCF-C)
- Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition (2022), In MSN2022 (CCF-C)
- MetaSpeech: Speech Effects Switch Along with Environment for Metaverse (2022), In MSN2022 (CCF-C)
- Semi-Supervised Learning Based on Reference Model for Low-resource TTS (2022), In MSN2022 (CCF-C)
- Shallow Diffusion Motion Model for Talking Face Generation from Speech (2022), In APWeb-WAIM2022 (CCF-C)
- Boosting Star-GANs for Voice Conversion with Contrastive Discriminator (2022), In ICONIP2022 (CCF-C)
- Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar (2022), In ICTAI2022 (CCF-C)
- Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (2022), In INTERSPEECH2022 (CCF-C)
- Tiny-Sepformer: A Tiny Time-Domain Transformer Network For Speech Separation (2022), In INTERSPEECH2022 (CCF-C)
- Uncertainty Calibration for Deep Audio Classifiers (2022), In INTERSPEECH2022 (CCF-C)
- Adaptive Activation Network for Low Resource Multilingual Speech Recognition (2022), In IJCNN2022 (CCF-C)
- MDCNN-SID: Multi-scale Dilated Convolution Network for Singer Identification (2022), In IJCNN2022 (CCF-C)
- MetaSID: Singer Identification with Domain Adaptation for Metaverse (2022), In IJCNN2022 (CCF-C)
- Singer Identification for Metaverse with Timbral and Middle-Level Perceptual Features (2022), In IJCNN2022 (CCF-C)
- Speech Augmentation Based Unsupervised Learning for Keyword Spotting (2022), In IJCNN2022 (CCF-C)
- SUSing: SU-net for Singing Voice Synthesis (2022), In IJCNN2022 (CCF-C)
- TDASS: Target Domain Adaptation Speech Synthesis Framework for Multi-speaker Low-Resource TTS (2022), In IJCNN2022 (CCF-C)
- AVQVC: One-Shot Voice Conversion By Vector Quantization With Applying Contrastive Learning (2022), In ICASSP2022 (CCF-B)
- DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning (2022), In ICASSP2022 (CCF-B)
- nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-Shot Multi-speaker Text-to-Speech (2022), In ICASSP2022 (CCF-B)
- Self-Attention for Incomplete Utterance Rewriting (2022), In ICASSP2022 (CCF-B)
- Blur the Linguistic Boundary: Interpreting Chinese Buddhist Sutra in English via Neural Machine Translation (2022), In ICTAI2022 (CCF-C)
- Supervised Contrastive Meta-learning for Few-Shot Classification (2022), In HPCC2022 (CCF-C)
- VU-BERT: A Unified Framework for Visual Dialog (2022), In ICASSP2022 (CCF-B)
- CycleGEAN: Cycle Generative Enhanced Adversarial Network for Voice Conversion (2021), In ASRU2021
- Reconstructing Dual Learning for Neural Voice Conversion Using Relatively Few Samples (2021), In ASRU2021
- TGAVC: Improving Autoencoder Voice Conversion with Text-Guided and Adversarial Training (2021), In ASRU2021
- Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation (2021), In INTERSPEECH2021 (CCF-C)
- Speech2Video: Cross-Modal Distillation for Speech to Video Generation (2021), In INTERSPEECH2021 (CCF-C)
- Variational Information Bottleneck for Effective Low-Resource Audio Classification (2021), In INTERSPEECH2021 (CCF-C)
- A Language Model Based Pseudo-Sample Deliberation for Semi-supervised Speech Recognition (2021), In IJCNN2021 (CCF-C)
- CACnet: Cube Attentional CNN for Automatic Speech Recognition (2021), In IJCNN2021 (CCF-C)
- Loss Prediction: End-to-End Active Learning Approach For Speech Recognition (2021), In IJCNN2021 (CCF-C)
- Transfer Ability of Monolingual Wav2vec2.0 for Low-resource Speech Recognition (2021), In IJCNN2021 (CCF-C)
- Cross-Language Transfer Learning and Domain Adaptation for End-to-End Automatic Speech Recognition (2021), In ICME2021 (CCF-B)
- LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation (2021), In ICASSP2021 (CCF-B)
- Unidirectional Memory-Self-Attention Transducer for Online Speech Recognition (2021), In ICASSP2021 (CCF-B)
- End-To-End Silent Speech Recognition with Acoustic Sensing (2021), In SLT2021
- GraphPB: Graphical Representations of Prosody Boundary in Speech Synthesis (2021), In SLT2021
- MelGlow: Efficient Waveform Generative Network Based On Location-Variable Convolution (2021), In SLT2021
- Multi-Quartznet: Multi-Resolution Convolution for Speech Recognition with Multi-Layer Feature Fusion (2021), In SLT2021
- A Novel Capsule Aggregation Framework for Natural Language Inference (2021), In APWeb-WAIM2021 (CCF-C)
- Joint Intent Detection and Slot Filling Based on Continual Learning Model (2021), In ICASSP2021 (CCF-B)
- Self-supervised Learning for Semantic Sentence Matching with Dense Transformer Inference Network (2021), In APWeb-WAIM2021 (CCF-C)
- Semantic Embedding Graph Convolutional Networks for Multi-label Video Segment Classification (2021), In PAAP2021
- Semantic Extraction for Sentence Representation via Reinforcement Learning (2021), In IJCNN2021 (CCF-C)
- A Real-Time Robot-Based Auxiliary System for Risk Evaluation of COVID-19 Infection (2020), In INTERSPEECH2020 (CCF-C)
- Large-Scale Transfer Learning for Low-Resource Spoken Language Understanding (2020), In INTERSPEECH2020 (CCF-C)
- MLNET: An Adaptive Multiple Receptive-Field Attention Neural Network for Voice Activity Detection (2020), In INTERSPEECH2020 (CCF-C)
- Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit (2020), In INTERSPEECH2020 (CCF-C)
- AlignTTS: Efficient Feed-Forward Text-to-Speech System Without Explicit Alignment (2020), In ICASSP2020 (CCF-B)
- GraphTTS: Graph-to-Sequence Modelling in Neural Text-to-Speech (2020), In ICASSP2020 (CCF-B)
- Chinese Punctuation Prediction with Adaptive Attention and Dependency Tree (2020), In CCKS2020
- Epidemic Guard: A COVID-19 Detection System for Elderly People (2020), In APWeb-WAIM2020 (CCF-C)
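Chinese Journal Articles
- A Survey of Visual Enhancement Techniques for Sand-Dust Images (2025), In Big Data Research, 11(01) (CCF-T2)
- Research on Audio Model Generation Technology Based on Hierarchical Federated Framework (2024), In CAAI Transactions on Intelligent Systems (CCF-T2, PKU Core Journal)
- Multi-Feature Fusion Dehazing Based on Generative Adversarial Networks (2024), In Big Data Research, 10(04) (CCF-T2)
- A Survey of Emotional Speech Synthesis (2024), In Big Data Research, 10(05) (CCF-T2)
- A Survey of Digital Talking Face Generation Techniques (2024), In Big Data Research, 10(05) (CCF-T2)
- A Survey of Fairness in Federated Learning (2024), In Big Data Research, 10(01) (CCF-T2)
- A Survey of Voice Conversion Techniques for Non-Parallel Corpora (2024), In Big Data Research, 10(03) (CCF-T2)
- A Survey of Expressive Speech Synthesis (2023), In Big Data Research, 9(06) (CCF-T2)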