Ning Cheng

Publications

  1. Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning (2024) In ACL2024 (CCF-A)
  2. Enhancing Emotion Prediction and Recognition in Conversation through Fine-Grained Emotional Cue Analysis and Cross-Modal Fusion (2024) In ICIC2024 (CCF-C)
  3. RREH: Reconstruction Relations Embedded Hashing for Semi-Paired Cross-Modal Retrieval (2024) In ICIC2024 (CCF-C)
  4. RSET: Remapping-based Sorting Method for Emotion Transfer Speech Synthesis (2024) In APWeb2024 (CCF-C)
  5. CONTUNER: Singing Voice Beautifying with Pitch and Expressiveness Condition (2024) In IJCNN2024 (CCF-C)
  6. EAD-VC: Enhancing Speech Auto-Disentanglement for Voice Conversion with IFUB Estimator and Joint Text-Guided Consistent Learning (2024) In IJCNN2024 (CCF-C)
  7. EfficientASR: Speech Recognition Network Compression via Attention Redundancy and Chunk-Level FFN Optimization (2024) In IJCNN2024 (CCF-C)
  8. Learning Expressive Disentangled Speech Representations with Soft Speech Units and Adversarial Style Augmentation (2024) In IJCNN2024 (CCF-C)
  9. MAIN-VC: Lightweight Speech Representation Disentanglement for One-Shot Voice Conversion (2024) In IJCNN2024 (CCF-C)
  10. QLSC: A Query Latent Semantic Calibrator for Robust Extractive Question Answering (2024) In IJCNN2024 (CCF-C)
  11. From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning (2024) In NAACL2024 (CCF-B)
  12. Medical Speech Symptoms Classification via Disentangled Representation (2024) In CSCWD2024 (CCF-C)
  13. EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model (2024) In ICASSP2024 (CCF-B)
  14. ED-TTS: Multi-Scale Emotion Modeling Using Cross-Domain Emotion Diarization for Emotional Speech Synthesis (2024) In ICASSP2024 (CCF-B)
  15. Learning Disentangled Speech Representations with Contrastive Learning and Time-Invariant Retrieval (2024) In ICASSP2024 (CCF-B)
  16. Leveraging Biases in Large Language Models: bias-kNN for Effective Few-Shot Learning (2024) In ICASSP2024 (CCF-B)
  17. Research on Audio Model Generation Technology Based on Hierarchical Federated Framework (2024) In CAAI TIT
  18. On the Calibration and Uncertainty with Pólya-Gamma Augmentation for Dialog Retrieval Models (2023) In AAAI2023 (CCF-A)
  19. PMVC: Data Augmentation-Based Prosody Modeling for Expressive Voice Conversion (2023) In MM2023 (CCF-A)
  20. CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation (2023) In SpaCCS2023
  21. CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding (2023) In ISPA2023 (CCF-C)
  22. DQR-TTS: Semi-supervised Text-to-speech Synthesis with Dynamic Quantized Representation (2023) In BDCloud2023
  23. PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter (2023) In EMNLP2023 (CCF-B)
  24. AOSR-Net: All-in-One Sandstorm Removal Network (2023) In ICTAI2023 (CCF-C)
  25. Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval (2023) In ICTAI2023 (CCF-C)
  26. FastGraphTTS: An Ultrafast Syntax-Aware Speech Synthesis Framework (2023) In ICTAI2023 (CCF-C)
  27. DiffTalker: Co-driven audio-image diffusion for talking faces via intermediate landmarks (2023) In arXiv (work in progress)
  28. Machine Unlearning Methodology Based on Stochastic Teacher Network (2023) In ADMA2023 (CCF-C)
  29. Symbolic and Acoustic: Multi-domain Music Emotion Modeling for Instrumental Music (2023) In ADMA2023 (CCF-C)
  30. Voice Conversion with Denoising Diffusion Probabilistic GAN Models (2023) In ADMA2023 (CCF-C)
  31. Boosting Chinese ASR Error Correction with Dynamic Error Scaling Mechanism (2023) In INTERSPEECH2023 (CCF-C)
  32. EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis (2023) In INTERSPEECH2023 (CCF-C)
  33. Investigation of Music Emotion Recognition Based on Segmented Semi-Supervised Learning (2023) In INTERSPEECH2023 (CCF-C)
  34. Prompt Guided Copy Mechanism for Conversational Question Answering (2023) In INTERSPEECH2023 (CCF-C)
  35. SAR: Self-Supervised Anti-Distortion Representation for End-To-End Speech Model (2023) In IJCNN2023 (CCF-C)
  36. Dynamic Alignment Mask CTC: Improved Mask-CTC with Aligned Cross Entropy (2023) In ICASSP2023 (CCF-B)
  37. Improving EEG-based Emotion Recognition by Fusing Time-frequency And Spatial Representations (2023) In ICASSP2023 (CCF-B)
  38. Improving Music Genre Classification from Multi-modal Properties of Music and Genre Correlations Perspective (2023) In ICASSP2023 (CCF-B)
  39. Learning Speech Representations with Flexible Hidden Feature Dimensions (2023) In ICASSP2023 (CCF-B)
  40. QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis (2023) In ICASSP2023 (CCF-B)
  41. VQ-CL: Learning Disentangled Speech Representations with Contrastive Learning and Vector Quantization (2023) In ICASSP2023 (CCF-B)
  42. Adapitch: Adaption Multi-Speaker Text-to-Speech Conditioned on Pitch Disentangling with Untranscribed Data (2022) In MSN2022 (CCF-C)
  43. Improving Imbalanced Text Classification with Dynamic Curriculum Learning (2022) In MSN2022 (CCF-C)
  44. Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach (2022) In MSN2022 (CCF-C)
  45. Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition (2022) In MSN2022 (CCF-C)
  46. MetaSpeech: Speech Effects Switch Along with Environment for Metaverse (2022) In MSN2022 (CCF-C)
  47. Semi-Supervised Learning Based on Reference Model for Low-resource TTS (2022) In MSN2022 (CCF-C)
  48. Shallow Diffusion Motion Model for Talking Face Generation from Speech (2022) In APWeb-WAIM2022 (CCF-C)
  49. Boosting Star-GANs for Voice Conversion with Contrastive Discriminator (2022) In ICONIP2022 (CCF-C)
  50. Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar (2022) In ICTAI2022 (CCF-C)
  51. Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (2022) In INTERSPEECH2022 (CCF-C)
  52. Tiny-Sepformer: A Tiny Time-Domain Transformer Network For Speech Separation (2022) In INTERSPEECH2022 (CCF-C)
  53. Uncertainty Calibration for Deep Audio Classifiers (2022) In INTERSPEECH2022 (CCF-C)
  54. Adaptive Activation Network for Low Resource Multilingual Speech Recognition (2022) In IJCNN2022 (CCF-C)
  55. MDCNN-SID: Multi-scale Dilated Convolution Network for Singer Identification (2022) In IJCNN2022 (CCF-C)
  56. MetaSID: Singer Identification with Domain Adaptation for Metaverse (2022) In IJCNN2022 (CCF-C)
  57. Singer Identification for Metaverse with Timbral and Middle-Level Perceptual Features (2022) In IJCNN2022 (CCF-C)
  58. Speech Augmentation Based Unsupervised Learning for Keyword Spotting (2022) In IJCNN2022 (CCF-C)
  59. SUSing: SU-net for Singing Voice Synthesis (2022) In IJCNN2022 (CCF-C)
  60. TDASS: Target Domain Adaptation Speech Synthesis Framework for Multi-speaker Low-Resource TTS (2022) In IJCNN2022 (CCF-C)
  61. AVQVC: One-Shot Voice Conversion By Vector Quantization With Applying Contrastive Learning (2022) In ICASSP2022 (CCF-B)
  62. DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning (2022) In ICASSP2022 (CCF-B)
  63. nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-Shot Multi-speaker text-to-speech (2022) In ICASSP2022 (CCF-B)
  64. Self-Attention for Incomplete Utterance Rewriting (2022) In ICASSP2022 (CCF-B)
  65. Blur the Linguistic Boundary: Interpreting Chinese Buddhist Sutra in English via Neural Machine Translation (2022) In ICTAI2022 (CCF-C)
  66. Supervised Contrastive Meta-learning for Few-Shot Classification (2022) In HPCC2022 (CCF-C)
  67. VU-BERT: A Unified Framework for Visual Dialog (2022) In ICASSP2022 (CCF-B)
  68. CycleGEAN: Cycle Generative Enhanced Adversarial Network for Voice Conversion (2021) In ASRU2021
  69. Reconstructing Dual Learning for Neural Voice Conversion Using Relatively Few Samples (2021) In ASRU2021
  70. TGAVC: Improving Autoencoder Voice Conversion with Text-Guided and Adversarial Training (2021) In ASRU2021
  71. Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation (2021) In INTERSPEECH2021 (CCF-C)
  72. Speech2Video: Cross-Modal Distillation for Speech to Video Generation (2021) In INTERSPEECH2021 (CCF-C)
  73. Variational Information Bottleneck for Effective Low-Resource Audio Classification (2021) In INTERSPEECH2021 (CCF-C)
  74. A Language Model Based Pseudo-Sample Deliberation for Semi-supervised Speech Recognition (2021) In IJCNN2021 (CCF-C)
  75. CACnet: Cube Attentional CNN for Automatic Speech Recognition (2021) In IJCNN2021 (CCF-C)
  76. Loss Prediction: End-to-End Active Learning Approach For Speech Recognition (2021) In IJCNN2021 (CCF-C)
  77. Transfer Ability of Monolingual Wav2vec2.0 for Low-resource Speech Recognition (2021) In IJCNN2021 (CCF-C)
  78. Cross-Language Transfer Learning and Domain Adaptation for End-to-End Automatic Speech Recognition (2021) In ICME2021 (CCF-B)
  79. LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation (2021) In ICASSP2021 (CCF-B)
  80. Unidirectional Memory-Self-Attention Transducer for Online Speech Recognition (2021) In ICASSP2021 (CCF-B)
  81. End-To-End Silent Speech Recognition with Acoustic Sensing (2021) In SLT2021
  82. GraphPB: Graphical Representations of Prosody Boundary in Speech Synthesis (2021) In SLT2021
  83. MelGlow: Efficient Waveform Generative Network Based On Location-Variable Convolution (2021) In SLT2021
  84. Multi-Quartznet: Multi-Resolution Convolution for Speech Recognition with Multi-Layer Feature Fusion (2021) In SLT2021
  85. A Novel Capsule Aggregation Framework for Natural Language Inference (2021) In APWeb-WAIM2021 (CCF-C)
  86. Joint Intent Detection and Slot Filling Based on Continual Learning Model (2021) In ICASSP2021 (CCF-B)
  87. Self-supervised Learning for Semantic Sentence Matching with Dense Transformer Inference Network (2021) In APWeb-WAIM2021 (CCF-C)
  88. Semantic Embedding Graph Convolutional Networks for Multi-label Video Segment Classification (2021) In PAAP2021
  89. Semantic Extraction for Sentence Representation via Reinforcement Learning (2021) In IJCNN2021 (CCF-C)
  90. A Real-Time Robot-Based Auxiliary System for Risk Evaluation of COVID-19 Infection (2020) In INTERSPEECH2020 (CCF-C)
  91. Large-Scale Transfer Learning for Low-Resource Spoken Language Understanding (2020) In INTERSPEECH2020 (CCF-C)
  92. MLNET: An Adaptive Multiple Receptive-Field Attention Neural Network for Voice Activity Detection (2020) In INTERSPEECH2020 (CCF-C)
  93. Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit (2020) In INTERSPEECH2020 (CCF-C)
  94. AlignTTS: Efficient Feed-Forward Text-to-Speech System Without Explicit Alignment (2020) In ICASSP2020 (CCF-B)
  95. GraphTTS: Graph-to-Sequence Modelling in Neural Text-to-Speech (2020) In ICASSP2020 (CCF-B)
  96. Chinese Punctuation Prediction with Adaptive Attention and Dependency Tree (2020) In CCKS2020
  97. Epidemic Guard: A COVID-19 Detection System for Elderly People (2020) In APWeb-WAIM2020 (CCF-C)

Events