Jing Xiao

Publications

  1. ACCon: Angle-Compensated Contrastive Regularizer for Deep Regression (2025) In AAAI2025 (CCF-A)
  2. RUNA: Object-level Out-of-Distribution Detection via Regional Uncertainty Alignment of Multimodal Representations (2025) In AAAI2025 (CCF-A)
  3. IDEAW: Robust Neural Audio Watermarking with Invertible Dual-Embedding (2024) In EMNLP2024 (CCF-B)
  4. Enhancing Emotion Prediction and Recognition in Conversation through Fine-Grained Emotional Cue Analysis and Cross-Modal Fusion (2024) In ICIC2024 (CCF-C)
  5. RREH: Reconstruction Relations Embedded Hashing for Semi-Paired Cross-Modal Retrieval (2024) In ICIC2024 (CCF-C)
  6. RSET: Remapping-based Sorting Method for Emotion Transfer Speech Synthesis (2024) In APWeb2024 (CCF-C)
  7. Retrieval-Augmented Audio Deepfake Detection (2024) In ICMR2024 (CCF-B)
  8. CONTUNER: Singing Voice Beautifying with Pitch and Expressiveness Condition (2024) In IJCNN2024 (CCF-C)
  9. EAD-VC: Enhancing Speech Auto-Disentanglement for Voice Conversion with IFUB Estimator and Joint Text-Guided Consistent Learning (2024) In IJCNN2024 (CCF-C)
  10. Efficient Multi-Model Fusion with Adversarial Complementary Representation Learning (2024) In IJCNN2024 (CCF-C)
  11. EfficientASR: Speech Recognition Network Compression via Attention Redundancy and Chunk-Level FFN Optimization (2024) In IJCNN2024 (CCF-C)
  12. Learning Expressive Disentangled Speech Representations with Soft Speech Units and Adversarial Style Augmentation (2024) In IJCNN2024 (CCF-C)
  13. MAIN-VC: Lightweight Speech Representation Disentanglement for One-Shot Voice Conversion (2024) In IJCNN2024 (CCF-C)
  14. QLSC: A Query Latent Semantic Calibrator for Robust Extractive Question Answering (2024) In IJCNN2024 (CCF-C)
  15. Task-Agnostic Decision Transformer for Multi-Type Agent Control with Federated Split Training (2024) In IJCNN2024 (CCF-C)
  16. From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning (2024) In NAACL2024 (CCF-B)
  17. Medical Speech Symptoms Classification via Disentangled Representation (2024) In CSCWD2024 (CCF-C)
  18. Gecko: Resource-Efficient and Accurate Queries in Real-Time Video Streams at the Edge (2024) In INFOCOM2024 (CCF-A)
  19. EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model (2024) In ICASSP2024 (CCF-B)
  20. ED-TTS: Multi-Scale Emotion Modeling Using Cross-Domain Emotion Diarization for Emotional Speech Synthesis (2024) In ICASSP2024 (CCF-B)
  21. Learning Disentangled Speech Representations with Contrastive Learning and Time-Invariant Retrieval (2024) In ICASSP2024 (CCF-B)
  22. INCPrompt: Task-Aware Incremental Prompting for Rehearsal-Free Class-incremental Learning (2024) In ICASSP2024 (CCF-B)
  23. Leveraging Biases in Large Language Models: bias-kNN for Effective Few-Shot Learning (2024) In ICASSP2024 (CCF-B)
  24. P2DT: Mitigating Forgetting in Task-Incremental Learning with Progressive Prompt Decision Transformer (2024) In ICASSP2024 (CCF-B)
  25. Research on Audio Model Generation Technology Based on Hierarchical Federated Framework (2024) In CAAI TIT
  26. Value-Driven Mixed-Precision Quantization for Patch-Based Inference on Microcontrollers (2024) In DATE2024 (CCF-B)
  27. FedET: A Communication-Efficient Federated Class-Incremental Learning Framework Based on Enhanced Transformer (2023) In IJCAI2023 (CCF-A)
  28. GAIA: Delving into Gradient-based Attribution Abnormality for Out-of-distribution Detection (2023) In NeurIPS2023 (CCF-A)
  29. On the Calibration and Uncertainty with Pólya-Gamma Augmentation for Dialog Retrieval Models (2023) In AAAI2023 (CCF-A)
  30. PMVC: Data Augmentation-Based Prosody Modeling for Expressive Voice Conversion (2023) In MM2023 (CCF-A)
  31. Shoggoth: Towards Efficient Edge-Cloud Collaborative Real-Time Video Inference via Adaptive Online Learning (2023) In DAC2023 (CCF-A)
  32. CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation (2023) In SpaCCS2023
  33. CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding (2023) In ISPA2023 (CCF-C)
  34. DQR-TTS: Semi-supervised Text-to-speech Synthesis with Dynamic Quantized Representation (2023) In BDCloud2023
  35. PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter (2023) In EMNLP2023 (CCF-B)
  36. VoiceExtender: Short-utterance Text-independent Speaker Verification with Guided Diffusion Model (2023) In ASRU2023
  37. AOSR-Net: All-in-One Sandstorm Removal Network (2023) In ICTAI2023 (CCF-C)
  38. Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval (2023) In ICTAI2023 (CCF-C)
  39. FastGraphTTS: An Ultrafast Syntax-Aware Speech Synthesis Framework (2023) In ICTAI2023 (CCF-C)
  40. DiffTalker: Co-Driven Audio-Image Diffusion for Talking Faces via Intermediate Landmarks (2023) In arXiv (work in progress)
  41. EdgeMA: Model Adaptation System for Real-Time Video Analytics on Edge Devices (2023) In ICONIP2023 (CCF-C)
  42. Machine Unlearning Methodology Based on Stochastic Teacher Network (2023) In ADMA2023 (CCF-C)
  43. Symbolic and Acoustic: Multi-domain Music Emotion Modeling for Instrumental Music (2023) In ADMA2023 (CCF-C)
  44. Voice Conversion with Denoising Diffusion Probabilistic GAN Models (2023) In ADMA2023 (CCF-C)
  45. Boosting Chinese ASR Error Correction with Dynamic Error Scaling Mechanism (2023) In INTERSPEECH2023 (CCF-C)
  46. EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis (2023) In INTERSPEECH2023 (CCF-C)
  47. Investigation of Music Emotion Recognition Based on Segmented Semi-Supervised Learning (2023) In INTERSPEECH2023 (CCF-C)
  48. Prompt Guided Copy Mechanism for Conversational Question Answering (2023) In INTERSPEECH2023 (CCF-C)
  49. SVVAD: Personal Voice Activity Detection for Speaker Verification (2023) In INTERSPEECH2023 (CCF-C)
  50. SAR: Self-Supervised Anti-Distortion Representation for End-To-End Speech Model (2023) In IJCNN2023 (CCF-C)
  51. Dynamic Alignment Mask CTC: Improved Mask-CTC with Aligned Cross Entropy (2023) In ICASSP2023 (CCF-B)
  52. Improving EEG-based Emotion Recognition by Fusing Time-frequency And Spatial Representations (2023) In ICASSP2023 (CCF-B)
  53. Improving Music Genre Classification from Multi-modal Properties of Music and Genre Correlations Perspective (2023) In ICASSP2023 (CCF-B)
  54. Learning Speech Representations with Flexible Hidden Feature Dimensions (2023) In ICASSP2023 (CCF-B)
  55. QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis (2023) In ICASSP2023 (CCF-B)
  56. VQ-CL: Learning Disentangled Speech Representations with Contrastive Learning and Vector Quantization (2023) In ICASSP2023 (CCF-B)
  57. Feature-Rich Audio Model Inversion for Data-Free Knowledge Distillation Towards General Sound Classification (2023) In ICASSP2023 (CCF-B)
  58. Personalized Federated Learning via Gradient Modulation for Heterogeneous Text Summarization (2023) In IJCNN2023 (CCF-C)
  59. SVLDL: Improved Speaker Age Estimation Using Selective Variance Label Distribution Learning (2022) In SLT2022
  60. Adapitch: Adaption Multi-Speaker Text-to-Speech Conditioned on Pitch Disentangling with Untranscribed Data (2022) In MSN2022 (CCF-C)
  61. Improving Imbalanced Text Classification with Dynamic Curriculum Learning (2022) In MSN2022 (CCF-C)
  62. Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach (2022) In MSN2022 (CCF-C)
  63. Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition (2022) In MSN2022 (CCF-C)
  64. MetaSpeech: Speech Effects Switch Along with Environment for Metaverse (2022) In MSN2022 (CCF-C)
  65. Semi-Supervised Learning Based on Reference Model for Low-resource TTS (2022) In MSN2022 (CCF-C)
  66. Shallow Diffusion Motion Model for Talking Face Generation from Speech (2022) In APWeb-WAIM2022 (CCF-C)
  67. Boosting Star-GANs for Voice Conversion with Contrastive Discriminator (2022) In ICONIP2022 (CCF-C)
  68. Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar (2022) In ICTAI2022 (CCF-C)
  69. SpeechEQ: Speech Emotion Recognition based on Multi-scale Unified Datasets and Multitask Learning (2022) In INTERSPEECH2022 (CCF-C)
  70. Tiny-Sepformer: A Tiny Time-Domain Transformer Network For Speech Separation (2022) In INTERSPEECH2022 (CCF-C)
  71. Uncertainty Calibration for Deep Audio Classifiers (2022) In INTERSPEECH2022 (CCF-C)
  72. Adaptive Activation Network for Low Resource Multilingual Speech Recognition (2022) In IJCNN2022 (CCF-C)
  73. MDCNN-SID: Multi-scale Dilated Convolution Network for Singer Identification (2022) In IJCNN2022 (CCF-C)
  74. MetaSID: Singer Identification with Domain Adaptation for Metaverse (2022) In IJCNN2022 (CCF-C)
  75. Singer Identification for Metaverse with Timbral and Middle-Level Perceptual Features (2022) In IJCNN2022 (CCF-C)
  76. Speech Augmentation Based Unsupervised Learning for Keyword Spotting (2022) In IJCNN2022 (CCF-C)
  77. SUSing: SU-net for Singing Voice Synthesis (2022) In IJCNN2022 (CCF-C)
  78. TDASS: Target Domain Adaptation Speech Synthesis Framework for Multi-speaker Low-Resource TTS (2022) In IJCNN2022 (CCF-C)
  79. AVQVC: One-Shot Voice Conversion By Vector Quantization With Applying Contrastive Learning (2022) In ICASSP2022 (CCF-B)
  80. DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning (2022) In ICASSP2022 (CCF-B)
  81. nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-Shot Multi-speaker Text-to-Speech (2022) In ICASSP2022 (CCF-B)
  82. Self-Attention for Incomplete Utterance Rewriting (2022) In ICASSP2022 (CCF-B)
  83. A Fair Federated Learning Framework With Reinforcement Learning (2022) In IJCNN2022 (CCF-C)
  84. A Nearest Neighbor Under-sampling Strategy for Vertical Federated Learning in Financial Domain (2022) In IH&MMSec2022 (CCF-C)
  85. A Privacy-Preserving Subgraph-Level Federated Graph Neural Network via Differential Privacy (2022) In KSEM2022 (CCF-C)
  86. Adaptive Few-Shot Learning Algorithm for Rare Sound Event Detection (2022) In IJCNN2022 (CCF-C)
  87. Adaptive Sparse and Monotonic Attention for Transformer-based Automatic Speech Recognition (2022) In DSAA2022 (CCF-C)
  88. Augmentation-induced Consistency Regularization for Classification (2022) In IJCNN2022 (CCF-C)
  89. Blur the Linguistic Boundary: Interpreting Chinese Buddhist Sutra in English via Neural Machine Translation (2022) In ICTAI2022 (CCF-C)
  90. Cali3F: Calibrated Fast Fair Federated Recommendation System (2022) In IJCNN2022 (CCF-C)
  91. Debias the Black-Box: A Fair Ranking Framework via Knowledge Distillation (2022) In WISE2022 (CCF-C)
  92. DT-SV: A Transformer-based Time-domain Approach for Speaker Verification (2022) In IJCNN2022 (CCF-C)
  93. Efficient Private Set Intersection Based on Functional Encryption (2022) In ICDIS2022
  94. Federated Non-negative Matrix Factorization for Short Texts Topic Modeling with Mutual Information (2022) In IJCNN2022 (CCF-C)
  95. Federated Split BERT for Heterogeneous Text Classification (2022) In IJCNN2022 (CCF-C)
  96. Improving Human Image Synthesis with Residual Fast Fourier Transformation and Wasserstein Distance (2022) In IJCNN2022 (CCF-C)
  97. Learning Invariant Representation and Risk Minimized for Unsupervised Accent Domain Adaptation (2022) In SLT2022
  98. Leveraging Causal Inference for Explainable Automatic Program Repair (2022) In IJCNN2022 (CCF-C)
  99. Machine Unlearning Method Based On Projection Residual (2022) In DSAA2022 (CCF-C)
  100. Micro-Expression Recognition Based on Attribute Information Embedding and Cross-modal Contrastive Learning (2022) In IJCNN2022 (CCF-C)
  101. QSpeech: Low-Qubit Quantum Speech Application Toolkit (2022) In IJCNN2022 (CCF-C)
  102. r-G2P: Evaluating and Enhancing Robustness of Grapheme to Phoneme Conversion by Controlled Noise Introducing and Contextual Information Incorporation (2022) In ICASSP2022 (CCF-B)
  103. RL-MD: A Novel Reinforcement Learning Approach for DNA Motif Discovery (2022) In DSAA2022 (CCF-C)
  104. Supervised Contrastive Meta-learning for Few-Shot Classification (2022) In HPCC2022 (CCF-C)
  105. Towards Speaker Age Estimation With Label Distribution Learning (2022) In ICASSP2022 (CCF-B)
  106. VU-BERT: A Unified Framework for Visual Dialog (2022) In ICASSP2022 (CCF-B)
  107. zkMLaaS: A Verifiable Scheme for Machine Learning as a Service (2022) In GLOBECOM2022 (CCF-C)
  108. CycleGEAN: Cycle Generative Enhanced Adversarial Network for Voice Conversion (2021) In ASRU2021
  109. Reconstructing Dual Learning for Neural Voice Conversion Using Relatively Few Samples (2021) In ASRU2021
  110. TGAVC: Improving Autoencoder Voice Conversion with Text-Guided and Adversarial Training (2021) In ASRU2021
  111. Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation (2021) In INTERSPEECH2021 (CCF-C)
  112. Speech2Video: Cross-Modal Distillation for Speech to Video Generation (2021) In INTERSPEECH2021 (CCF-C)
  113. Variational Information Bottleneck for Effective Low-Resource Audio Classification (2021) In INTERSPEECH2021 (CCF-C)
  114. CACnet: Cube Attentional CNN for Automatic Speech Recognition (2021) In IJCNN2021 (CCF-C)
  115. Loss Prediction: End-to-End Active Learning Approach For Speech Recognition (2021) In IJCNN2021 (CCF-C)
  116. Cross-Language Transfer Learning and Domain Adaptation for End-to-End Automatic Speech Recognition (2021) In ICME2021 (CCF-B)
  117. LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation (2021) In ICASSP2021 (CCF-B)
  118. Unidirectional Memory-Self-Attention Transducer for Online Speech Recognition (2021) In ICASSP2021 (CCF-B)
  119. End-To-End Silent Speech Recognition with Acoustic Sensing (2021) In SLT2021
  120. GraphPB: Graphical Representations of Prosody Boundary in Speech Synthesis (2021) In SLT2021
  121. MelGlow: Efficient Waveform Generative Network Based On Location-Variable Convolution (2021) In SLT2021
  122. Multi-Quartznet: Multi-Resolution Convolution for Speech Recognition with Multi-Layer Feature Fusion (2021) In SLT2021
  123. A Competition of Shape and Texture Bias by Multi-view Image Representation (2021) In PRCV2021 (CCF-C)
  124. A Novel Capsule Aggregation Framework for Natural Language Inference (2021) In APWeb-WAIM2021 (CCF-C)
  125. A Quantitative Metric for Privacy Leakage in Federated Learning (2021) In ICASSP2021 (CCF-B)
  126. Anomaly Removal for Vehicle Energy Consumption in Federated Learning (2021) In IJCNN2021 (CCF-C)
  127. Automatic Joint Optimization of Algorithm-Level Compression and Compiler-Based Acceleration with Reinforcement Learning for DNN in Edge Devices (2021) In IJCNN2021 (CCF-C)
  128. Case Study of Few-Shot Learning in Text Recognition Models (2021) In WISE2021 (CCF-C)
  129. Communication-Memory-Efficient Decentralized Learning For Audio Representation (2021) In IJCNN2021 (CCF-C)
  130. Contrastive Learning for improving End-to-end Speaker Verification (2021) In IJCNN2021 (CCF-C)
  131. Diversified Point Cloud Classification Using Personalized Federated Learning (2021) In IJCNN2021 (CCF-C)
  132. Effective Phase Encoding for End-To-End Speaker Verification (2021) In INTERSPEECH2021 (CCF-C) (Best Student Paper Award)
  133. Efficient Client Contribution Evaluation for Horizontal Federated Learning (2021) In ICASSP2021 (CCF-B)
  134. Enhancing Data-Free Adversarial Distillation with Activation Regularization and Virtual Interpolation (2021) In ICASSP2021 (CCF-B)
  135. Enhancing Neural Architecture Search by Upgrading Weak Components (2021) In IJCNN2021 (CCF-C)
  136. Federated Learning with Dynamic Transformer for Text to Speech (2021) In INTERSPEECH2021 (CCF-C)
  137. ICSpk: Interpretable Complex Speaker Embedding Extractor from Raw Waveform (2021) In INTERSPEECH2021 (CCF-C)
  138. Joint Intent Detection and Slot Filling Based on Continual Learning Model (2021) In ICASSP2021 (CCF-B)
  139. Modeling Without Sharing Privacy: Federated Neural Machine Translation (2021) In WISE2021 (CCF-C)
  140. Neural Architecture Search as Self-assessor in Semi-supervised Learning (2021) In BigData2021 (CCF-C)
  141. Quantum Convolutional Neural Network on Protein Distance Prediction (2021) In IJCNN2021 (CCF-C)
  142. Self-supervised Learning for Semantic Sentence Matching with Dense Transformer Inference Network (2021) In APWeb-WAIM2021 (CCF-C)
  143. Semantic Embedding Graph Convolutional Networks for Multi-label Video Segment Classification (2021) In PAAP2021
  144. Semantic Extraction for Sentence Representation via Reinforcement Learning (2021) In IJCNN2021 (CCF-C)
  145. When Hearing the Voice, Who Will Come to Your Mind (2021) In IJCNN2021 (CCF-C)
  146. A Real-Time Robot-Based Auxiliary System for Risk Evaluation of COVID-19 Infection (2020) In INTERSPEECH2020 (CCF-C)
  147. Large-Scale Transfer Learning for Low-Resource Spoken Language Understanding (2020) In INTERSPEECH2020 (CCF-C)
  148. MLNET: An Adaptive Multiple Receptive-Field Attention Neural Network for Voice Activity Detection (2020) In INTERSPEECH2020 (CCF-C)
  149. Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit (2020) In INTERSPEECH2020 (CCF-C)
  150. AlignTTS: Efficient Feed-Forward Text-to-Speech System Without Explicit Alignment (2020) In ICASSP2020 (CCF-B)
  151. GraphTTS: Graph-to-Sequence Modelling in Neural Text-to-Speech (2020) In ICASSP2020 (CCF-B)
  152. 3D Point Cloud Segmentation for Complex Structure Based on PointSIFT (2020) In PRCV2020 (CCF-C)
  153. A Robust Speaker Clustering Method Based on Discrete Tied Variational Autoencoder (2020) In ICASSP2020 (CCF-B)
  154. An Approach for Neural Machine Translation with Graph Attention Network (2020) In ICCPR2020
  155. Chinese Punctuation Prediction with Adaptive Attention and Dependency Tree (2020) In CCKS2020
  156. D-GHNAS for Joint Intent Classification and Slot Filling (2020) In APWeb-WAIM2020 (CCF-C)
  157. Empirical Studies of Institutional Federated Learning For Natural Language Processing (2020) In EMNLP2020 (CCF-B)
  158. Epidemic Guard: A COVID-19 Detection System for Elderly People (2020) In APWeb-WAIM2020 (CCF-C)
  159. Evolutionary Algorithm Enhanced Neural Architecture Search for Text-Independent Speaker Verification (2020) In INTERSPEECH2020 (CCF-C)
  160. FedSmart: An Auto Updating Federated Learning Optimization Mechanism (2020) In APWeb-WAIM2020 (CCF-C)
  161. IDRiD: Diabetic Retinopathy - Segmentation and Grading Challenge (2020) In MedIA2020 (IF=10.9)
  162. Image Compressed Sensing Using Neural Architecture Search (2020) In BigData2020
  163. Multi-objective Cuckoo Algorithm for Mobile Devices Network Architecture Search (2020) In ICANN2020 (CCF-C)
  164. Network Coding for Federated Learning Systems (2020) In ICONIP2020 (CCF-C)
  165. ParallelNAS: A Parallel and Distributed System for Neural Architecture Search (2020) In HPCC2020 (CCF-C)
  166. Quantization and Knowledge Distillation for Efficient Federated Learning on Edge Devices (2020) In HPCC2020 (CCF-C)
  167. A Deep Learning-Based Method for Vehicle License Plate Recognition in Natural Scene (2019) In APSIPA Transactions on Signal and Information Processing
  168. Composer4Everyone: Automatic Music Generation with Audio Motif (2019) In MIPR2019
  169. Dynamic Student Classification on Memory Networks for Knowledge Tracing (2019) In PAKDD2019
  170. Federated Learning of Unsegmented Chinese Text Recognition Model (2019) In ICTAI2019 (CCF-C)
  171. On Probability Calibration of Recurrent Text Recognition Network (2019) In ICONIP2019 (CCF-C)
  172. Performance of Training Sparse Deep Neural Networks on GPUs (2019) In HPEC2019
  173. Who Should Be Invited to My Party: A Size-Constrained k-Core Problem in Social Networks (2019) In JCST2019
  174. A Noise-Robust Self-Adaptive Multitarget Speaker Detection System (2018) In ICPR2018 (CCF-C)
  175. Automated Full Quantification of Left Ventricle with Deep Neural Networks (2018) In STACOM2018
  176. Social Network Monitoring for Bursty Cascade Detection (2018) In TKDD2018 (CCF-B)
  177. Video-Based Pig Recognition with Feature-Integrated Transfer Learning (2018) In CCBR2018
  178. Assistance of Speech Recognition in Noisy Environment with Sentence Level Lip-Reading (2017) In CCBR2017
  179. Prioritized Grid Highway Long Short-Term Memory-Based Universal Background Model for Speaker Verification (2017) In CCBR2017
  180. User Identity Linkage by Latent User Space Modelling (2016) In SIGKDD2016 (CCF-A)

Events