News

[01/02/2024] $\bullet$ Great news! We are excited to announce that our latest research submission to CSCWD 2024 has been accepted. The 2024 27th International Conference on Computer Supported Cooperative Work in Design (CSCWD 2024) serves as a platform for researchers and practitioners across diverse domains to present their findings and discuss crucial issues. The conference’s scope encompasses the research and development of collaborative technologies and their applications to the design of processes, products, systems, and services across various industries and societies. The accepted work is titled “Medical Speech Symptoms Classification via Disentangled Representation.” This contribution reflects our commitment to advancing collaboration technologies, exploring innovative methods, and addressing key challenges in fields such as human-computer interaction, business process management, collaborative virtual environments, enterprise modeling, security and privacy, and the social aspects and human factors of collaboration and design. We look forward to participating in CSCWD 2024 and contributing to the vibrant discussions and advancements in the field of computer-supported cooperative work in design.

[24/01/2024] $\bullet$ Exciting News: Our Paper on Hierarchical Federated Framework for Audio Model Generation Technology Accepted by CAAI Transactions on Intelligent Systems. We are thrilled to announce that our research paper, titled “Research on Audio Model Generation Technology Based on Hierarchical Federated Framework,” has been accepted for publication in the prestigious journal CAAI Transactions on Intelligent Systems, which is currently scheduling the publication date. Our study centers on audio models, exploring next-generation audio generation techniques. The primary objective is to construct a federated audio model training framework that facilitates audio representation learning on a massively scaled audio dataset, providing efficient and robust solutions for various downstream audio tasks. We eagerly anticipate the publication of our paper in CAAI Transactions on Intelligent Systems and look forward to sharing our findings with the broader scientific community.
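For a rough intuition of what "hierarchical federated" training can mean, here is a toy two-level parameter-averaging sketch in Python (clients aggregated at edge nodes, edge results aggregated in the cloud). The structure and function names are our own assumptions for exposition, not the framework proposed in the paper.

```python
# Toy hierarchical FedAvg: client -> edge -> cloud aggregation.
# Illustrative only; not the paper's actual framework.
from typing import Dict, List
import numpy as np

Weights = Dict[str, np.ndarray]  # parameter name -> parameter array

def average(models: List[Weights]) -> Weights:
    """Uniform parameter averaging (FedAvg with equal client weights)."""
    return {k: np.mean([m[k] for m in models], axis=0) for k in models[0]}

def hierarchical_round(clusters: List[List[Weights]]) -> Weights:
    """One aggregation round over clients grouped by edge node."""
    edge_models = [average(cluster) for cluster in clusters]  # edge-level FedAvg
    return average(edge_models)                               # cloud-level FedAvg
```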

[13/12/2023] $\bullet$ Breaking news: We are delighted to announce that our team has six papers accepted by ICASSP 2024, according to a preliminary list of accepted papers. ICASSP is the top conference in the field of speech and signal processing, and we congratulate our team for their outstanding achievements at ICASSP. For more details, please refer to the official acceptance notification.

[10/12/2023] $\bullet$ Jianzong Wang, the Honorary Director of the Laboratory, has been awarded the Outstanding Reviewer Award at the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023). This prestigious award recognizes his excellent contributions to the community through high-quality and efficient reviews of paper submissions for the conference program. EMNLP 2023 is one of the leading conferences in the field of natural language processing, attracting researchers from all over the world to present and discuss their latest findings and innovations. The Outstanding Reviewer Award is given to reviewers who have demonstrated the highest standards of rigor, relevance, and constructive feedback in their reviews. Jianzong Wang is among the few reviewers selected for this honor, which reflects his expertise, dedication, and professionalism in advancing scientific communication. We congratulate Jianzong Wang on this remarkable achievement and thank him for his valuable service to the community.

[01/12/2023] $\bullet$ We are thrilled to share the fantastic news that our latest paper, titled “Gecko: Resource-Efficient and Accurate Queries in Real-Time Video Streams at the Edge,” has been accepted into the technical program of the prestigious IEEE INFOCOM 2024 conference. This achievement not only underscores the dedication and hard work invested in our research but also highlights the significance of our findings in real-time video stream analysis at the edge. The conference’s acceptance rate stands at an impressive 19%, further emphasizing the caliber and innovation of our work. We extend our heartfelt gratitude to everyone involved in the development of this paper and look forward to presenting and sharing our insights with the global community of researchers and professionals at IEEE INFOCOM 2024.

[08/11/2023] $\bullet$ Breaking News: Design, Automation and Test in Europe (DATE) 2024, a CCF-B conference in computer architecture, has just announced its paper acceptance results. DATE is recognized as one of the four leading conferences in the field of Electronic Design Automation (EDA). The selection process was exceptionally competitive: 996 valid research paper submissions were received this year, and the Technical Program Committee reviewed them meticulously, producing roughly 4,000 reviews and engaging in in-depth discussions over almost two weeks before convening at the TPC meeting to finalize decisions. Ultimately, only 25% of all submissions were accepted as regular papers. We are thrilled to announce that our team has had a Regular Paper accepted, titled “Value-Driven Mixed-Precision Quantization for Patch-Based Inference on Microcontrollers.” This acknowledgment highlights the team’s significant contribution to the ever-evolving landscape of computer architecture. The DATE conference remains a key platform for showcasing cutting-edge technological advancements in the EDA domain.

[18/10/2023] $\bullet$ In the just-released acceptance notifications for the 21st IEEE International Symposium on Parallel and Distributed Processing with Applications (IEEE ISPA 2023), we are pleased to announce the acceptance of three research papers:

1. “CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding” presents a novel method for generating talking faces. It takes an audio signal and a reference person image and synthesizes photorealistic videos with controllable head poses and proper eye blinking. The method employs a GAN-based architecture to extract eye-blink features from the audio and a reference video, followed by contrastive training to embed them into identity and pose features, yielding realistic talking-face images.

2. “DQR-TTS: Semi-supervised Text-to-speech Synthesis with Dynamic Quantized Representation” presents a semi-supervised text-to-speech synthesis model that leverages both paired and unpaired data. Its key component is a dynamic quantized representation module integrated into a sequential autoencoder: quantized representations are learned from paired data, and, because paired data is limited, unpaired data is used to expand the codebook. The model’s innovation lies in its ability to cover a wide range of phonemes in low-resource scenarios.

3. “CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation” addresses voice conversion challenges through augmented negative-sample selection. It introduces hard negative samples via a speaker fusion module to strengthen the learning of speaker encoders, and emphasizes fine-grained style modeling by using a reference encoder to extract style and applying augmented contrastive learning to global style (see the sketch below).
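As referenced in the third item, here is a minimal PyTorch sketch of an InfoNCE-style contrastive objective with extra hard negatives, in the spirit of the CLN-VC description; the function name, tensor shapes, and temperature are illustrative assumptions, and the speaker-fusion construction of hard negatives is simplified to a pre-computed tensor rather than the paper's module.

```python
# Illustrative contrastive loss with augmented hard negatives.
# Not the authors' implementation; shapes and temperature are assumptions.
import torch
import torch.nn.functional as F

def contrastive_loss(anchor, positive, negatives, hard_negatives, tau=0.07):
    """anchor/positive: (N, D); negatives, hard_negatives: (N, K, D)."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    # Augment the negative pool with the hard negatives (e.g., fused speakers).
    negs = F.normalize(torch.cat([negatives, hard_negatives], dim=1), dim=-1)

    pos_sim = (a * p).sum(-1, keepdim=True) / tau         # (N, 1)
    neg_sim = torch.einsum("nd,nkd->nk", a, negs) / tau   # (N, 2K)
    logits = torch.cat([pos_sim, neg_sim], dim=1)         # positive is class 0
    return F.cross_entropy(logits, torch.zeros(len(a), dtype=torch.long))
```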

[08/10/2023] $\bullet$ EMNLP 2023 Accepts Groundbreaking Research on Large Language Models. We are thrilled to announce that our team’s research on large language models (LLMs) has been accepted for presentation at the main conference of EMNLP 2023, marking a significant milestone in our large language model research. The paper, titled “PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter,” addresses the challenges of integrating LLMs into the Retrieval Question Answering (ReQA) task, which employs a retrieval-augmented framework consisting of a retriever and a generator. In ReQA, generators formulate answers based on documents retrieved by the retriever. While LLMs offer advanced QA capabilities, they are often too large to fine-tune within budget constraints, and some are accessible only via APIs. To overcome these challenges and enhance ReQA performance, our team proposes the trainable Pluggable Reward-Driven Contextual Adapter (PRCA), which treats the generator as a black box. Positioned between the retriever and the generator in a pluggable manner, PRCA refines the retrieved information with a token-autoregressive strategy, maximizing rewards during a reinforcement learning phase. Our experiments demonstrate PRCA’s effectiveness in improving ReQA performance across three datasets, with gains of up to 20%. This approach allows black-box LLMs to be integrated seamlessly into existing frameworks, showcasing its potential in the era of large language models.
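The entry above describes where PRCA sits in the pipeline; below is a minimal, schematic Python sketch of that retriever-adapter-generator flow. All callables (`retrieve`, `adapter`, `black_box_llm`) are hypothetical placeholders for exposition, not the authors' code; the point is that only the adapter is trainable (e.g., via reinforcement learning on answer-quality rewards), while the retriever and generator stay frozen.

```python
# Schematic PRCA-style ReQA pipeline; placeholder names, not the paper's API.
from typing import Callable, List

def reqa_answer(
    question: str,
    retrieve: Callable[[str], List[str]],       # frozen retriever
    adapter: Callable[[str, List[str]], str],   # trainable adapter: distills context
    black_box_llm: Callable[[str], str],        # generator reached only via an API
) -> str:
    """Answer a question with a pluggable adapter between retriever and LLM."""
    docs = retrieve(question)                   # 1. retrieve candidate passages
    context = adapter(question, docs)           # 2. adapter refines/condenses them
    prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
    return black_box_llm(prompt)                # 3. black-box generator answers
```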

[22/09/2023] $\bullet$ Breaking News: Speaker Verification Research Accepted at ASRU 2023. We are delighted to have our latest work accepted at ASRU 2023, one of the most prestigious conferences in the field of Automatic Speech Recognition and Understanding. The paper, titled “VoiceExtender: Short-utterance Text-independent Speaker Verification with Guided Diffusion Model,” presents a solution for enhancing Speaker Verification (SV) performance, particularly for short-duration speech signals. Extensive experiments on the VoxCeleb1 dataset show that, compared to the baseline, VoiceExtender delivers substantial relative improvements in Equal Error Rate (EER).

[22/09/2023] $\bullet$ NeurIPS 2023 News Flash: Paper Accepted - “GAIA: Delving into Gradient-based Attribution Abnormality for Out-of-distribution Detection”. We are thrilled to announce that our latest research paper has been accepted for presentation at NeurIPS 2023. NeurIPS, a prestigious Class-A international conference, accepted only 26.1% of its 12,343 submissions, making this achievement truly remarkable. In our paper, we present a novel perspective on quantifying disparities between in-distribution (ID) and out-of-distribution (OOD) data by examining the uncertainty that arises when models attempt to explain their predictive decisions. Our motivation stems from the observation that gradient-based attribution methods face challenges when assigning feature importance to OOD data, resulting in significantly divergent explanation patterns. Consequently, we explore how attribution gradients lead to uncertain explanation outcomes and introduce two types of abnormalities for OOD detection: the zero-deflation abnormality and the channel-wise average abnormality. Building on these, we propose GAIA, a straightforward and effective approach that incorporates Gradient Abnormality Inspection and Aggregation. Remarkably, GAIA can be applied to pre-trained models without additional fine-tuning or training. Our experimental results demonstrate that GAIA outperforms state-of-the-art methods both on widely used benchmarks such as CIFAR and on large-scale datasets like ImageNet. We are excited about the potential impact of our research in enhancing OOD detection for deep neural networks, and we look forward to presenting our findings at NeurIPS 2023.
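For intuition, here is a hedged PyTorch sketch of scoring OOD-ness from attribution gradients. The two statistics below are simplified proxies loosely mirroring the zero-deflation and channel-wise average abnormalities; the exact GAIA computations and aggregation differ, and `gradient_abnormality_score` is an illustrative name, not the paper's code.

```python
# Illustrative gradient-abnormality OOD scoring; a proxy, not GAIA itself.
import torch
import torch.nn as nn

def gradient_abnormality_score(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Score OOD-ness of a batch (N, C, H, W) from attribution gradients."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    top = logits.max(dim=1).values.sum()       # predicted-class logits
    grad = torch.autograd.grad(top, x)[0]      # attribution gradients, (N, C, H, W)

    # Proxy 1: fraction of exactly-zero gradient entries ("zero-deflation").
    zero_rate = (grad == 0).float().mean(dim=(1, 2, 3))
    # Proxy 2: channel-wise average gradient magnitude.
    chan_mean = grad.abs().mean(dim=(2, 3)).mean(dim=1)
    return zero_rate + chan_mean               # higher suggests more OOD-like
```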

[04/09/2023] $\bullet$ Acceptance of Papers - Good News! Two of our papers have been accepted by ICTAI 2023. The first, “AOSR-Net: All-in-One Sandstorm Removal Network,” proposes a sandstorm image enhancement method that establishes the image-to-image mapping directly by incorporating intermediate parameters. The second, “Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval,” addresses cross-modal retrieval (CMR): it enhances modal interaction in audio-text CMR by integrating latent-representation reconstruction modules into the CMR framework.

[04/09/2023] $\bullet$ After GraphTTS and GraphPB, the third installment in our lab’s Graph series of papers, “FastGraphTTS: An Ultrafast Syntax-Aware Speech Synthesis Framework,” has just been accepted for presentation at ICTAI 2023. In this work, we incorporate graph-to-sequence techniques into an end-to-end text-to-speech framework to enable syntax-aware modeling based on the syntactic information of the input text. The model’s efficiency is further improved by a custom AI-chip operator that provides a 5x speedup.

[28/08/2023] $\bullet$ LLAM is delighted to welcome aboard its two latest researchers, Botao Zhao and Jianhan Wu. The Honorary Director, Jianzong Wang, anticipates the substantial contributions they are poised to make and the far-reaching impact they will have. Both researchers have expressed their eagerness, recognizing this as a prime opportunity to collaborate with exceptional minds. This induction further underscores LLAM’s unwavering commitment to pioneering research and fostering innovation.

[25/08/2023] $\bullet$ Our paper on federated learning, titled “FedET: A Communication-Efficient Federated Class-Incremental Learning Framework Based on Enhanced Transformer,” has been accepted by IJCAI 2023. IJCAI, the International Joint Conference on Artificial Intelligence, is one of the most significant academic conferences in the field of artificial intelligence, attracting over a thousand participants from academia and industry around the world each year. The China Computer Federation (CCF) classifies IJCAI as a Class-A conference in artificial intelligence in its list of recommended international academic conferences.

[21/08/2023] $\bullet$ Dr. Jianzong Wang Attends INTERSPEECH 2023 International Conference in Dublin. INTERSPEECH is the largest and most comprehensive conference on spoken language processing.

[26/07/2023] $\bullet$ Our new paper, “PMVC: Data Augmentation-Based Prosody Modeling for Expressive Voice Conversion,” has been published at the 31st ACM International Conference on Multimedia.