Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval

Figure: The overall workflow of CLSR.

Abstract

Cross-modal retrieval (CMR) has been widely applied in areas such as multimedia search engines and recommendation systems. Most existing CMR methods focus on image-to-text retrieval, whereas audio-to-text retrieval, a less explored domain, remains challenging because discriminative features are difficult to uncover from audio clips and texts. Existing studies are restricted in two ways: 1) Most utilize contrastive learning to construct a common subspace in which similarities among data can be measured. However, they consider only cross-modal transformation and neglect intra-modal separability. Moreover, the temperature parameter is not adaptively adjusted under semantic guidance, which degrades performance. 2) These methods do not take latent representation reconstruction into account, which is essential for semantic alignment. In this paper, we propose a novel method for audio-text cross-modal retrieval, termed Contrastive Latent Space Reconstruction Learning (CLSR). CLSR improves contrastive representation learning by taking intra-modal separability into account and adopting an adaptive temperature control strategy. Moreover, latent representation reconstruction modules are embedded into the CMR framework, which improves modal interaction. Experiments on two audio-text datasets demonstrate that our approach outperforms several state-of-the-art methods.
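Since the abstract only summarizes the method, the following minimal PyTorch sketch may help illustrate the two ideas it names: an InfoNCE-style contrastive loss extended with intra-modal separability terms and a learnable (adaptive) temperature, plus lightweight latent reconstruction heads between the two modalities. All module names, dimensions, loss forms, and weights below are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch only -- module names, dimensions, and loss weights are
# assumptions, not the implementation described in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CLSRLossSketch(nn.Module):
    def __init__(self, dim=512, init_temp=0.07, recon_weight=0.5):
        super().__init__()
        # Learnable log-temperature: a simple stand-in for the paper's
        # "adaptive temperature control" (the actual strategy may differ).
        self.log_temp = nn.Parameter(torch.log(torch.tensor(init_temp)))
        # Hypothetical reconstruction heads: map one modality's latent
        # representation into the other modality's latent space.
        self.audio_to_text = nn.Linear(dim, dim)
        self.text_to_audio = nn.Linear(dim, dim)
        self.recon_weight = recon_weight

    def info_nce(self, a, b, temp):
        # Matched pairs share a batch index, so the diagonal holds positives.
        logits = a @ b.t() / temp
        targets = torch.arange(a.size(0), device=a.device)
        return F.cross_entropy(logits, targets)

    def forward(self, audio_emb, text_emb):
        a = F.normalize(audio_emb, dim=-1)
        t = F.normalize(text_emb, dim=-1)
        temp = self.log_temp.exp()

        # Cross-modal contrastive terms (audio->text and text->audio).
        loss_cross = self.info_nce(a, t, temp) + self.info_nce(t, a, temp)

        # Intra-modal separability: within each modality, push apart
        # embeddings of different samples (each item is its own positive).
        loss_intra = self.info_nce(a, a, temp) + self.info_nce(t, t, temp)

        # Latent reconstruction: predict each modality's latent
        # representation from the other's and penalize the error.
        loss_recon = (F.mse_loss(self.audio_to_text(a), t)
                      + F.mse_loss(self.text_to_audio(t), a))

        return loss_cross + loss_intra + self.recon_weight * loss_recon

# Usage with a batch of paired (audio, text) embeddings:
criterion = CLSRLossSketch(dim=512)
audio_emb, text_emb = torch.randn(32, 512), torch.randn(32, 512)
loss = criterion(audio_emb, text_emb)
loss.backward()
```

Because the temperature is a learnable parameter here, it is optimized jointly with the encoders; the paper's semantically guided adjustment strategy may be more involved than this.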

Type: Publication
In 2023 IEEE 35th International Conference on Tools with Artificial Intelligence
Kaiyi Luo
Nanjing University