Semantic Extraction for Sentence Representation via Reinforcement Learning

Abstract

Many modern Natural Language Processing (NLP) systems rely on word embeddings, pre-trained in an unsupervised manner on large corpora, as base features. Several attempts at learning unsupervised sentence representations have not achieved satisfactory performance and have not been widely adopted. In this work, we present a Semantic Extraction Reinforcement learning Model (SERM) for encoding sentences into embedding vectors, which casts the learning process as intent detection and named entity recognition tasks. Intent detection mainly concerns the semantics of the whole sentence, while named entity recognition pays more attention to local entities. Given a sentence, SERM builds a sentence representation by extracting the most important words and removing the irrelevant ones. Unlike mainstream approaches, SERM compresses sentences from both semantic and entity perspectives, which allows us to efficiently learn different types of encoding functions. The experimental results show that SERM can learn high-quality sentence representations and that these representations are competitive with the best-performing model on the text classification task.

Type
Publication
International Joint Conference on Neural Networks
Ning Cheng
Researcher