USR-MTL: an unsupervised sentence representation learning framework with multi-task learning
Wenshen Xu1 · Shuangyin Li2 · Yonghe Lu3
Accepted: 26 October 2020 © Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract For developing intelligent systems, it is increasingly important to learn effective text representations, especially to extract sentence features. Numerous previous studies have concentrated on sentence representation learning based on deep learning approaches. However, existing approaches are mostly designed for a single task, or rely on a labeled corpus, when learning sentence embeddings. In this paper, we analyze the factors involved in learning sentence representations and propose an efficient unsupervised learning framework with multi-task learning (USR-MTL), in which various text learning tasks are merged into a unified framework. Based on the syntactic and semantic features of sentences, three factors are reflected in the task of sentence representation learning: the wording, the word order, and the ordering of the sentences neighboring a target sentence. Hence, we integrate a word prediction task, a word-order learning task, and a sentence-order learning task into the proposed framework to attain meaningful sentence embeddings. The process of sentence embedding learning is thus reformulated as a multi-task learning framework combining one sentence-level task and two word-level tasks. Moreover, the proposed framework is trained with an unsupervised learning algorithm on an unlabeled corpus. Experimental results show that our approach achieves state-of-the-art performance on downstream natural language processing tasks compared with popular unsupervised representation learning techniques. Experiments on representation visualization and task analysis demonstrate the effectiveness of the tasks in the proposed framework in producing reasonable sentence representations, proving the capacity of the proposed unsupervised multi-task framework for sentence representation learning.
Keywords Sentence representation learning · Multi-task learning · Unsupervised sentence embedding · Word prediction · Word-order learning · Sentence-order learning · Unlabeled corpus
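To make the multi-task setup concrete, the sketch below shows one common way to combine per-task losses into a single training objective: a weighted sum over the three unsupervised tasks named in the abstract. This is a minimal illustration, not the authors' implementation; the numeric loss values are toy placeholders, and in USR-MTL each loss would be computed from a shared sentence encoder.

```python
# Hypothetical sketch (not the authors' code): combining the losses of the
# word prediction, word-order, and sentence-order tasks into one objective.

def multi_task_loss(losses, weights=None):
    """Weighted sum of per-task losses; equal weights by default."""
    if weights is None:
        weights = [1.0] * len(losses)
    return sum(w * l for w, l in zip(weights, losses))

# Toy per-task loss values standing in for the three tasks:
task_losses = {
    "word_prediction": 0.8,   # predict the words of the target sentence
    "word_order": 0.5,        # recover the order of shuffled words
    "sentence_order": 0.3,    # judge the order of neighboring sentences
}

total = multi_task_loss(list(task_losses.values()))
print(total)  # 0.8 + 0.5 + 0.3 = 1.6
```

In practice the weights would balance the sentence-level task against the two word-level tasks during joint training on the unlabeled corpus.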
✉ Shuangyin Li
[email protected]
Wenshen Xu
[email protected]
Yonghe Lu
[email protected]
1 Alibaba Inc., Hangzhou, Zhejiang, China
2 School of Computer Science, South China Normal University, Guangzhou, Guangdong, China
3 School of Information Management, Sun Yat-sen University, Guangzhou, Guangdong, China

1 Introduction

With the increasing development of various intelligent systems, learning text representations has attracted a great deal of interest, especially sentence representation learning in document modeling and query analysis. Sentence representation learning, one of the fundamental tasks in Natural Language Processing, aims at encoding a sentence into a fixed-dimensional vector aggregating syntac-