ETIP: a lengthy nested NER problem for Chinese insurance policy analysis
INDUSTRIAL AND COMMERCIAL APPLICATION
Lin Sun¹ · Kai Zhang¹ · Yuxuan Sun¹ · Fangsheng Weng¹ · Jianwei Zhang¹
Received: 18 April 2019 / Accepted: 31 March 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020
Abstract
Contract analysis with AI techniques can significantly reduce manual work. This paper presents element tagging on insurance policy (ETIP), a lengthy nested NER problem. Compared to standard NER, ETIP deals not only with entity types that vary in length from a short phrase to a long sentence, but also with phrase or clause entities that may be nested. We present a novel hybrid framework that combines deep learning with heuristic filtering to recognize these lengthy nested elements. First, a convolutional neural network is constructed to obtain good initial candidates: sliding windows with high softmax probability. Then, a concatenation operator on adjacent candidate segments creates phrase, clause, or sentence candidates. We design an effective voting strategy to resolve classification conflicts among the concatenated candidates and present a theoretical proof of F1-score optimization. For the experiments, we collected a large Chinese insurance contract dataset to test the proposed method. An extensive set of experiments investigates how sliding window candidates work in our filtering and voting strategy. The optimal parameters are determined by statistical analysis of the experimental data. The results show promising performance of our method on the ETIP problem.
Keywords Information extraction · Convolutional neural network · Heuristic method · F1-score optimization
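The three stages named in the abstract (filtering sliding windows by softmax probability, concatenating adjacent candidates, and voting on the label of each concatenated segment) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Window` type, the label names, the 0.9 threshold, the rule that overlapping or touching windows count as adjacent, and the probability-weighted vote are all assumptions made for illustration, and the classifier outputs are hand-written rather than produced by a CNN.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Window:
    start: int   # inclusive character offset of the sliding window
    end: int     # exclusive character offset
    probs: dict  # label -> softmax probability (hypothetical classifier output)


def filter_candidates(windows, threshold=0.9, background="O"):
    """Keep windows whose top label is not background and meets the threshold."""
    kept = []
    for w in windows:
        label = max(w.probs, key=w.probs.get)
        if label != background and w.probs[label] >= threshold:
            kept.append(w)
    return kept


def concatenate_adjacent(candidates):
    """Merge overlapping or touching windows into [start, end, members] groups."""
    groups = []
    for w in sorted(candidates, key=lambda w: w.start):
        if groups and w.start <= groups[-1][1]:
            groups[-1][1] = max(groups[-1][1], w.end)
            groups[-1][2].append(w)
        else:
            groups.append([w.start, w.end, [w]])
    return groups


def vote(members):
    """Assumed voting rule: sum each window's top-label probability per label."""
    scores = defaultdict(float)
    for w in members:
        label = max(w.probs, key=w.probs.get)
        scores[label] += w.probs[label]
    return max(scores, key=scores.get)


def tag_elements(windows, threshold=0.9):
    """Run filter -> concatenate -> vote and return (start, end, label) spans."""
    candidates = filter_candidates(windows, threshold)
    return [(s, e, vote(m)) for s, e, m in concatenate_adjacent(candidates)]


# Hand-written example; COVERAGE and EXCLUSION are hypothetical element types.
example_windows = [
    Window(0, 4, {"COVERAGE": 0.95, "EXCLUSION": 0.03, "O": 0.02}),
    Window(2, 6, {"COVERAGE": 0.92, "EXCLUSION": 0.05, "O": 0.03}),
    Window(4, 8, {"COVERAGE": 0.06, "EXCLUSION": 0.91, "O": 0.03}),
    Window(8, 12, {"COVERAGE": 0.005, "EXCLUSION": 0.005, "O": 0.99}),
    Window(20, 24, {"COVERAGE": 0.02, "EXCLUSION": 0.97, "O": 0.01}),
    Window(22, 26, {"COVERAGE": 0.0, "EXCLUSION": 0.6, "O": 0.4}),
]
print(tag_elements(example_windows))
# [(0, 8, 'COVERAGE'), (20, 24, 'EXCLUSION')]
```

In this example the first three windows are concatenated into one segment whose conflicting labels are settled by the vote, while the background window and the low-confidence window are filtered out before concatenation. A stricter or looser threshold changes which windows survive, which is the kind of parameter the abstract says is tuned from experimental data.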
* Lin Sun · * Jianwei Zhang · Kai Zhang · Yuxuan Sun · Fangsheng Weng
¹ Department of Computer Science, Zhejiang University City College, 51 HuZhou Street, Hangzhou, China

1 Introduction
Automatic contract analysis can provide immediate insight into the content of specific contractual documents in legal or financial areas [22]. Compared to the traditional method of manually reviewing hundreds of contracts, it not only helps manage and access contracts but also frees knowledge workers from menial, laborious, and often error-prone tasks. The insurance policy is a legal contract that outlines the rights and obligations of the insured and the insurer. It consists of a wide variety of insurance coverages to meet specific needs, although most insurance policies are somewhat standardized. Understanding the various types of insurance coverage is time-consuming and error-prone. This paper presents the problem of element tagging on insurance policy (ETIP). ETIP can automatically convert a massive number of insurance policies into structured archives for management and comparison. Through the highlighted information, ETIP can also provide insurance staff valuable insight into pol