Hybrid Approach to Document Anomaly Detection: An Application to Facilitate RPA in Title Insurance
Abhijit Guha 1,2, Debabrata Samanta 3
1 Data Science Department, CHRIST (Deemed to be University), Bangalore 560029, India
2 First American India Private Ltd., Bangalore 560038, India
3 Computer Science Department, CHRIST (Deemed to be University), Bangalore 560029, India
Abstract: Anomaly detection (AD) is an important aspect of various domains, and title insurance (TI) is no exception. Robotic process automation (RPA) is taking over manual tasks in TI business processes, but it has limitations without the support of artificial intelligence (AI) and machine learning (ML). As data dimensionality increases, and in composite population scenarios, detecting anomalies becomes more complex, and AD in automated document management systems (ADMS) is the least explored domain. Deep learning, being the fastest maturing technology, can be combined with traditional anomaly detectors to facilitate and improve RPA in TI. We present a hybrid model for AD using autoencoders (AE) and a one-class support vector machine (OSVM). In the present study, the OSVM receives input features representing real-time documents from the TI business, orchestrated and reduced in dimension by the AE. The results obtained from multiple experiments are comparable with traditional methods and within a business-acceptable range regarding accuracy and performance.
Keywords: Anomaly detection, title insurance, autoencoder, one-class support vector machine (OSVM), term frequency-inverse document frequency (TF-IDF), robotic process automation, dimensionality reduction.
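The hybrid pipeline outlined in the abstract (TF-IDF features, compressed by an autoencoder, then scored by an OSVM) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn, and uses `MLPRegressor` trained to reproduce its own input as a stand-in for a single-hidden-layer autoencoder. The toy corpus, the bottleneck width of 4, and the values of `nu` and `gamma` are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPRegressor
from sklearn.svm import OneClassSVM

# Hypothetical toy corpus standing in for "normal" title-insurance documents.
train_docs = [
    "warranty deed grantor conveys property to grantee",
    "deed of trust borrower grants property to trustee",
    "quitclaim deed grantor releases interest in property",
    "mortgage lien recorded against the subject property",
    "grant deed transfers title of property to grantee",
    "release of lien on property recorded by lender",
]

# Step 1: represent each document as a TF-IDF vector.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_docs).toarray()

# Step 2: autoencoder stand-in (assumption): an MLP trained to reconstruct
# its input through a narrow tanh hidden layer of 4 units.
autoencoder = MLPRegressor(
    hidden_layer_sizes=(4,), activation="tanh", max_iter=2000, random_state=0
)
autoencoder.fit(X_train, X_train)


def encode(X):
    """Return the hidden-layer activations, i.e. the reduced features."""
    return np.tanh(X @ autoencoder.coefs_[0] + autoencoder.intercepts_[0])


# Step 3: fit a one-class SVM on the reduced features of normal documents.
osvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
osvm.fit(encode(X_train))


def predict(docs):
    """Return +1 for documents scored as normal and -1 for anomalies."""
    X = vectorizer.transform(docs).toarray()
    return osvm.predict(encode(X))


preds = predict([
    "deed conveys property from grantor to grantee",
    "pizza recipe with tomato and cheese",
])
print(preds)
```

With a corpus this small the predicted labels are not meaningful. The sketch only shows how the three stages connect: the OSVM never sees the raw TF-IDF vectors, only the autoencoder's compressed representation.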
1 Introduction
The evolution of artificial intelligence (AI) in the past decade has transformed the way almost all businesses handle their processes. A significant paradigm shift has taken place, with operations now driven by machines in place of human beings through robotic process automation (RPA). This enables organizations to drive profitability by reducing waste. The complexity of any automation depends on the level of cognitive intelligence required to perform a task. The difficulty increases when the input data takes the form of images, text, documents, speech, video, etc.[1-3], which are considered unstructured in the world of data science. There is growing demand for automated document management systems (ADMS) that handle tasks such as search, retrieval, profiling, and classification in healthcare, education, banking, various types of insurance, and other verticals that deal with a large volume of transactions. The pictorial representation of a typical ADMS application is shown in Fig. 1.
Title insurance (TI) is a domain of insurance that transacts with documents associated with the property to provide insurance on the title of a subject property to a
Research Article
Manuscript received April 5, 2020; accepted July 31, 2020
Recommended by Associate Edi