Hybrid Approach to Document Anomaly Detection: An Application to Facilitate RPA in Title Insurance


Abhijit Guha 1,2, Debabrata Samanta 3

1 Data Science Department, CHRIST (Deemed to be University), Bangalore 560029, India
2 First American India Private Ltd., Bangalore 560038, India
3 Computer Science Department, CHRIST (Deemed to be University), Bangalore 560029, India

Abstract: Anomaly detection (AD) is an important aspect of various domains, and title insurance (TI) is no exception. Robotic process automation (RPA) is taking over manual tasks in TI business processes, but it has its limitations without the support of artificial intelligence (AI) and machine learning (ML). With increasing data dimensionality and in composite population scenarios, the complexity of detecting anomalies increases, and AD in automated document management systems (ADMS) is the least explored domain. Deep learning, being the fastest maturing technology, can be combined with traditional anomaly detectors to facilitate and improve RPA in TI. We present a hybrid model for AD using autoencoders (AE) and a one-class support vector machine (OSVM). In the present study, the OSVM receives input features that represent real-time documents from the TI business and that have been orchestrated and reduced in dimension by the AE. The results obtained from multiple experiments are comparable with traditional methods and within a business-acceptable range regarding accuracy and performance.

Keywords: Anomaly detection, title insurance, autoencoder, one-class support vector machine (OSVM), term frequency-inverse document frequency (TF-IDF), robotic process automation, dimensionality reduction.
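The keywords name TF-IDF as a document representation. As a rough illustration of what such features look like before any dimensionality reduction, the following is a minimal sketch, assuming the plain tf × log(N/df) weighting and whitespace tokenization. The exact TF-IDF variant is not stated in this section, and the function name `tfidf` and the toy documents are hypothetical, invented only for this example.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    Assumption: plain (unsmoothed) idf and whitespace tokenization; the
    variant used in the paper is not specified in this section.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus: two similar documents and one unrelated document.
docs = [
    "deed of trust recorded",
    "deed of release recorded",
    "invoice for office supplies",
]
vectors = tfidf(docs)
```

Terms shared by several documents (here "deed", "of", "recorded") receive lower weights than terms unique to one document, and a term present in every document receives a weight of zero. In the hybrid model described in the abstract, features of this kind would be compressed by the AE before being passed to the OSVM.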

 

1 Introduction

The evolution of artificial intelligence (AI) in the past decade has transformed almost all businesses in terms of the way business processes are handled. A significant paradigm shift has been noticed, with operations now being driven by machines in place of human beings using robotic process automation (RPA). This enables organizations to drive profitability by reducing waste. The complexity of any automation depends on the level of cognitive intelligence required to perform a task. The difficulty increases when the input data takes the form of images, text, documents, speech, video, etc. [1-3], which are considered unstructured in the world of data science. There is growing demand for automated document management systems (ADMS) that handle applications such as search, retrieval, profiling, and classification in healthcare, education, banking, various types of insurance, and other verticals that deal with a large volume of transactions. The pictorial representation of a typical ADMS application is shown in Fig. 1. Title insurance (TI) is a domain of insurance that transacts with documents associated with the property to provide insurance on the title of a subject property to a

Research Article
Manuscript received April 5, 2020; accepted July 31, 2020
Recommended by Associate Edi