Identifying forged seal imprints using positive and unlabeled learning


Leiming Yan 1,2, Kai Chen 1,2, Shikun Tong 1,2, Jinwei Wang 1,2, Zhen Chen 3

Received: 29 June 2020 / Revised: 7 September 2020 / Accepted: 10 November 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

With the development of photosensitive seal technology, seal fraud has gradually increased. Forged seals can bring considerable gains to counterfeiters and cause huge losses to companies and users. Since it is almost impossible to collect enough forged seal samples, traditional machine learning methods do not work well in this situation. In this paper, a method based on PU learning and distance learning is proposed. The method uses a limited number of labeled samples and some unlabeled samples to train multiple kNN classifiers to identify forged seal imprints, and uses distance learning to improve the performance of the kNN classifiers. The experimental results show that the F1-score of the proposed method can reach 0.97 even for seal imprints with heavy text background noise, which outperforms many traditional models.

Keywords PU learning · Metric learning · Photosensitive seal

* Jinwei Wang [email protected]
Leiming Yan [email protected]
Kai Chen [email protected]
Shikun Tong [email protected]
Zhen Chen [email protected]

1 School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing, China
2 Engineering Research Center of Digital Forensics, Ministry of Education, Nanjing, China
3 JiangSu QunJie IOT Technology Co., Ltd, Nanjing, China

Multimedia Tools and Applications

1 Introduction

In Asia, seals are widely used to authenticate contracts, financial documents, personal identities, etc. Seals play a crucial role in the financial industry in particular. For example, bank staff routinely check whether a seal imprint is genuine or forged. Traditional methods of seal imprint verification include manual origami angle discrimination and inspection by eye. These methods are labor-intensive and inaccurate, and they can hardly identify forged photosensitive seals, which are made by computer rather than by hand and are almost identical to the genuine ones. Many scholars have used support vector machines and deep learning techniques to verify seal imprints automatically; however, in real application scenarios, enough negative samples are difficult to obtain. Forged seals may come from multiple sources and in various styles, and the forgeries are very realistic and difficult to identify manually. Thus, it is difficult to collect enough forged seal samples to train satisfactory machine learning models. It is also impossible to enumerate all kinds of forged seals, even if many forged samples could be produced for classifier training. Moreover, because the proportion of negative samples is too small, the imbalanced distribution of positive and negative samples will cause traditional machine learning classifiers to learn biased decision boun