SynCGAN: Using Learnable Class Specific Priors to Generate Synthetic Data for Improving Classifier Performance on Cytology Images




Abstract. One of the most challenging aspects of medical image analysis is the lack of a high quantity of annotated data. This makes it difficult for deep learning algorithms to perform well due to a lack of variation in the input space. While generative adversarial networks have shown promise in the field of synthetic data generation, without a carefully designed prior the generation procedure cannot be performed well. In the proposed approach we demonstrate the use of automatically generated segmentation masks as learnable class-specific priors to guide a conditional GAN in generating patho-realistic samples for cytology images. We have observed that augmenting data using the proposed pipeline, called “SynCGAN”, significantly improves the performance of state-of-the-art classifiers such as ResNet-152, DenseNet-161, and Inception-V3.

Keywords: Conditional generative adversarial networks (CGAN) · Synthetic data generation · Cytology image classification · Deep learning

1 Introduction

Modern machine learning algorithms, such as deep learning, are greatly dependent on the availability of a large amount of high-quality data. But in various niche domains such as medical imaging, large quantities of data are generally unavailable due to various constraints: a lack of patients, infrastructural inadequacy, noisy environments, a lack of experts for annotation, and so on. However, with the advent of generative adversarial networks (GANs) [6], an avenue for high-quality data generation has opened. In its base form, a GAN is capable of generating samples, from a randomly sampled prior, that resemble a predefined data distribution. Without proper guidance, however, the generation process can produce eccentric outputs. Conditional GANs (CGANs) [12], on the other hand, use a semantically sensible prior to guide the data generation process toward more accurate and meaningful samples. In the proposed work, we explore the ability of CGANs to work with learnable priors for efficient data generation to improve classifier performance on cytology images. In most practical cases the number of available data samples is too limited for deep learning approaches to thrive, so data augmentation serves as a primary tool for improving learning ability. Although annotating pixel-specific masks for cytology images is a difficult and expensive job, with adequate expertise and a reasonable amount of labor it is possible to annotate at least a small batch of samples for a better semantic representation. The proposed approach makes use of such semantic masks to serve as a prior for CGANs. While generating fully detailed cytology images without priors is much more difficult, generating segmentation masks from scratch is a much simpler task, given that the output distribution is binary. Our proposed approach makes use o

© Springer Nature Singapore Pte Ltd. 2020. R. V. Babu et al. (Eds.): NCVPRIPG 2019, CCIS 1249, pp. 32–42, 2020. https://doi.org/10.1007/978-981-15-8697-2_3