Weakly supervised multilabel classification for semantic interpretation of endoscopy video frames



ORIGINAL PAPER

Michael D. Vasilakakis1 · Dimitris Diamantis1 · Evaggelos Spyrou1 · Anastasios Koulaouzidis2 · Dimitris K. Iakovidis1

Received: 5 January 2018 / Accepted: 17 May 2018
© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Abstract Several studies have addressed the problem of abnormality detection in medical images using computer-based systems. The impact of such systems on clinical practice and on society can be high, considering that they can contribute to the reduction of medical errors and the associated adverse events. Today, most of these systems are based on binary classification algorithms that are “strongly” supervised, in the sense that the abnormal training images need to be annotated in detail, i.e., with pixel-level annotations indicating the location of the abnormalities. However, this approach usually does not take into account the diversity of the image content, which may include a variety of structures and artifacts. In the context of gastrointestinal video-endoscopy, addressed in this study, the semantics of the normal contents of the endoscopic video frames include normal mucosal tissues, bubbles, debris and the hole of the lumen, whereas the abnormal video frames may include additional semantics corresponding to lesions or blood. This observation motivated us to investigate various multi-label classification methods, aiming at a richer semantic interpretation of the endoscopic images. Among them, an image-saliency-enabled bag-of-words approach and a convolutional neural network architecture enabling multi-scale feature extraction (MM-CNN) are presented. Weakly supervised learning is implemented using only semantic-level annotations, i.e., meaningful keywords, thus avoiding the need for the resource-demanding pixel-wise annotation of the training images. Experiments were performed on a diverse set of wireless capsule endoscopy images. The results of the experiments validate that weakly supervised multi-label classification can provide enhanced discrimination of the gastrointestinal abnormalities, with the MM-CNN method providing the best performance.
Keywords Endoscopy · Video analysis · Lesion detection · Weakly supervised learning · Multi-label classification · Bag-of-words · Convolutional neural networks
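To make the notion of semantic-level annotation concrete, the following minimal sketch shows one way frame-level keywords could be turned into multi-hot target vectors for multi-label training. It is an illustration under stated assumptions, not the encoding used in the paper: the label strings, the `LABELS` ordering, and the helper names `encode_keywords` and `is_abnormal` are hypothetical, chosen only to mirror the semantics listed in the abstract.

```python
# Hypothetical label vocabulary; the names mirror the semantics listed in the
# abstract, but the exact strings and their order are assumptions.
LABELS = ["normal mucosa", "bubbles", "debris", "lumen hole", "lesion", "blood"]


def encode_keywords(keywords, labels=LABELS):
    """Map the keywords annotating one frame to a multi-hot vector.

    No pixel-level information is involved: each frame only carries the set
    of keywords describing what it contains.
    """
    present = set(keywords)
    unknown = present - set(labels)
    if unknown:
        raise ValueError(f"unknown keywords: {sorted(unknown)}")
    return [1 if name in present else 0 for name in labels]


def is_abnormal(vector, labels=LABELS, abnormal=("lesion", "blood")):
    """Derive a binary normal/abnormal decision from a multi-hot vector."""
    return any(vector[labels.index(name)] == 1 for name in abnormal)


# Two hypothetical frames: one normal, one containing a lesion.
frame_a = encode_keywords(["normal mucosa", "bubbles"])
frame_b = encode_keywords(["normal mucosa", "debris", "lesion"])

print(frame_a, is_abnormal(frame_a))
print(frame_b, is_abnormal(frame_b))
```

Unlike a single binary target, such a vector allows several semantics to be present in the same frame, and a normal/abnormal decision can still be derived from it afterwards.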

* Corresponding author: Dimitris K. Iakovidis
Michael D. Vasilakakis · Dimitris Diamantis · Evaggelos Spyrou · Anastasios Koulaouzidis

1 Department of Computer Science and Biomedical Informatics, University of Thessaly, Papasiopoulou 2-4, 35131 Lamia, Greece
2 Endoscopy Unit, The Royal Infirmary of Edinburgh, Edinburgh, UK

1 Introduction

Multi-label classification is a special case of data classification, where multiple labels may be assigned to a given instance. One may consider it as a generalization of multiclass classification, which enable