Fast body part segmentation and tracking of neonatal video data using deep learning


ORIGINAL ARTICLE

Christoph Hoog Antink¹ · Joana Carlos Mesquita Ferreira¹ · Michael Paul¹ · Simon Lyra¹ · Konrad Heimann² · Srinivasa Karthik³ · Jayaraj Joseph³ · Kumutha Jayaraman⁴ · Thorsten Orlikowsky² · Mohanasankar Sivaprakasam³ · Steffen Leonhardt¹

Received: 29 March 2020 / Accepted: 20 August 2020
© The Author(s) 2020

Abstract

Photoplethysmography imaging (PPGI) for non-contact monitoring of preterm infants in the neonatal intensive care unit (NICU) is a promising technology, as it could reduce medical adhesive-related skin injuries and associated complications. For practical implementations of PPGI, a region of interest has to be detected automatically in real time. As neonates' body proportions differ significantly from those of adults, existing approaches cannot be applied in a straightforward way, and color-based skin detection requires RGB data, thus prohibiting the use of less-intrusive near-infrared (NIR) acquisition. In this paper, we present a deep learning-based method for segmentation of neonatal video data. We augmented an existing encoder-decoder semantic segmentation method with a modified version of the ResNet-50 encoder. This reduced the computational time by a factor of 7.5, so that 30 frames per second can be processed at 960 × 576 pixels. The method was developed and optimized on publicly available databases with segmentation data from adults. For evaluation, a comprehensive dataset consisting of RGB and NIR video recordings from 29 neonates with various skin tones, recorded in two NICUs in Germany and India, was used. From all recordings, 643 frames were manually segmented. After pre-training the model on the public adult data, parts of the neonatal data were used for additional learning, and left-out neonates were used for cross-validated evaluation. On the RGB data, the head is segmented well (82% intersection over union, 88% accuracy), and performance is comparable with that achieved on large, public, non-neonatal datasets. On the other hand, performance on the NIR data was inferior. By employing data augmentation to generate additional virtual NIR data for training, results could be improved, and the head could be segmented with 62% intersection over union and 65% accuracy. The method is in theory capable of performing segmentation in real time, and thus it may provide a useful tool for future PPGI applications.

Keywords Image processing · Deep learning · Semantic segmentation · Camera-based monitoring · NICU
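To make the two figures of merit quoted in the abstract concrete, the following minimal sketch computes intersection over union and a per-class accuracy for a single body part on a toy label mask. It is an illustration only, not the evaluation code used in this work. The function name `class_iou_and_accuracy`, the label value `HEAD`, and the toy masks are hypothetical, and "accuracy" is assumed here to mean the fraction of ground-truth pixels of the class that are labeled correctly.

```python
def class_iou_and_accuracy(pred, truth, label):
    """Return (IoU, per-class accuracy) for one label.

    pred and truth are equally sized 2-D lists of integer class labels.
    Assumption: per-class accuracy = TP / (TP + FN).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == label and t == label:
                tp += 1  # pixel correctly assigned to the class
            elif p == label:
                fp += 1  # predicted as the class, but ground truth differs
            elif t == label:
                fn += 1  # ground-truth pixel of the class that was missed
    union = tp + fp + fn
    iou = tp / union if union else float("nan")
    accuracy = tp / (tp + fn) if (tp + fn) else float("nan")
    return iou, accuracy


HEAD = 1  # hypothetical label id; 0 stands for background

# Toy 3 x 4 masks: manual segmentation (truth) and model output (pred).
truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

iou, accuracy = class_iou_and_accuracy(pred, truth, HEAD)
print(f"IoU = {iou:.2f}, accuracy = {accuracy:.2f}")
```

In this toy case, three head pixels are matched, one is predicted spuriously, and one is missed, so the intersection over union is 3/5 and the per-class accuracy is 3/4. Under this definition, accuracy can never be lower than intersection over union for the same class, which is consistent with the ordering of the values reported above.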

1 Introduction

✉ Christoph Hoog Antink
[email protected]

¹ Medical Information Technology (MedIT), Helmholtz-Institute for Biomedical Engineering, RWTH Aachen University, Pauwelsstr. 20, 52074, Aachen, Germany

² Section of Neonatology, RWTH Aachen University, Pauwelsstr. 30, 52074, Aachen, Germany

³ Department of Electrical Engineering, Indian Institute of Technology, Madras, Chennai 600036, Tamil Nadu, India

⁴ Saveetha Medical College, Kanchipuram, Saveetha Nagar, Chennai, 602 105, Indi