


ORIGINAL ARTICLE

Language-based translation and prediction of surgical navigation steps for endoscopic wayfinding assistance in minimally invasive surgery

Richard Bieck1 · Katharina Heuermann2 · Markus Pirlich2 · Juliane Neumann1 · Thomas Neumuth1

Received: 13 January 2020 / Accepted: 14 September 2020
© The Author(s) 2020

Abstract

Purpose In the context of aviation and automotive navigation technology, assistance functions are associated with predictive planning and wayfinding tasks. In endoscopic minimally invasive surgery, however, assistance so far relies primarily on image-based localization and classification. We show that navigation workflows can be described and used to predict navigation steps.

Methods A natural description vocabulary for observable anatomical landmarks in endoscopic images was defined to create 3850 navigation workflow sentences from 22 annotated functional endoscopic sinus surgery (FESS) recordings. The resulting FESS navigation workflows showed an imbalanced data distribution with over-represented landmarks in the ethmoidal sinus. A transformer model was trained to predict navigation sentences in sequence-to-sequence tasks. Training was performed with the Adam optimizer and label smoothing in a leave-one-out cross-validation study. The sentences were generated using an adapted beam search algorithm with exponential decay beam rescoring. The transformer model was compared to a standard encoder-decoder model as well as to HMM and LSTM baseline models.

Results The transformer model reached the highest prediction accuracy for navigation steps at 0.53, followed by the LSTM at 0.35 and the standard encoder-decoder network at 0.32. With a sentence generation accuracy of 0.83, the prediction of navigation steps at sentence level benefits from the additional semantic information. While standard class representation predictions suffer from an imbalanced data distribution, the attention mechanism also considered underrepresented classes reasonably well.

Conclusion We implemented a natural language-based prediction method for sentence-level navigation steps in endoscopic surgery. The sentence-level prediction method showed that word relations to navigation tasks can potentially be learned and used to predict future steps. Further studies are needed to investigate the functionality of path prediction. The prediction approach is a first step in the field of visuo-linguistic navigation assistance for endoscopic minimally invasive surgery.

Keywords Natural language processing · Endoscopic navigation · Machine translation · Workflow prediction · Deep learning · Attention networks · FESS
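The leave-one-out cross-validation mentioned in the Methods can be illustrated with a minimal sketch. It assumes that each fold holds out one complete recording and trains on the sentences of the remaining ones; the abstract does not state the split unit, so this is an assumption. The function name `leave_one_recording_out`, the recording IDs, and the placeholder sentences are hypothetical and are not taken from the article.

```python
from typing import Dict, Iterator, List, Tuple


def leave_one_recording_out(
    workflows: Dict[str, List[str]]
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_id, train_sentences, test_sentences), one fold per recording.

    Assumption: the hold-out unit is a whole recording, so no sentence of the
    held-out recording appears in the training set of its fold.
    """
    for held_out in sorted(workflows):
        train = [
            sentence
            for rec_id in sorted(workflows)
            if rec_id != held_out
            for sentence in workflows[rec_id]
        ]
        test = list(workflows[held_out])
        yield held_out, train, test


# Toy stand-in data: 22 hypothetical recordings with 3 placeholder sentences each.
toy_workflows = {
    f"rec{i:02d}": [f"rec{i:02d} step {j}" for j in range(3)]
    for i in range(1, 23)
}

folds = list(leave_one_recording_out(toy_workflows))
print(len(folds))  # one fold per recording: 22
```

With 22 recordings this produces 22 folds. Holding out whole recordings, rather than single sentences, keeps sentences from the same surgery out of the training set of the fold that tests on it.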

Introduction

Minimally invasive endoscopic surgery is valued as a standard in surgical practice, because with this method patient’s

Richard Bieck [email protected]

1 Innovation Center Computer Assisted Surgery (ICCAS), Leipzig University, Semmelweisstraße 14, 04103 Leipzig, Germany

2 Department for Ear-, Nose- and Throat-Surgery, University of Leipzig Medical Center, Leip