Bidirectional long short-term memory for surgical skill classification of temporally segmented tasks



ORIGINAL ARTICLE

Jason D. Kelly1 · Ashley Petersen2 · Thomas S. Lendvay3 · Timothy M. Kowalewski1

Received: 13 March 2020 / Accepted: 23 September 2020 © CARS 2020

Abstract

Purpose: Historical surgical skill research has typically analyzed holistic, task-level summary metrics to produce a skill classification for a performance. Recent advances in machine learning allow time-series classification at the sub-task level, enabling predictions on segments of tasks, which could improve task-level technical skill assessment.

Methods: A bidirectional long short-term memory (LSTM) network was used with 8-s windows of multidimensional time-series data from the Basic Laparoscopic Urologic Skills dataset. The network was trained on experts and novices from four common surgical tasks. Stratified cross-validation with regularization was used to avoid overfitting. Misclassified cases were re-submitted for surgical technical skill assessment to crowds via Amazon Mechanical Turk to re-evaluate them and to analyze the level of agreement with the previous scores.

Results: Performance was best for the suturing task, with 96.88% accuracy (one misclassification) at predicting whether a performance was expert or novice, compared to previously obtained crowd evaluations. When compared with expert surgeon ratings, the LSTM predictions yielded a Spearman coefficient of 0.89 for suturing tasks. When crowds re-evaluated misclassified performances, for all five misclassified cases from the peg-transfer and suturing tasks, the crowds agreed more with our LSTM model than with the previously obtained crowd scores.

Conclusion: The technique presented shows results comparable with labels that would be obtained from crowdsourced labeling of surgical tasks. However, these results raise questions about the reliability of crowdsourced labels for videos of surgical tasks. We, as a research community, should examine crowd labeling with higher scrutiny, systematically look at biases, and quantify label noise.

Keywords: Surgical skill · Crowdsourcing · Bidirectional LSTM · Surgical technical skill · Machine learning
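The Methods describe slicing each multidimensional tool-motion recording into 8-s windows, classifying each window, and arriving at a task-level expert/novice call. The sketch below illustrates only the windowing and aggregation step; the sampling rate (30 Hz), stride, function names, and mean-probability aggregation are illustrative assumptions not stated in the abstract.

```python
import numpy as np

def segment_windows(signal, fs=30, window_s=8, stride_s=4):
    """Slice a multidimensional time series of shape (T, D) into
    fixed-length windows of window_s seconds.

    fs (sampling rate) and stride_s (overlap) are assumptions for
    illustration; the paper's exact values are not in the abstract.
    Returns an array of shape (num_windows, window_s * fs, D).
    """
    win = int(window_s * fs)
    hop = int(stride_s * fs)
    starts = range(0, signal.shape[0] - win + 1, hop)
    return np.stack([signal[s:s + win] for s in starts])

def aggregate_task_label(window_probs, threshold=0.5):
    """Combine per-window 'expert' probabilities (e.g., from a
    bidirectional LSTM) into one task-level label by averaging —
    one simple aggregation choice among several possible."""
    return "expert" if float(np.mean(window_probs)) >= threshold else "novice"

# Example: 60 s of hypothetical 6-channel tool-motion data at 30 Hz
x = np.random.randn(60 * 30, 6)
windows = segment_windows(x)
print(windows.shape)  # (14, 240, 6): 14 overlapping 8-s windows
```

Each window would then be fed to the sequence classifier, and the per-window outputs combined as above to score the whole task performance.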

Corresponding author: Jason D. Kelly, [email protected]

1 Department of Mechanical Engineering, University of Minnesota, Minneapolis, MN, USA
2 Division of Biostatistics, University of Minnesota, Minneapolis, MN, USA
3 Department of Urology, Seattle Children's Hospital, Seattle, WA, USA

Introduction

Computationally assessing the skill of a surgeon in an objective manner using tool motion has proven a complex problem with many challenges. Previous research has relied mostly on summary performance metrics from kinematic data [1–3]. Unfortunately, these metrics typically failed to completely discriminate novices from experts, that is, to never misclassify "obvious" novices vs. "obvious" experts—the so-called minimally acceptable classifier (MAC) criterion [4]. Recent advances in machine learning techniques