Identifying diagnosis evidence of cardiogenic stroke from Chinese echocardiograph reports


RESEARCH

Open Access

Lu Qin1†, Xiaowei Xu1†, Lingling Ding2, Zixiao Li2* and Jiao Li1*

From the 5th China Health Information Processing Conference, Guangzhou, China. 22-24 November 2019

Abstract

Background: The morbidity of cardiogenic stroke is increasing in China, placing an economic burden on patients' families. Echocardiography is one of the most important examinations in the diagnosis of cardiogenic stroke. Sonographers examine the patient's heart by echocardiography and describe their findings in echocardiograph reports. In this study, we developed a machine learning model that automatically identifies diagnosis evidence of cardiogenic stroke in these reports and provides it to neurologists for clinical decision making.

Methods: We collected 4188 Chinese echocardiograph reports from 4018 patients; the reports are free text with an average length of 177 Chinese characters. In collaboration with neurologists and sonographers, we summarized 149 phrases describing diagnosis evidence of cardiogenic stroke, such as “二尖瓣重度狭窄” (severe mitral stenosis) and “主动脉瓣退行性变” (aortic valve degeneration). We then built an annotated corpus by mapping the 149 phrases onto the 4188 reports, and selected the 11 most frequent diagnosis evidence types, such as “二尖瓣狭窄” (mitral stenosis), for identification. The corpus was divided into a training set and a testing set at a ratio of 8:2 and used to train and validate a BiLSTM-CRF model that identifies evidence of cardiogenic stroke.

Results: On diagnosis evidence identification, our method achieved an average precision, recall and F1-score of 98.03, 90.17 and 93.94%, respectively. In addition, the method can identify novel descriptions of diagnosis evidence of cardiogenic stroke, such as “二尖瓣中-重度狭窄” (moderate-to-severe mitral stenosis) and “主动脉瓣退行性病变” (degenerative lesion of the aortic valve).

Conclusions: In this study, we analyzed the structure of echocardiograph reports and summarized 149 phrases describing diagnosis evidence of cardiogenic stroke. We used the phrases to generate an annotated corpus automatically, which greatly reduces the cost of manual annotation.
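The phrase-to-report mapping and the 8:2 split described in the Methods can be illustrated with a minimal sketch. The abstract does not state the matching strategy or the tagging scheme, so the greedy longest-match lookup, the character-level BIO tags, the label names (`MS`, `AVD`) and the helper names `annotate` and `split_corpus` are all assumptions made for illustration, not the authors' implementation.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Hypothetical mini-lexicon: phrase -> evidence type label.
# The three phrases come from the abstract; the labels are illustrative only.
PHRASES: Dict[str, str] = {
    "二尖瓣重度狭窄": "MS",
    "二尖瓣狭窄": "MS",
    "主动脉瓣退行性变": "AVD",
}


def annotate(text: str, phrases: Dict[str, str]) -> List[Tuple[str, str]]:
    """Tag every character with B-/I-/O via greedy longest-match lookup (assumed scheme)."""
    tags = ["O"] * len(text)
    ordered = sorted(phrases, key=len, reverse=True)  # try longer phrases first
    i = 0
    while i < len(text):
        match = next((p for p in ordered if text.startswith(p, i)), None)
        if match is None:
            i += 1
            continue
        label = phrases[match]
        tags[i] = f"B-{label}"
        for j in range(i + 1, i + len(match)):
            tags[j] = f"I-{label}"
        i += len(match)  # skip past the matched phrase
    return list(zip(text, tags))


def split_corpus(reports: Sequence, train_ratio: float = 0.8, seed: int = 0):
    """Shuffle with a fixed seed and cut at train_ratio (8:2 by default)."""
    shuffled = list(reports)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    for char, tag in annotate("二尖瓣重度狭窄。", PHRASES):
        print(char, tag)
```

Sequences tagged this way are the kind of input a sequence labeller such as BiLSTM-CRF is typically trained on; a model trained on them may generalize to surface variants that exact phrase matching would miss.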
The model trained on this corpus also performs well on the testing set. The proposed method for automatically identifying diagnosis evidence of cardiogenic stroke will be further refined in practice.

Keywords: Cardiogenic stroke, Diagnosis evidence, Chinese echocardiograph reports, BiLSTM-CRF

* Correspondence: [email protected]; [email protected]
† Lu Qin and Xiaowei Xu contributed equally to this work.
2 Beijing Tiantan Hospital, Capital Medical University, Beijing, China
1 Institute of Medical Information, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China

© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you