
SHORT COMMUNICATION

A Case Study of the Incremental Utility for Disease Identification of Natural Language Processing in Electronic Medical Records

Lisa S. Weiss¹ • Xiaofeng Zhou¹ • Alexander M. Walker² • Ashwin N. Ananthakrishnan³ • Rongjun Shen¹ • Rachel E. Sobel¹ • Andrew Bate¹ • Robert F. Reynolds¹

© Springer International Publishing AG, part of Springer Nature 2017

Abstract

Background Information exists as unstructured medical text in healthcare databases. Such information is not routinely considered in safety surveillance, which typically relies solely on structured (coded) data. Natural language processing (NLP) may allow the capture of concepts from unstructured data and thus enhance safety surveillance capability.

Objectives We sought to assess the added contribution of unstructured data extracted from medical text by NLP for detecting acute liver dysfunction (ALD) in patients with inflammatory bowel disease (IBD).

Methods Using a previously developed rule, we evaluated structured and unstructured NLP-extracted terms from a commercially available electronic medical record (EMR) system. The rule was intended to identify ALD diagnosis and timing of onset and was the result of three iterations of rule development using 150 ALD candidate cases. We evaluated the performance of the rule with and without NLP terms among all candidate cases and among 50 new cases with clinical adjudication.

Results NLP terms were necessary for the diagnosis of 9% of cases and for ruling out 3% of false-positive cases. Inclusion of NLP terms led to the identification of an additional 9% of ALD-onset dates, with consequent earlier recognition in 5%.

Conclusions NLP-derived terms in one large commercially available EMR system modestly improved the sensitivity and specificity of ALD identification and identified earlier onset.
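As a rough illustration of the with/without-NLP comparison described in the Methods, the sketch below scores a toy case-identification rule twice against adjudicated labels: once using structured data only and once with NLP-derived terms added. It is not the rule used in the study. The `Case` fields, the logic in `rule_flags_ald`, and the four example records are all assumptions invented for illustration, and the resulting figures say nothing about the study's actual results.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Case:
    # All fields are hypothetical simplifications, not the study's variables.
    structured_supports_ald: bool  # coded (structured) data satisfy the rule
    nlp_supports_ald: bool         # an NLP-extracted term supports ALD
    nlp_excludes_ald: bool         # an NLP-extracted term suggests another explanation
    adjudicated_ald: bool          # clinical adjudication, used as the reference standard


def rule_flags_ald(case: Case, use_nlp: bool) -> bool:
    """Toy stand-in for a case-identification rule (assumed logic)."""
    if not use_nlp:
        return case.structured_supports_ald
    if case.nlp_excludes_ald:
        return False  # NLP term rules out a would-be false positive
    return case.structured_supports_ald or case.nlp_supports_ald


def sensitivity_specificity(cases: List[Case], use_nlp: bool) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of the rule against adjudication."""
    tp = fp = tn = fn = 0
    for case in cases:
        flagged = rule_flags_ald(case, use_nlp)
        if flagged and case.adjudicated_ald:
            tp += 1
        elif flagged and not case.adjudicated_ald:
            fp += 1
        elif not flagged and case.adjudicated_ald:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Four made-up records, one per situation of interest.
TOY_CASES = [
    Case(True, False, False, True),    # found by structured data alone
    Case(False, True, False, True),    # found only when NLP terms are included
    Case(True, False, True, False),    # structured false positive ruled out by NLP
    Case(False, False, False, False),  # non-case under either version of the rule
]

if __name__ == "__main__":
    for use_nlp in (False, True):
        sens, spec = sensitivity_specificity(TOY_CASES, use_nlp)
        label = "with NLP" if use_nlp else "structured only"
        print(f"{label}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Keeping the rule as a single function with a `use_nlp` switch means both evaluations run over the same cases and the same reference labels, so any difference in sensitivity or specificity is attributable to the NLP terms alone.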

Lisa S. Weiss and Xiaofeng Zhou contributed equally to this manuscript and, as joint first authors, both take responsibility for the integrity of the work as a whole.

1 Epidemiology, Research and Development, Worldwide Safety and Regulatory, Pfizer, New York, NY 10017, USA

2 WHISCON, Newton, MA 02466, USA

3 Division of Gastroenterology, Crohn’s and Colitis Center, Massachusetts General Hospital, Boston, MA 02114, USA

L. S. Weiss et al.

2 Patients and Methods

2.1 Study Design

Key Points

The contribution of natural language processing (NLP) terms to identifying additional cases and to dating case onset earlier highlights its potential value for case identification and for earlier, more accurate safety signal detection.

NLP data helped distinguish common sources of liver dysfunction in patients with inflammatory bowel disease to