Natural Language Processing in Health Care and Biomedicine

Carol Friedman and Noémie Elhadad

After reading this chapter, you should know the answers to these questions:
• Why is natural language processing important?
• What are the potential uses for natural language processing (NLP) in the biomedical and health domains?
• What forms of knowledge are used in NLP?
• What are the principal techniques of NLP?
• What are challenges for NLP in the clinical, biological, and health consumer domains?

8.1 Motivation for Natural Language Processing

Natural language is the primary means of human communication. In biomedical and health areas,¹ knowledge and data are disseminated in textual form as articles in the scientific literature, as technical and administrative reports on the Web, and as textual fields in databases. In health care facilities, patient information mainly occurs in narrative notes and reports. Because of the growing adoption of electronic health records and the promise of health information exchange, it is common for a patient to have records at multiple facilities, and for a chart of a single patient at one institution to comprise several hundred notes. Because of the explosion of online textual information available, it is difficult for scientists and health care professionals to keep up with the latest discoveries, and they need help to find, manage, and analyze the enormous amounts of online knowledge and data. On the Web, individuals exchange and look for health-related information, and health consumers and patients are often overwhelmed by the amount of information available to them, whether in traditional websites or through online health communities. There is also much information disseminated verbally through scientific interactions in conferences, in care teams at hospitals, and in patient-doctor encounters. In this chapter, however, we focus on the written form.

¹ Unless stated otherwise, the general domain and the topics of text materials discussed in this chapter refer to biomedicine and health.

C. Friedman, PhD (*) • N. Elhadad, PhD Department of Biomedical Informatics, Columbia University, 622 West 168th Street, VC Bldg 5, New York 10032, NY, USA

This chapter is adapted from an earlier version in the third edition authored by Carol Friedman and Stephen B. Johnson.

E.H. Shortliffe, J.J. Cimino (eds.), Biomedical Informatics, DOI 10.1007/978-1-4471-4474-8_8, © Springer-Verlag London 2014

While there is valuable information conveyed in text, it is not in a format amenable to further computer processing. Texts are difficult to process reliably because of the inherent characteristics and variability of language. Since structured standardized data are more useful for most automated applications, a significant amount of manual work is currently devoted to mapping textual information into a structured or coded representation: in the clinical realm, for instance, professional coders assign billing codes correspo