Bisociative Literature-Based Discovery: Lessons Learned and New Word Embedding Approach



Nada Lavrač1,2 · Matej Martinc1,3 · Senja Pollak1 · Maruša Pompe Novak4 · Bojan Cestnik1,5

Received: 21 March 2020 / Accepted: 8 September 2020
© The Author(s) 2020

Abstract

The field of bisociative literature-based discovery aims at mining scientific literature to reveal as yet undiscovered connections between different fields of specialization. This paper outlines several outlier-based literature mining approaches to bridging term detection and the lessons learned from selected biomedical literature-based discovery applications. The paper also addresses new prospects in bisociative literature-based discovery, proposing an advanced embeddings-based technology for cross-domain literature mining.

Keywords: Literature-based discovery · Cross-domain bisociations · Computational creativity · Embeddings technology

* Bojan Cestnik

1 Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
2 University of Nova Gorica, Vipavska 13, 5000 Nova Gorica, Slovenia
3 Jožef Stefan International Postgraduate School, Jamova 39, 1000 Ljubljana, Slovenia
4 National Institute of Biology, Večna pot 111, 1000 Ljubljana, Slovenia
5 Temida d.o.o., Dunajska cesta 51, 1000 Ljubljana, Slovenia






New Generation Computing

Introduction

The growing amount of available knowledge and data exceeds human analytic capabilities. New technologies that help analyze and extract useful information from large amounts of data therefore need to be developed and used. Understanding complex phenomena and solving difficult problems often requires combining knowledge from different domains and considering cross-domain associations. While the concept of association is at the heart of several information technologies, including information retrieval and data mining, and in particular association rule learning [2], scientific discovery requires creative thinking to connect seemingly unrelated information, for example, by using metaphors or analogies between concepts from different domains. Such context-crossing associations, called bisociations [19], are often needed for innovative discoveries.

This paper addresses the computational creativity task of bisociative knowledge discovery from scientific literature, which we name bisociative literature-based discovery. This task lies at the intersection of two research areas: literature-based discovery [6] and bisociative knowledge discovery [3], which are briefly introduced below.

In literature-based discovery (LBD) [6], and in particular in cross-domain literature mining that addresses knowledge discovery from two (or more) initially separate document corpora, a crucial step is the identification of interesting bridging terms (b-terms) or links (b-links) that carry the potential of explicitly revealin