PVAF: an environment for disambiguation of scientific publication venues



Tiago Antônio Paraizo¹ · Denilson Alves Pereira¹

Received: 19 November 2019 / Revised: 28 May 2020 / Accepted: 13 July 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract

A publication venue authority file stores variants of the names of journals and conferences that publish scientific articles. It is useful in the construction of search tools and in data disambiguation, and it is of special interest to agencies that fund research and evaluate graduate programs, which use the quality of publication venues as a basis for evaluating the publications of researchers and research groups. However, keeping an authority file up to date is not a trivial task: different names are used to refer to the same publication venue, venues sometimes change their names, new venues emerge regularly, and journal bibliometrics are updated frequently. This paper presents the publication venue authority file (PVAF), an environment for the disambiguation of scientific publication venues. It consists of an authority file and a set of tools for updating and querying its data. We describe and experimentally evaluate each of these tools. We also propose a search algorithm based on an associative classifier, which allows for incremental updates of its learning model. The results show that the PVAF has coverage greater than 86% for publication venues in several fields of knowledge, and that its tools attain good accuracy in the classification of publication venues from curricula vitae formatted in various citation styles.

Keywords Authority file · Publication venue · Citation · Data disambiguation · Associative classifier · Incremental learning
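To make the notion of an authority file concrete, the following is a minimal sketch of a structure that maps name variants to a single venue identifier. It is an illustration only: the class `AuthorityFile`, the helper `normalize`, the venue identifier, and the example venue names are all hypothetical, and the exact-match lookup shown here is a simplification, not the associative-classifier search proposed in the paper.

```python
import re
import unicodedata
from typing import Dict, Optional


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return re.sub(r"[^a-z0-9]+", " ", without_accents.lower()).strip()


class AuthorityFile:
    """Hypothetical authority file: many name variants, one venue id."""

    def __init__(self) -> None:
        self._variant_to_venue: Dict[str, str] = {}

    def add_variant(self, venue_id: str, variant: str) -> None:
        # Store the normalized variant so lookups ignore case and punctuation.
        self._variant_to_venue[normalize(variant)] = venue_id

    def lookup(self, name: str) -> Optional[str]:
        # Exact match on the normalized form; returns None for unknown names.
        return self._variant_to_venue.get(normalize(name))


# Illustrative data only; the venue and its identifier are made up.
authority_file = AuthorityFile()
authority_file.add_variant("venue-001", "Journal of Example Studies")
authority_file.add_variant("venue-001", "J. Example Stud.")

venue = authority_file.lookup("j. example STUD")
```

An exact-match table like this only resolves variants that are already stored, which is one reason a learned search component is of interest for names that appear in unseen forms.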

✉ Denilson Alves Pereira [email protected]
Tiago Antônio Paraizo [email protected]

¹ Department of Computer Science, Federal University of Lavras, Lavras PO Box 3037, 37.200-900, Brazil

Footnotes:
¹ http://www.capes.gov.br/
² https://www.webofknowledge.com/JCR
³ https://www.scopus.com/sources

1 Introduction

Research funding and program evaluation agencies use the impact of scientific publications to make decisions about project funding and to evaluate graduate programs. In Brazil, the Coordination for the Improvement of Higher Education Personnel (Capes)¹ created Qualis, a metric used to classify the scientific production of graduate programs based on the reputation of the publication venues of the articles published by each program. Other internationally recognized metrics also evaluate the quality of publication venues, such as the Journal Impact Factor (JIF)², the Scopus CiteScore³, the SCImago Journal & Country Rank (SJR)⁴ and Google Scholar metrics⁵. However, for these metrics to be used effectively by information retrieval systems, the name of the publication venue (journal, conference, workshop) must be recognized in the string of characters associated with each citation (here, a citation is a bibliographic record containing features of a particular publication, such as author names, work t