Materials science with large-scale data and informatics: Unlocking new opportunities


Introduction Data-intensive science has been described as the “fourth paradigm” for scientific exploration, with the first three being experiments, theory, and simulation.1 While the value of data-intensive research approaches is becoming more apparent, the field of materials science has not yet experienced the same widespread adoption of these methods (as has occurred in the biosciences,2 astronomy,3 and particle physics4). Nonetheless, the potential impact of data-driven materials science is tremendous: Materials informatics could shorten the typical 10–20 year development and commercialization cycle5 for new materials. We see plentiful opportunities to use data and data science to radically reduce this timeline and, more generally, to advance materials research and development (R&D) and manufacturing. In this article, we discuss the current state of affairs with respect to data and data analytics in the materials community, with particular emphasis on the thorny challenges and promising initiatives that exist in the field. We conclude with a set of near-term recommendations for materials-data stakeholders. Our goal is to demystify data analytics and give readers from any subdiscipline within materials research enough information to understand how informatics techniques could apply to their own workflows.

Challenges surrounding data: The status quo in materials There are five principal barriers to broader data sharing and large-scale meta-analysis within the field of materials science. This section enumerates these barriers and discusses each in depth: (1) opaque buzzwords in materials informatics, which prevent a typical materials scientist from readily seeing how data-driven methods could apply to their work; (2) idiosyncrasies in individual researchers’ preferred data workflows; (3) a wide variety of stakeholders, who hail from correspondingly diverse research areas and often have conflicting goals; (4) limited availability of structured data and agreed-upon data standards; and (5) a lack of clear incentives to share data.

Proliferation of buzzwords Like many areas of science, materials informatics is unfortunately hamstrung by the proliferation of buzzwords whose meanings are not clear to researchers in the broader materials community. To a first approximation, machine learning, data mining, and artificial intelligence are interchangeable terms that refer to the use of algorithms to approximately model patterns in data. Materials informatics, in analogy to bioinformatics,

Joanne Hill, Citrine Informatics, USA
Gregory Mulholland, Citrine Informatics, USA
Kristin Persson, Lawrence Berkeley National Laboratory, USA
Ram Seshadri, University of California, Santa Barbara, USA
Chris Wolverton, Northwestern University, USA
Bryce Meredig, Citrine Informatics, USA
doi:10.1557/mrs.2016.93

© 2016 Materials Research Society

MRS BULLETIN • VOLUME 41 • MAY 2016 • www.mrs.org/bulletin
