Embedding domain knowledge for machine learning of complex material systems

Artificial Intelligence Prospective

Christopher M. Childs, Washburn Laboratory, Department of Chemistry, Carnegie Mellon University, 4400 Fifth Avenue, Pittsburgh, PA 15213, USA

Newell R. Washburn, Department of Chemistry, Carnegie Mellon University, 4400 Fifth Avenue, Pittsburgh, PA 15213, USA; Department of Biomedical Engineering, Carnegie Mellon University, 4400 Fifth Avenue, Pittsburgh, PA 15213, USA

Address all correspondence to Newell R. Washburn at [email protected]

(Received 5 March 2019; accepted 26 June 2019)

Abstract

Machine learning (ML) has revolutionized disciplines within materials science that have been able to generate sufficiently large datasets to utilize algorithms based on statistical inference, but for many important classes of materials the datasets remain small. However, a rapidly growing number of approaches to embedding domain knowledge of materials systems are reducing data requirements and allowing broader applications of ML. Furthermore, these hybrid approaches improve the interpretability of the predictions, allowing for greater physical insights into the factors that determine material properties. This review introduces a number of these strategies, providing examples of how they were implemented in ML algorithms and discussing the materials systems to which they were applied.

Introduction

Many important materials are defined by a single underlying interaction or force, which allows modeling using analytical expressions having relatively few parameters. Examples include ferromagnets, where the magnetism is described by the exchange interactions between spins,[1] and elastomers, where the resistance to deformation is due to polymer chain entropy.[2] In contrast, the properties of complex materials are determined by multiple competing forces, the interplay of which leads to a rich diversity of physical properties and performance characteristics. Complex materials, such as complex fluids,[3] metal alloys,[4] and catalysts,[5] are ubiquitous, but predicting their properties remains a significant challenge.

Machine learning (ML) is a diverse collection of powerful techniques used to identify relationships in data, allowing for modeling and optimization of complex systems. With rapidly growing datasets available, ML has become a robust methodology applied across many materials disciplines and has been increasingly used in conjunction with the Materials Genome Initiative.[6–8] However, the traditional methods of ML are based only on statistical inference, requiring large datasets to develop predictive models that connect composition and processing with properties. While some disciplines within materials science, such as metallurgy[9] or heterogeneous catalysis,[10] have developed methods for high-throughput experimentation to produce sufficiently large datasets, most disciplines still use traditional methods of materials preparation and analysis, precluding the use of ML methods designed for Big Data.[11,