Generalized machine learning technique for automatic phase attribution in time variant high-throughput experimental stud


Shizhong Han,b) Yan Zhang, Yan Tong, and Jianjun Hu
Department of Computer Science and Engineering, University of South Carolina, Columbia, South Carolina 29208, USA

Jason R. Hattrick-Simpers a)
Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, USA; and SmartState Center for the Strategic Approaches to the Generation of Electricity, University of South Carolina, Columbia, South Carolina 29208, USA

(Received 16 December 2014; accepted 5 March 2015)

Phase identification is an arduous task in high-throughput processing experiments, and it is made harder by the need to reconcile results from multiple measurement techniques to form a holistic understanding of phase dynamics. Here, we demonstrate AutoPhase, a machine learning algorithm that can identify the presence of different phases in spectral and diffraction data. The algorithm uses training data to determine the characteristic features of each phase present and then uses these features to evaluate new spectral and diffraction data. AutoPhase was used to identify oxide phase growth during a high-throughput oxidation study of NiAl bond coats that used x-ray diffraction, Raman, and fluorescence spectroscopic techniques. The algorithm had a minimum overall accuracy of 88.9% for unprocessed data and 98.4% for postprocessed data. Although the features selected by AutoPhase for phase attribution were distinct from those of topical experts, these results show that AutoPhase can substantially increase the throughput of high-throughput data analysis.

Contributing Editor: Susan B. Sinnott
a) Address all correspondence to this author. e-mail: [email protected]
b) Authors contributed equally to the work.
This paper has been selected as an Invited Feature Paper.
DOI: 10.1557/jmr.2015.80
J. Mater. Res., Vol. 30, No. 7, Apr 14, 2015

I. INTRODUCTION

Since its announcement three years ago, the Materials Genome Initiative (MGI) has resulted in a marked uptick in the number and quality of new materials discovered using theoretical approaches such as density functional theory (DFT) and calculated phase diagrams (CALPHAD). Recent reports have highlighted that, to fully realize the aspirations of mapping the materials genome, the theoretical–experimental loop must be closed.1,2 In such closed-loop studies, experimental and theoretical work are carried out in parallel, with curated data made available across the computation/experimentation divide to guide and refine the progress of both efforts. High-throughput experimental (HTE) methodologies have been proposed by several recent reports as the key technology for providing the empirical databases and validation studies necessary to strengthen the computational aspirations of the MGI.1–4 In the HTE approach, tens to hundreds of samples are synthesized in parallel, processed, and then rapidly characterized via either parallel or serial measurement techniques. The approach was initially bottlenecked by the lack of sufficiently reliable and rapid characterization tools; however, today a myriad of