Data-driven discovery of formulas by symbolic regression
Theories and models in materials science and engineering are usually represented by mathematical formulas, which play vital roles in understanding materials behavior and performance and in developing advanced materials and technologies. Before machine learning (ML) became available, theories and models were developed, as they often still are, through intuition, derivation, or summarization of accumulated experience and knowledge of materials behavior and performance. This model-developing paradigm, however, requires long times and huge effort, as well as human genius. The rapid development of artificial intelligence (AI) techniques now provides the opportunity for automatic model construction. The new paradigm of data-driven science, which uses AI techniques such as ML to discover knowledge from data generated by both experiments and computational simulations, has rapidly expanded in the field of materials science and engineering. This paradigm, combined with expert domain knowledge, yields state-of-the-art methodologies for model development.¹

Langley developed an AI program in 1979 that used a few simple heuristics to solve a broad range of tasks (e.g., successful rediscovery of the ideal gas law, Kepler’s Third Law, Coulomb’s Law, Ohm’s Law, and Galileo’s Law).² These models are among the simplest ones. A scientist’s dream is to find natural laws in explicit analytic expressions automatically from data. In 2009, Schmidt and Lipson showed that it was possible to distill the underlying physical laws of two conservative systems, air-track oscillators and double pendulums, by employing symbolic regression (SR) based on genetic programming (GP).³ An air-track oscillator has fine holes in its surface through which air is pumped to form a layer that minimizes the friction experienced by one or several cars gliding on the track. Every car is connected to its nearest neighboring cars or to the track end by springs. In these two experiments, they recorded the positions of the cars on the air track and of the pendulums over time using motion-tracking software and then calculated the velocities and accelerations of the oscillators. They successfully rediscovered the underlying equations describing the designed systems: Hamiltonian or Lagrangian equations were recovered when coordinates and velocities were used as data, and the equation of motion was recovered when coordinates, velocities, and accelerations were all used as data. Besides SR, feature (or descriptor) selection using an ML technique termed the least absolute shrinkage and
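As a rough illustration of the workflow described above for the air-track experiment (positions recorded over time, velocities and accelerations derived from them, then a search for an equation of motion), the following is a minimal Python sketch. It is not Schmidt and Lipson's method: instead of genetic programming, it exhaustively scores a handful of hand-picked single-term candidates by least squares, and the data are synthetic positions of one idealized car on a spring. The names `omega`, `candidates`, and `fit_single_term`, and all numeric values, are assumptions made for this example.

```python
import numpy as np

# Synthetic "tracked" positions of one car on a spring: x(t) = A * cos(omega * t).
# omega, the amplitude, and the sampling grid are illustrative assumptions.
omega = 2.0
t = np.linspace(0.0, 10.0, 2001)
x = 0.5 * np.cos(omega * t)

# Velocities and accelerations estimated from positions by finite differences.
v = np.gradient(x, t)
a = np.gradient(v, t)

# Discard a few edge samples, where one-sided differences are less accurate.
interior = slice(5, -5)
x, v, a = x[interior], v[interior], a[interior]

# Hypothetical candidate library: single-term models of the form a = c * f(x, v).
candidates = {
    "x": x,
    "v": v,
    "x**2": x**2,
    "x*v": x * v,
    "v**2": v**2,
    "x**3": x**3,
}


def fit_single_term(feature, target):
    """Least-squares coefficient c for target = c * feature, and the resulting MSE."""
    c = float(feature @ target / (feature @ feature))
    mse = float(np.mean((target - c * feature) ** 2))
    return c, mse


# Score every candidate and keep the one with the smallest mean squared error.
results = {name: fit_single_term(f, a) for name, f in candidates.items()}
best_name = min(results, key=lambda name: results[name][1])
best_coeff = results[best_name][0]

print(f"best model: a = {best_coeff:.4f} * {best_name}")
```

For this noise-free oscillator the search selects the term `x` with a coefficient close to `-omega**2`, which is the equation of motion of a harmonic oscillator. A GP-based search would instead evolve expression trees over a much larger space of candidate formulas.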
Sheng Sun, Materials Genome Institute, Shanghai University, China
Runhai Ouyang, Materials Genome Institute, Shanghai University, China
Bochao Zhang, Materials Genome Institute, Shanghai University, China
Tong-Yi Zhang, Materials Genome Institute, Shanghai University, China
doi:10.1557/mrs.2019.156
© 2019 Materials Research Society • MRS BULLETIN • VOLUME 44 • JULY 2019 • www.mrs.org/