Rediscovering Chemistry with the Bacon System


BACON.4 is a production system that discovers empirical laws. The program represents information at varying levels of description, with higher levels summarizing the levels below them. BACON.4 employs a small set of data-driven heuristics to detect regularities in numeric and nominal data. These heuristics note constancies and trends, causing BACON.4 to formulate hypotheses, to define theoretical terms, and to postulate intrinsic properties. The introduction of intrinsic properties plays an important role in BACON.4's rediscovery of Ohm's law for electric circuits and Archimedes' law of displacement. When augmented with a heuristic for noting common divisors, the system is able to replicate a number of early chemical discoveries, arriving at Proust's law of definite proportions, Gay-Lussac's law of combining volumes, Cannizzaro's determination of the relative atomic weights, and Prout's hypothesis. The BACON.4 heuristics, including the new technique for finding common divisors, appear to be general mechanisms applicable to discovery in diverse domains.

10.1 INTRODUCTION
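To make the idea of "noting common divisors" mentioned in the abstract concrete, the following is a minimal Python sketch of one way an approximate common divisor could be found in noisy numeric data. It is not BACON.4's actual heuristic, which is not specified in this passage. The function name `approx_common_divisor`, the parameters `max_k` and `tol`, and the sample numbers are all hypothetical and chosen only for illustration.

```python
def approx_common_divisor(values, max_k=10, tol=0.05):
    """Search for a divisor d such that every value is close to an
    integer multiple of d.

    Illustrative assumption, not BACON.4's method: candidates are
    d = min(values) / k for k = 1..max_k, tried from largest to
    smallest. Returns (d, multiples) for the first candidate that
    fits within `tol`, or None if no candidate fits.
    """
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be a non-empty list of positive numbers")

    smallest = min(values)
    for k in range(1, max_k + 1):
        d = smallest / k                      # candidate divisor
        ratios = [v / d for v in values]      # how many times d fits into each value
        # Accept d only if every ratio is within tol of a whole number.
        if all(abs(r - round(r)) <= tol for r in ratios):
            return d, [round(r) for r in ratios]
    return None


# Made-up measurements with a little noise (not real chemical data).
print(approx_common_divisor([4.02, 6.0, 9.97]))
```

With these made-up values the search rejects d = 4.02 (because 6.0 / 4.02 is not near an integer) and accepts the next candidate, d = 2.01, with integer multiples 2, 3, and 5. The tolerance controls how much measurement noise is forgiven; a looser tolerance finds divisors more readily but also risks accepting spurious ones.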

The years between 1800 and 1860 were active ones for chemistry. They saw the first quantitative measures of chemical reactions, the revival of the atomic theory, the painstaking determination of atomic weights, and the crowning success of the periodic table. The evolution of chemical thought has many parallels to the development of early physics in the previous century, but many differences may be found as well. These similarities and differences have led us to apply our ideas about the discovery process, initially drawn from early physics, to the domain of chemistry. In this paper we report the results of that effort.

R. S. Michalski et al. (eds.), Machine Learning © Springer-Verlag Berlin Heidelberg 1983

BACON.4 is the fourth in a line of discovery systems developed by the authors. The earlier programs in this series merit some discussion, since their successes and failures have led directly to the current system. The prototype system, BACON.1 [Langley, 1978], can be viewed as an implementation of the General Rule Inducer proposed by Simon and Lea [1974]. The program showed considerable generality by solving sequence extrapolation tasks, learning conjunctive and disjunctive concepts, and discovering simple physical laws. BACON.2 [Langley, 1979] included additional heuristics for dealing with sequential information; these let the program note recurring sequences of symbols and discover complex polynomial functions (including Bode's law) by examining differences. BACON.3 [Langley, 1981] represented information at increasing levels of description, with higher levels describing more complex laws and accounting for more of the original data. This extended representation enabled the system to treat its hypotheses as new data, to which its heuristics could be applied recursively. BACON.3 successfully rediscovered versions of the ideal gas law, Coulomb's law, Kepler's third law, Ohm's law