Data Mining in and around Crystal Structure Databases



Introduction
Data mining is an activity that processes existing data with the aim of creating new information. In principle, mining applies to any kind of data, such as census data, accident records, stock exchange data, or geochemical archives, and entire journals, such as the SIGMOD Record, are devoted to this activity. We restrict the concept here to the mining of crystallographic data stored in crystal structure databases. The mining activity can be as straightforward as the derivation of statistics on entries, such as space-group frequencies. It can also be mind-boggling, like the establishment of complex physical properties never measured before, based on just the knowledge of space group, cell data, and atom positions. An example of such a complex physical property is a radial plot of the small shear coefficient in lonsdaleite, a carbon polymorph related to diamond, which can be found in Reference 1.
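The simplest kind of mining mentioned above, a space-group frequency count, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a `name` and a `space_group` field) and the function name `space_group_frequencies` are hypothetical and do not correspond to the export format of any particular database, and the five entries are a toy list rather than a database sample.

```python
from collections import Counter

# Toy entries for illustration only; the field names are assumptions,
# not the schema of any real crystal structure database.
entries = [
    {"name": "diamond", "space_group": "Fd-3m"},
    {"name": "lonsdaleite", "space_group": "P6_3/mmc"},
    {"name": "NaCl", "space_group": "Fm-3m"},
    {"name": "MgO", "space_group": "Fm-3m"},
    {"name": "Cu", "space_group": "Fm-3m"},
]


def space_group_frequencies(records):
    """Return (space-group symbol, count) pairs, most frequent first."""
    counts = Counter(record["space_group"] for record in records)
    return counts.most_common()


frequencies = space_group_frequencies(entries)
for symbol, count in frequencies:
    print(f"{symbol}: {count}")
```

On a real database the same count would run over many thousands of entries, and the space-group symbols would first have to be normalized to one notation.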

Purpose of Mining Database Entries for Materials Science
Pinpointing a material with an optimal combination of physical properties dictated by a given purpose is the ultimate goal of data mining for materials science. As physical properties of materials are accessible to quantum mechanical calculations, there are no theoretical obstacles to achieving this goal some day, but there are still practical ones. The ongoing search for ultrahard materials, summarized, for example, in Reference 2, underscores the fact that the state of the art is not yet close to that goal. What data mining can currently do is propose a list of candidate materials satisfying a number of criteria. Some of these criteria are mandatory for the very existence of the property, while others are semiempirical criteria for selecting materials with ballpark values for the property. The result is a short list of candidate materials that are then screened experimentally for the best combination of properties.
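The two kinds of criteria described here can be sketched as a two-stage filter: mandatory rules remove entries outright, and semiempirical rules only rank the survivors. The sketch below is an assumption-laden illustration, not a method taken from the article. The `Candidate` fields, the `screen` function, the threshold, and all numerical values are hypothetical. The mandatory rule used as an example is the absence of a center of symmetry, as would be required for a property such as piezoelectricity.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    label: str
    centrosymmetric: bool
    density: float  # g/cm^3; invented values, for illustration only


def screen(candidates, required, preferred):
    """Keep candidates passing every required rule, ranked by preferred rules met."""
    passed = [c for c in candidates if all(rule(c) for rule in required)]
    # sorted() is stable, so candidates with equal scores keep their input order.
    return sorted(passed, key=lambda c: sum(rule(c) for rule in preferred), reverse=True)


# Hypothetical entries; none of these values come from a real database.
candidates = [
    Candidate("A", centrosymmetric=True, density=3.0),
    Candidate("B", centrosymmetric=False, density=2.0),
    Candidate("C", centrosymmetric=False, density=5.0),
]

required = [lambda c: not c.centrosymmetric]  # mandatory for the property to exist
preferred = [lambda c: c.density > 3.5]       # semiempirical "ballpark" rule

short_list = screen(candidates, required, preferred)
print([c.label for c in short_list])
```

Keeping the mandatory and the semiempirical rules in separate lists mirrors the distinction drawn in the text: a failed mandatory rule is final, whereas a failed semiempirical rule only lowers a candidate's rank before experimental screening.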

MRS BULLETIN • VOLUME 31 • DECEMBER 2006 • www.mrs.org/bulletin

Crystal Structure Databases
The phenomenon of x-ray diffraction was discovered in 1912 by von Laue, and the determination of crystal structures for elements and simple compounds quickly followed. Books rationalizing the known crystal structures started coming out in the early 1920s. In 1931, the first volume of the authoritative series Strukturbericht (Reference 3; later renamed Structure Reports) appeared, exhaustively covering the period from 1913 to 1928. Sufficient details were printed to allow recreation of a crystal structure without having to consult the original paper. The number of crystal structures published each year remained manageable in paper form until the early 1960s. The advent in quick succession of least-squares algorithms on solid-state computers, powerful structure-solution methods, and automated diffractometry considerably increased the structure publication rate. It is no accident that electronic edited crystal structure databases were created around that date, splitting the field into metals, inorganics, and organics for reasons of literature-scanning convenienc