The Design of a Software Environment for Organizing, Sharing, and Archiving Materials Data


I. INTRODUCTION

A. Motivation

PRESENT-DAY computing and data acquisition capabilities easily generate very large data sets. Fine spatial discretization and temporal resolution are often combined to produce impressive amounts of data in a single experiment or simulation. Handling the data is a challenge that raises two questions: how the data are to be reduced for interpretation, and how they can be archived so that they are neither lost over time nor rendered useless by becoming disconnected from critical, descriptive information. These issues matter because some experimental procedures, such as serial sectioning, take days, weeks, or even longer to complete and process. Others make intensive use of expensive facilities such as synchrotron light sources or parallel computing systems. Data sets become valuable assets not only to those directly involved with their creation, but also to many other individuals who may wish to use the data in their own work. However, to exploit the data’s value, individuals must be able to locate the data, to determine their relevance from the associated descriptive information, and to access them readily.

In this article, we describe the design and implementation of a software system for organizing, archiving, and sharing large, diverse materials data sets. To meet expectations, the system should be capable of the following: (1) disseminating data in their completeness beyond what is possible by the publication of images, plots, or discussions found in archival journal publications; (2) enabling activities, such as modeling, that integrate complementary but distinct data sets; (3) facilitating technical interactions among researchers who bring a wide array of resources to bear on a common topic; and (4) archiving data so that they may be reliably accessed by any researcher, whether that person is the data contributor or someone else who is working in a laboratory or on a project unrelated to that of the contributor.

DONALD E. BOYCE, Programmer/Analyst, and PAUL R. DAWSON and MATTHEW P. MILLER, Professors, are with Cornell University, Ithaca, NY 14853. Contact e-mail: deb14@cornell.edu. Manuscript submitted November 20, 2008. Article published online August 19, 2009. METALLURGICAL AND MATERIALS TRANSACTIONS A

To address these goals, the primary requirements of the system include the following: a clear organizational structure, the ability to handle large data sets, searchability by germane descriptors, and a basic visualization capability. The system should be flexible and maintainable, so that it can adapt readily as its use develops and its requirements evolve. The system should be robust, accepting many types of data while assuring the data’s integrity through the processes for entering data sets, archiving them, and retrieving them. Finally, the system should be easy to use, so that it can be readily incorporated into a person’s workflow.

We begin the description with a review of related literature. In Section II, the core structure and details of the system’s impl