An Improved Process for the Creation, Maintenance, and Documentation of Analysis-ready Data


Allan Glaser, Associate Director, Scientific Programming, Merck & Co., Blue Bell, Pennsylvania

Key Words Analysis data sets; Process improvement; Data dictionary

Correspondence Address Allan Glaser, Merck & Co., Inc., Mail Stop UNA-102, 785 Jolly Road, Blue Bell, PA 19422.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the United States and other countries. ® indicates US registration.


INTRODUCTION The collection, analysis, and reporting of clinical trial data are inherently complex. As data move through various stages of the process, they typically reside in two disparate repositories. The first repository is usually based on a commercially available database management system and provides a foundation of robust data structures that are well defined and documented. The structures in this repository tend to reflect the source data. The second repository is intended to contain manipulated data (ie, data more amenable to analysis and reporting). It is not uncommon for that repository to be incompletely defined or poorly documented, and it may not provide an optimal foundation for later work. It is essentially a nonstandardized and minimally controlled environment. Inherent weaknesses may manifest themselves through technical errors, frustration, rework, and ultimately reduced productivity. The intent of this article is to present a general methodology for at least partially improving this situation.

Drug Information Journal, Vol. 40, pp. 331-335, 2006. Printed in the USA. All rights reserved. Copyright © 2006

The development of analysis-ready data for clinical trials is a complicated process that is often problematic. Not only are the data inherently complex, but also analytical and reporting requirements tend to evolve over a period of time. Additional difficulties arise because of the number of individuals working with the data and the varied uses of the data. A new approach is described that greatly facilitates the creation, maintenance, and documentation of these data. Results are encouraging and include contributing to high quality and improved productivity.

ANALYSIS DATA SETS The second repository is usually developed with SAS and related computer programming tools and typically consists of a collection of SAS data sets. The constituent data sets may include information at the project, protocol, patient, visit, and event levels, and the overall structure of these data sets is critical for the proper understanding and analysis of the data. Each data set has intrinsic attributes, such as its name, a label that identifies its function, and its size. Similarly, each variable within the data sets has intrinsic attributes, including its name, label, type (eg, numeric or character), length, format, and row position. For example, the variable "age" may have the label "Patient Age," be defined as a 4-byte numeric, have a for
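The data set and variable attributes listed above can be pictured as a small data dictionary record. The following is a minimal sketch in Python, not the article's method. The class names, the field names, the `find_problems` helper, and the data set name `demog` are all hypothetical and were chosen only for illustration. The format of "age" is left unset because the source text does not state it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class VariableAttributes:
    """Intrinsic attributes of one variable (hypothetical record layout)."""
    name: str
    label: str
    var_type: str                   # "numeric" or "character"
    length: int                     # storage length in bytes
    fmt: Optional[str] = None       # display format; None when not specified
    position: Optional[int] = None  # position within the data set, if known


@dataclass(frozen=True)
class DataSetAttributes:
    """Intrinsic attributes of one data set plus its variables."""
    name: str
    label: str
    variables: Tuple[VariableAttributes, ...]


def find_problems(var: VariableAttributes) -> List[str]:
    """Return human-readable problems found in one dictionary entry."""
    problems = []
    if var.var_type not in ("numeric", "character"):
        problems.append(f"{var.name}: unknown type {var.var_type!r}")
    if var.length <= 0:
        problems.append(f"{var.name}: length must be positive")
    if not var.label.strip():
        problems.append(f"{var.name}: label is missing")
    return problems


# The "age" example from the text: label "Patient Age", 4-byte numeric.
age = VariableAttributes(name="age", label="Patient Age",
                         var_type="numeric", length=4)

# Hypothetical data set holding that variable.
demog = DataSetAttributes(name="demog",
                          label="Patient-level data (hypothetical)",
                          variables=(age,))

for variable in demog.variables:
    print(variable.name, find_problems(variable))
```

Keeping each attribute as an explicit field means an incomplete entry, such as a variable with no label, can be flagged by a simple check before anyone relies on the data set.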
