Communication and re-use of chemical information in bioscience


BioMed Central

Open Access

Commentary

Peter Murray-Rust*¹, John BO Mitchell¹ and Henry S Rzepa²

Address: ¹Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK and ²Department of Chemistry, Imperial College London, SW7 2AY, UK

Email: Peter Murray-Rust* - [email protected]; John BO Mitchell - [email protected]; Henry S Rzepa - [email protected]

* Corresponding author

Published: 18 July 2005 BMC Bioinformatics 2005, 6:180

doi:10.1186/1471-2105-6-180

Received: 17 May 2005 Accepted: 18 July 2005

This article is available from: http://www.biomedcentral.com/1471-2105/6/180

© 2005 Murray-Rust et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The current methods of publishing chemical information in bioscience articles are analysed. Using three papers as use-cases, it is shown that conventional methods relying on human procedures, including cut-and-paste, are time-consuming and introduce errors. The meaning of chemical terms and the identity of compounds are often ambiguous. Valuable experimental data such as spectra and computational results are almost always omitted. We describe an Open XML architecture, at proof-of-concept stage, which addresses these concerns. Compounds are identified through explicit connection tables or links to persistent Open resources such as PubChem. It is argued that if publishers adopt these tools and protocols, the quality and quantity of chemical information available to bioscientists will increase, and authors, publishers and readers will find the process cost-effective.

Introduction

In a previous article [1] we argued the value of extracting the chemical information in bioscientific research, transforming it to XML and redisseminating it openly. The present article expands on the technical and cultural infrastructure required to support this. The technical aspects have been solved to proof-of-concept stage and we are starting to embark on experiments in the social domain. We thank BMC for inviting us to submit this article, in which we present a model that we believe could be attractive for bioscience publishers and their community.

We concentrate on the current publication of chemistry in bioscience. This includes:

1. mention of chemical compounds.
2. details of synthesis (in vivo and in vitro) of compounds.
3. proof of structure (spectra and analytical data).
4. methods and reagents in bioscience protocols.
5. properties of compounds.
6. reactions and their properties, both in enzymes and enzyme-free systems.

This type of chemistry is very well understood and has a simple ontology which has not changed over decades [2]. Unlike much bioscience, where ontological tools are an