Plant Metabolomics Methods and Protocols

Estimation of the metabolite complement of plant material involves a wide range of techniques and technologies, and that breadth continues to increase. Metabolomics research typically involves multiple sites for material preparation and analysis and



1. Introduction

Data mining uses a wide range of modelling techniques involving machine learning, pattern recognition, statistics, and clustering algorithms (1–3). In metabolomics, data mining is performed either in a hypothesis-driven fashion, where it seeks an answer to a preset research question, or in a data-driven fashion, where it seeks to discover patterns, trends, or associations which might be completely different from those intended when the data were originally acquired. However, hypothesis-driven and data-driven investigations can both be seen as part of the knowledge cycle (2), where each might lead to the other. The first is used for deducing knowledge through testing a preset hypothesis, while the second might be used for inducing knowledge from data and generating new hypotheses for further investigations (2, 3).
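The contrast between the two modes can be illustrated with a minimal sketch. It is not taken from the chapter: the intensity values are invented, and the helper names `welch_t` and `two_means_1d` are hypothetical. The hypothesis-driven part tests a preset question about one metabolite using Welch's t statistic, while the data-driven part ignores the group labels and lets a very small two-cluster k-means find structure in the same values. The clustering helper assumes the values are not all identical.

```python
from math import sqrt
from statistics import mean, stdev

# Invented intensities of a single metabolite in two sample groups.
control = [10.1, 9.8, 10.4, 10.0, 9.9]
treated = [12.0, 11.7, 12.3, 11.9, 12.2]


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / sqrt(var_a / len(a) + var_b / len(b))


def two_means_1d(values, iterations=20):
    """Tiny 1-D k-means with k=2, seeded at the minimum and maximum.

    Assumes the values are not all equal, so both clusters stay non-empty.
    """
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = mean(low), mean(high)
    return sorted(low), sorted(high)


# Hypothesis-driven: a preset question ("does treatment change this metabolite?").
t_stat = welch_t(treated, control)

# Data-driven: discard the labels and look for groupings in the pooled values.
low_group, high_group = two_means_1d(control + treated)

print(f"Welch t statistic: {t_stat:.2f}")
print("Clusters found without labels:", low_group, high_group)
```

In this toy case the unlabelled clusters happen to coincide with the two groups, which is the kind of observation that could feed a new hypothesis back into the knowledge cycle.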

Nigel W. Hardy and Robert D. Hall (eds.), Plant Metabolomics: Methods and Protocols, Methods in Molecular Biology, vol. 860, DOI 10.1007/978-1-61779-594-7_18, © Springer Science+Business Media, LLC 2012


A.H. BaniMustafa and N.W. Hardy

Formalizing a strategy for conducting data mining, one which focuses on providing a mechanism for the selection of data mining techniques, provides several benefits. It encourages the achievement of the aims of a metabolomics study and helps to ensure that the choice of technique is justifiable throughout the analysis. It also provides traceability of the procedures applied and, ultimately, supports the reproducibility of the investigation outcomes. In this chapter, we describe a strategy for selecting data mining modelling techniques. In Subheading 2, we provide an overview of the inputs required for the selection, while in Subheading 3 we describe the methods to be used for performing the steps of the strategy. Notes are provided to define concepts, suggest alternatives, or expand the discussion.

2. Materials (Inputs for the Selection)

Here, we describe the important inputs to the selection of techniques. The first focuses on understanding the aims of the metabolomics study and their relation to the research investigation and the data acquisition assays (see Note 1). The second input is related to the understanding of the general goals of data mining, the tasks which are performed and the techniques used to achieve these goals. The third concerns the nature and quality of metabolomics data. In addition to the inputs discussed in this section, it is also important to consider other factors concerning the application of the techniques in practice. These include data pre-processing and data acclimatization in addition to management and technical issues such as planning, project management, feasibility, and the availability of software tools and expertise (4–6).

2.1. The Aims of a Metabolomics Study

Data mining modelling techniques are used in metabolomics, either in an hypothesis-driven or in a data-driven fashion, to fulfil the aims of a study and consequently answer the question of the research investigation. Accordingly, the aims of a metabolom
