Rank Aggregation for Candidate Gene Identification


Abstract Differences in molecular processes are reflected, among other things, by differences in the gene expression levels of the involved cells. High-throughput methods such as microarrays and deep sequencing are increasingly used to obtain these expression profiles. Often, differences in gene expression across conditions, such as tumor vs. inflammation, are investigated, and the top-scoring differential genes are considered candidates for further analysis. However, measured differences may not be related to a biological process, as they can also be caused by variation in measurement or by other sources of noise. One way to reduce the influence of noise is to combine the available samples. Here, we analyze different types of combination methods, namely early and late aggregation, and compare these statistical and positional rank aggregation methods in a simulation study and in experiments on real microarray data.

A. Burkovski
Research Group Bioinformatics and Systems Biology, Institute of Neural Information Processing, Ulm University, 89069 Ulm, Germany
International Graduate School in Molecular Medicine, Ulm University, Ulm, Germany
e-mail: [email protected]

L. Lausser, J.M. Kraus, H.A. Kestler
Research Group Bioinformatics and Systems Biology, Institute of Neural Information Processing, Ulm University, 89069 Ulm, Germany
e-mail: [email protected]; [email protected]; [email protected]

M. Spiliopoulou et al. (eds.), Data Analysis, Machine Learning and Knowledge Discovery, Studies in Classification, Data Analysis, and Knowledge Organization, DOI 10.1007/978-3-319-01595-8_31, © Springer International Publishing Switzerland 2014

1 Introduction Molecular high-throughput technologies generate large amounts of data which are usually noisy. Often measurements are taken under slightly different conditions and produce values that in extreme cases may be contradictory and contain outliers.


One way of establishing more stable relationships between genes is to transform the data to an ordinal scale by ranking the expression values profile-wise, so that high expression levels are sorted to the top of the ranking. Common patterns can then be revealed by combining these rankings via aggregation methods, which construct consensus rankings that disagree as little as possible, in some sense, with all input rankings. Here, we study the difference between two general combination procedures: (a) early and (b) late aggregation. In early aggregation, gene values are aggregated by methods like mean or median and are ranked based on the aggregated value. In contrast, late aggregation builds a consensus ranking after each profile has been transformed to an ordinal scale individually. To what extent early and late aggregation approaches differ has not been reported so far. In this simulation study we observe that the quality and the results depend strongly on the underlying noise model of the data. If we assume that each sample is affected by slightly different technica
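The distinction between early and late aggregation described above can be illustrated with a minimal sketch. The function names, the toy data, and the use of the mean rank (a Borda-style positional rule) for the late step are assumptions made for illustration only; the text does not state which positional or statistical methods the chapter evaluates.

```python
from statistics import mean, median


def rank_descending(values):
    """Return 1-based ranks; the highest value gets rank 1 (ties broken by index)."""
    order = sorted(range(len(values)), key=lambda i: (-values[i], i))
    ranks = [0] * len(values)
    for position, gene in enumerate(order, start=1):
        ranks[gene] = position
    return ranks


def early_aggregation(samples, combine=mean):
    """Combine the expression values of each gene first, then rank once."""
    per_gene = [combine(gene_values) for gene_values in zip(*samples)]
    return rank_descending(per_gene)


def late_aggregation(samples):
    """Rank every sample on its own, then order genes by their mean rank.

    The mean rank is an assumed stand-in for a positional aggregation rule.
    """
    rankings = [rank_descending(sample) for sample in samples]
    mean_ranks = [mean(gene_ranks) for gene_ranks in zip(*rankings)]
    # A smaller mean rank is better, so negate before ranking in descending order.
    return rank_descending([-r for r in mean_ranks])


# Hypothetical toy data: 3 samples (rows) x 4 genes (columns).
# Gene 0 carries an outlier in the third sample.
samples = [
    [1.0, 5.0, 4.0, 3.0],
    [1.5, 6.0, 5.0, 2.0],
    [30.0, 5.5, 4.5, 2.5],
]

early_mean = early_aggregation(samples, combine=mean)
early_median = early_aggregation(samples, combine=median)
late = late_aggregation(samples)
```

In this toy data the single outlier is enough to place gene 0 at the top of the early ranking based on the mean, whereas the median-based early ranking and the rank-based late ranking keep it in the lower half. This only illustrates why the choice of procedure can matter under different kinds of noise; it says nothing about which procedure performs better in general.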
