A review of software for microarray genotyping
Philippe Lamy,1,2 Jakob Grove1,3 and Carsten Wiuf1*
1 Bioinformatics Research Centre, C. F. Mollers Allé 8, Building 1110, DK-8000 Aarhus C, Denmark
2 Department of Molecular Medicine, Aarhus University Hospital, Skejby, DK-8200 Aarhus N, Denmark
3 Department of Human Genetics, Aarhus University, The Bartholin Building, Wilhelm Meyers Allé 4, DK-8000, Aarhus C, Denmark
*Correspondence to: Tel: +45 8942 3100; Fax: +45 8942 3077; E-mail: [email protected]
Date received (in revised form): 1st March 2011
Abstract
The focus of this review is software for genotyping single nucleotide polymorphisms on microarrays, in particular software for Affymetrix and Illumina arrays. Different statistical principles and ideas have been applied to the construction of genotyping algorithms: for example, likelihood versus Bayesian modelling, and whether to genotype one array at a time or all arrays together. The release of new arrays is generally followed by new, or updated, algorithms.
Keywords: SNP array, genotype, calling algorithm, copy number, intensity, software
Introduction
The use of microarrays and microarray technology in research is now more than 15 years old and has had a tremendous impact on many aspects of research. Suddenly, it became possible to profile and survey whole genomes and to compare genomes across individuals and species to an extent that was hardly possible before. The perception of the genome changed as genome-wide data became available to everyone.

This review focuses narrowly on software used for genotyping of single nucleotide polymorphisms (SNPs) in connection with SNP microarrays (or ‘arrays’ for short). There are an estimated ten million or more SNPs in the human genome.1 For each of these, there are three possible genotypes (assuming diploidy), AA, BB (homozygous) and AB (heterozygous), where A and B denote the two possible alleles. The first commercial SNP array was released in 1996 by Affymetrix (Santa Clara, CA) and targeted about 1,500 human SNPs,2 a tiny fraction of all SNPs. Since then, many different manufacturers have developed microarrays for genome-wide genotyping, including Affymetrix, Agilent (Santa Clara, CA), Illumina (San Diego, CA) and Nimblegen (Madison, WI), with arrays designed for many different organisms.

SNP arrays have found uses in many research areas and contexts, for example, association mapping,3 linkage disequilibrium mapping,4 phasing,5 inference on demography and ancestry,6 evolution7 and loss-of-heterozygosity analysis in cancer.8 Early usage of SNP arrays sought to estimate loss of heterozygosity in cancer by comparing DNA from germline and tumour cells.9 In addition, SNP arrays have been used to estimate copy numbers in cancers10 (similar to the use of comparative genomic hybridisation [CGH] arrays) and copy number variants (CNVs) in populations.11 The newest arrays from Affymetrix and Illumina both contain probes for CNVs and copy number polymorphisms (CNPs). Today, SNP microarrays are able to genotype more than a million