Detection of SNPs based on transcriptome sequencing in Norway spruce (Picea abies (L.) Karst)
TECHNICAL NOTE
Katrin Heer1,2 • Kristian K. Ullrich3 • Sascha Liepelt1 • Stefan A. Rensing3 • Jiabin Zhou2,4 • Birgit Ziegenhagen1 • Lars Opgenoorth2
Received: 15 December 2015 / Accepted: 22 January 2016
© Springer Science+Business Media Dordrecht 2016
Abstract
A novel set of SNPs was derived from transcriptome data of ten Norway spruce (Picea abies) trees from the Bavarian Forest National Park in Germany (BaFoNP). SNPs were identified by mapping against a de-novo transcriptome assembly and against pre-mRNAs of predicted genes of the reference genome assembly. This resulted in 111,849 and 366,577 SNPs, respectively. Out of these, 311 were either randomly selected or chosen because of their pronounced divergence between sampling sites and genotyped in 218 trees with an Illumina Infinium HD iSelect BeadChip.
Keywords RNA-seq · Single nucleotide polymorphism · Norway spruce · Genotyping
Introduction
While costs for whole genome sequencing are dropping rapidly, the genome size of conifers of >20 Gbp (Nystedt et al. 2013) still renders whole genome analysis unfeasible. Thus, for genotyping large populations, SNPs are, besides microsatellites (Opgenoorth 2009), currently the genetic markers of choice in conifers. A number of SNPs were previously developed for or adopted in P. abies (Table 1, compare Lind et al. 2014). We aimed to add SNPs detected in Central Europe to increase the discriminating power relevant for questions regarding the management of local populations and the conservation of forest genetic resources. For this, we analyzed RNA-seq data of ten trees from BaFoNP sampled at low and high elevations (750 vs. 1350 m a.s.l.).
Katrin Heer and Kristian K. Ullrich contributed equally to this work.
Methods and results
Electronic supplementary material The online version of this article (doi:10.1007/s12686-016-0520-4) contains supplementary material, which is available to authorized users.
RNA was extracted from needles and sequenced on an Illumina HiSeq 2500. We obtained 28.8 ± 6.8 million single-end 100 bp reads per library. We trimmed bases dynamically with Trimmomatic v.0.32 (Bolger et al. 2014) and removed reads that mapped to mitochondrial or chloroplast genomes, or to ribosomal DNA (custom-made database, http://dx.doi.org/10.5061/dryad.f3r35), using bowtie2 v.2.2.3.0 (Langmead et al. 2009), resulting in 24.4 ± 3.0 million reads per library. Subsequently, we followed two SNP detection approaches. First, we searched for SNPs with differing allele frequency between sampling sites. We mapped the trimmed reads against a de-novo transcriptome assembly, based on the 10 pooled libraries, that was created with default settings in Trinity (Grabherr et al. 2011). We only retained the longest components from the assembly (167,736 contigs;
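The first approach looks for SNPs whose allele frequency differs between the low- and high-elevation sampling sites. A minimal sketch of that idea follows, assuming that per-site counts of reference and alternate alleles are already available for each SNP. The class and function names, the example counts and the use of a plain absolute frequency difference are illustrative assumptions, not the procedure actually used in this study.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Allele counts for one SNP at one sampling site (hypothetical structure)."""
    ref: int  # observations supporting the reference allele
    alt: int  # observations supporting the alternate allele

    def alt_frequency(self) -> float:
        total = self.ref + self.alt
        if total == 0:
            raise ValueError("no coverage at this site")
        return self.alt / total


def frequency_difference(low: SiteCounts, high: SiteCounts) -> float:
    """Absolute difference in alternate-allele frequency between two sites."""
    return abs(low.alt_frequency() - high.alt_frequency())


def rank_by_divergence(snps):
    """Return (snp_name, difference) pairs, most divergent SNP first.

    `snps` maps a SNP name to a (low_site, high_site) pair of SiteCounts.
    """
    scored = [
        (name, frequency_difference(low, high))
        for name, (low, high) in snps.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up counts for two hypothetical SNPs (low elevation, high elevation).
example = {
    "snp_A": (SiteCounts(ref=8, alt=2), SiteCounts(ref=2, alt=8)),
    "snp_B": (SiteCounts(ref=5, alt=5), SiteCounts(ref=6, alt=4)),
}
ranked = rank_by_divergence(example)
for name, diff in ranked:
    print(f"{name}\t{diff:.2f}")
```

With only five trees per site, such differences are noisy, so a real analysis would normally combine a frequency statistic with coverage and quality filters.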
✉ Kristian K. Ullrich [email protected]
1 Conservation Biology, Philipps-Universität Marburg, Karl-von-Frisch-Strasse 8, 35043 Marburg, Germany
2 Department of Ecology, Animal Ecol