
TECHNICAL NOTE

Stacksbinder: online tool for visualizing and summarizing Stacks output to aid filtering of SNPs identified using RAD sequencing

Masaki Yasugi1 · Ayumi Tezuka2 · Atsushi J. Nagano2

Received: 14 November 2017 / Accepted: 11 June 2018
© Springer Nature B.V. 2018

Abstract  Stacksbinder (available at https://ps.agr.ryukoku.ac.jp/stacksbinder/) is a web-based tool for summarizing the output of the software Stacks, which is widely used to analyze restriction site-associated DNA sequencing (RAD-Seq) data. Although summarization is an essential step in RAD-Seq analysis, no specific tools exist for summarizing and visualizing these data. Stacksbinder generates a summary report from files exported from Stacks. The report is HTML-based and easy to browse in any environment. It consists of plots and tables containing information that can be used to assess the quality of experiments and to decide how to filter single nucleotide polymorphisms.

Keywords  Stacks · RAD-Seq · SNP detection · Visualization · Next-generation sequencing · Summarization

Masaki Yasugi and Ayumi Tezuka have contributed equally to this work.

Electronic supplementary material  The online version of this article (https://doi.org/10.1007/s12686-018-1050-z) contains supplementary material, which is available to authorized users.

* Atsushi J. Nagano [email protected]

1 Faculty of Engineering, Utsunomiya University, Utsunomiya, Tochigi 321-8585, Japan
2 Faculty of Agriculture, Ryukoku University, Yokatani 1-5, Seta Ohe-cho, Otsu, Shiga 520-2194, Japan

Restriction-site associated DNA sequencing (RAD-Seq) (Baird et al. 2008) and genotyping-by-sequencing (GBS) (Elshire et al. 2011) can detect thousands of genome-wide single nucleotide polymorphisms (SNPs). The software Stacks is one of the most widely used tools for analyzing RAD-Seq and GBS data (Catchen et al. 2013). Stacks users often have difficulty deciding on SNP filtering criteria for subsequent analyses. Because RAD-Seq SNP data can contain a considerable amount of missing data (Rubin et al. 2012; Eaton and Ree 2013; Huang and Knowles 2016), the number of loci shared among samples can vary depending on the filtering parameters applied (Chattopadhyay et al. 2014). For example, it is difficult to distinguish multiple alleles at a single locus from paralogs at multiple loci, and to determine a reliable depth. There are no universally applicable parameter sets because they depend on the read number per sample, genome size, ploidy, restriction enzyme(s), sample number, and the purpose of the analysis. To determine parameters suited to each experiment, users conduct a series of trials with various parameter sets while checking the Stacks results. However, there is no specific tool to summarize and visualize these results. Therefore, if users do not obtain a sufficient number of shared loci, determining whether the result is derived from problems with the experiment, because of the lack of a sufficient number of sequencing reads, and/or from the parameters of the Stacks pipeline, whi