Virtual Sequencing of pUC18c



20.1 The Project

The objective of this exercise is to sequence the pUC18c cloning vector. It is divided into the following major parts: (a) downloading the cloning vector, (b) running a virtual sequencing program, and (c) assembling the virtual sequences obtained from sequencing.

20.2 Bioinformatic Tools

In this project you will apply several freely available software tools. All programs can be downloaded from the web page accompanying this book: http://www.staff.hs-mittweida.de/~wuenschi/doku.php?id=rwbook2.

20.2.1 Sequencer

sequencer is a virtual sequencing machine developed by Bernhard Haubold. You supply the software with a genome sequence and specify the coverage and the average length of the sequence reads you wish to obtain. It thus simulates both the performance of a DNA sequencer (average read length) and the effort you put into lab work (coverage).
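To make the two parameters concrete, here is a minimal Python sketch of shotgun-read simulation. It is not the sequencer program itself: the function name `simulate_reads`, the normally distributed read lengths, the `sd` and `seed` parameters, and the treatment of the genome as a linear string are all assumptions made for illustration. Reads are drawn at random positions until the total number of sequenced bases reaches coverage times genome length.

```python
import random


def simulate_reads(genome, coverage, mean_length, sd=0.0, seed=None):
    """Draw random reads from `genome` until the requested coverage is reached.

    Assumptions (not taken from the sequencer program): read lengths follow
    a normal distribution and the genome is treated as linear.
    """
    rng = random.Random(seed)
    target_bases = coverage * len(genome)  # total bases to "sequence"
    reads = []
    sequenced = 0
    while sequenced < target_bases:
        # Clamp the sampled length to the range 1..len(genome).
        length = round(rng.gauss(mean_length, sd))
        length = max(1, min(len(genome), length))
        start = rng.randint(0, len(genome) - length)
        reads.append(genome[start:start + length])
        sequenced += length
    return reads


if __name__ == "__main__":
    toy_genome = "ACGT" * 250  # hypothetical 1000-bp stand-in for a real sequence
    reads = simulate_reads(toy_genome, coverage=5, mean_length=100, sd=10, seed=1)
    print(len(reads), "reads,", sum(len(r) for r in reads), "bases in total")
```

Higher coverage means more reads, and therefore more simulated lab work, while the mean length reflects the capability of the simulated instrument.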

R. Wünschiers, Computational Biology, DOI: 10.1007/978-3-642-34749-8_20, © Springer-Verlag Berlin Heidelberg 2013

20.2.2 Assembler

The TIGR Assembler is a classic open-access assembly tool developed by The Institute for Genomic Research (TIGR). Its objective is to build all possible consensus sequences (contigs) from smaller sequence fragments. These fragments typically come from genome sequencing.
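The idea of building contigs from overlapping fragments can be illustrated with a toy greedy merger in Python. This is only a sketch of the general principle and not the algorithm used by the TIGR Assembler: it requires exact suffix-prefix overlaps, ignores sequencing errors and reverse complements, and the names `overlap`, `greedy_assemble`, and `min_len` are hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 if the longest such overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def drop_redundant(fragments):
    """Remove exact duplicates and fragments contained in another fragment."""
    unique = list(dict.fromkeys(fragments))
    return [f for f in unique if not any(f != g and f in g for g in unique)]


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair with the longest exact overlap."""
    contigs = drop_redundant(fragments)
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # no overlaps left: these are the final contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        rest = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs = drop_redundant(rest + [merged])


if __name__ == "__main__":
    print(greedy_assemble(["ATGCGT", "CGTACC", "ACCGGA"]))
```

Fragments that share no sufficiently long overlap with any other fragment remain separate contigs, which is what happens in practice when coverage is too low to close all gaps.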

20.2.3 Dotter

Dotter is a graphical dot plot program for detailed comparison of two sequences (Sonnhammer and Durbin 1995). Every nucleotide or amino acid of one sequence is compared to every nucleotide or amino acid of the other sequence. The first sequence runs along the x-axis and the second sequence along the y-axis. In regions where the two sequences are similar to each other, a row of high scores runs diagonally across the dot matrix. If you compare a sequence against itself to find internal repeats, you will notice that the main diagonal scores maximally, since it is the perfect (100 %) self-match. To make the scoring matrix more intelligible, the pairwise scores are averaged over a sliding window which runs diagonally. The averaged score matrix forms a three-dimensional landscape, with the two sequences in two dimensions and the height of the peaks in the third. This landscape is projected onto two dimensions by means of grayscales: the darker a peak is, the higher it is. Dotter provides a tool to explore the visual appearance of this landscape, as well as a tool to examine the sequence alignment it represents.
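The windowed scoring described above can be sketched in a few lines of Python. This is a simplified illustration and not Dotter's implementation: it scores plain identity (1 for a match, 0 for a mismatch) instead of using a substitution matrix, anchors each window at its top-left cell rather than its centre, and prints text characters in place of grayscales. The names `windowed_dot_matrix` and `render` are hypothetical.

```python
def windowed_dot_matrix(seq_x, seq_y, window=3):
    """Return matrix[y][x]: fraction of matches in a diagonal window.

    The window starts at position x in `seq_x` and y in `seq_y` and runs
    `window` residues down the diagonal.
    """
    cols = len(seq_x) - window + 1
    rows = len(seq_y) - window + 1
    matrix = []
    for y in range(rows):
        row = []
        for x in range(cols):
            matches = sum(seq_x[x + k] == seq_y[y + k] for k in range(window))
            row.append(matches / window)
        matrix.append(row)
    return matrix


def render(matrix, shades=" .:*#"):
    """Map scores in [0, 1] to characters; later characters mean higher scores."""
    top = len(shades) - 1
    return "\n".join(
        "".join(shades[round(score * top)] for score in row) for row in matrix
    )


if __name__ == "__main__":
    seq = "ACGTACGT"  # toy sequence with an internal repeat
    print(render(windowed_dot_matrix(seq, seq, window=3)))
```

In a self-comparison the main diagonal always reaches the maximum score, and internal repeats show up as additional diagonals parallel to it.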

20.2.4 GenBank

As an example for sequencing we use the cloning vector pUC18c. Like all openly accessible sequences, the sequence of this vector is available from GenBank, hosted by the National Center for Biotechnology Information (NCBI) at http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi.
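As an alternative to searching the web page by hand, a GenBank record can also be fetched programmatically through the NCBI E-utilities `efetch` service. The following Python sketch is not part of the chapter's workflow: the helper names `efetch_url` and `download_sequence` are hypothetical, and the accession number is deliberately left as a parameter that you must look up in GenBank yourself.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="fasta"):
    """Build an efetch URL for a nucleotide record (hypothetical helper)."""
    params = {
        "db": "nuccore",       # nucleotide database
        "id": accession,       # accession number supplied by the caller
        "rettype": rettype,    # "fasta" for sequence only, "gb" for the full record
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


def download_sequence(accession, filename):
    """Save the record to `filename`; requires network access."""
    with urlopen(efetch_url(accession)) as response, open(filename, "wb") as out:
        out.write(response.read())


if __name__ == "__main__":
    # Replace the placeholder with the accession number found in GenBank.
    print(efetch_url("YOUR_ACCESSION"))
```

Only `efetch_url` runs without a network connection; `download_sequence` contacts the NCBI servers and should be used sparingly in line with their usage guidelines.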

20.3 Detailed Instructions

The first step is to download all sequences and software used in this practical. Start by opening a terminal window and creating a directory called Sequencing in your home directory. Then change into the newly created directory.


Terminal 265: Make Directory with