Title page for ETD etd-10182011-163946


Document Type Doctoral Thesis
Author Hefer, Charles Amadeus
Email charles.hefer@up.ac.za
URN etd-10182011-163946
Document Title Assembly, annotation and polymorphism analysis of a draft transcriptome sequence for a fast-growing Eucalyptus plantation tree
Degree PhD
Department Biochemistry
Supervisor
  Advisor Name       Title
  Prof A A Myburg    Co-Supervisor
  Prof F Joubert     Supervisor
Keywords
  • genomic research projects
  • bioinformatics
  • DNA sequencing technologies
  • Eucalyptus tree species
Date 2011-09-09
Availability unrestricted
Abstract
Ultra-high throughput DNA sequencing technologies have rapidly changed the face of genomic research projects. Technologies such as mRNA-Seq have the potential to rapidly profile the expressed gene catalog of non-model organisms, albeit with significant bioinformatics-related costs and support requirements. This study developed automated data analysis workflows focused on the quality evaluation of mRNA-Seq reads, de novo transcriptome assembly, transcriptome annotation and digital gene expression profiling, making use of data analysis tools available in the public domain and novel tools developed for this purpose. These workflows were made available in a private instance of the Galaxy workflow management system and were used to perform the de novo assembly of a gene catalog of a Eucalyptus plantation tree. The fast growth and good wood properties of Eucalyptus tree species and their hybrids make them excellent renewable resources of fiber for pulp and paper, and of woody biomass for bioenergy production. We produced an expressed gene catalog of 18 894 de novo assembled contigs from Illumina deep mRNA-Seq of six sampled plant tissues. Using a novel coverage-assisted re-assembly approach, we were able to assemble near full-length, biologically relevant transcripts. The assembly was evaluated in terms of contig quality and contiguity, and functional annotations were assigned. Digital expression profiles (FPKM values) of each contig across the tissues were calculated and used to identify tissue-specific sets of expressed genes. Polymorphism analysis of 13 806 high-confidence contigs revealed a combined exon and untranslated region SNP density of 0.534 SNPs/100 bp, which provides a good opportunity for designing high-density SNP assays in the expressed regions of the Eucalyptus genome.
The assembled and annotated gene catalog was made available for public use in a user-friendly, web-based interface as the Eucspresso database (http://eucspresso.bi.up.ac.za). The database acts as a prelude to a more comprehensive mRNA-Seq whole-transcriptome repository, the Eucalyptus Genome Integrative Explorer (EucGenIE), a resource that will focus on identifying transcriptional networks active during woody biomass development. Results from the study proved that current bioinformatics software tools and approaches can be used to successfully assemble and characterize a large proportion of the transcriptome of a complex eukaryotic organism. This approach can be used to characterize the gene catalog of a wide range of non-model organisms using only data derived from ultra-high throughput sequencing (uHTS) experiments.

© 2011 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.

Please cite as follows:

Hefer, CA 2011, Assembly, annotation and polymorphism analysis of a draft transcriptome sequence for a fast-growing Eucalyptus plantation tree, PhD thesis, University of Pretoria, Pretoria, viewed yymmdd <http://upetd.up.ac.za/thesis/available/etd-10182011-163946/>

D11/9/153/ag

Files
  Approximate download times are given as Hours:Minutes:Seconds.

  Filename           Size       28.8 Modem  56K Modem  ISDN (64 Kb)  ISDN (128 Kb)  Higher-speed Access
  00front.pdf        185.79 Kb  00:00:51    00:00:26   00:00:23      00:00:11       < 00:00:01
  01chapter1.pdf     375.42 Kb  00:01:44    00:00:53   00:00:46      00:00:23       00:00:02
  02chapter2.pdf     1.14 Mb    00:05:15    00:02:42   00:02:22      00:01:11       00:00:06
  03chapter3.pdf     1.63 Mb    00:07:33    00:03:53   00:03:24      00:01:42       00:00:08
  04chapters4-5.pdf  3.66 Mb    00:16:57    00:08:43   00:07:37      00:03:48       00:00:19
  05back.pdf         3.45 Mb    00:15:57    00:08:12   00:07:10      00:03:35       00:00:18

