Welcome to the Euglena genome project.
Euglena is a genus of protists capable of both heterotrophy and photosynthesis (autotrophy). Euglena are also capable of phagocytosis, possess two flagella (only one of which is involved in locomotion), have a flagellar pocket-like organelle (the reservoir), are phototactic, and exhibit a distinctive ‘euglenoid’ movement when encountering a solid substrate. They are distantly related to the trypanosomatids within the Excavata supergroup. The plastid genome has been sequenced, and there are ~20k EST sequences in the database, but there has been no genome sequencing effort to date. Given the potential importance of euglenids, in terms of both taxonomic position and unique biology, for understanding many aspects of protist and evolutionary cell biology, we initiated a sequencing project, primarily for gene discovery and comparative genomics. We are using a combination of Illumina and 454 sequencing, together with mapping of multiple transcriptome datasets onto the assembly to train gene prediction. We anticipate a limited release of data for annotation purposes in spring 2015.
Who’s involved: Mark C. Field, Steve Kelly, ThankGod Ebenezer, Mark Carrington, Michael Lebert, Michael Ginger, Julius Lukes, Andrew Jackson, Joel Dacks, Bill Wickstead, and Harry De-Koning.
Strain being sequenced: Euglena gracilis Z, kindly given by William Martin (Düsseldorf). DNA was isolated using the method of Medina-Acosta and Cross (1993). Access to the data is restricted; they are available only by invitation or on specific request. If you use the data, we ask that you please acknowledge the source as follows: “E. gracilis genome data obtained from the sequence project at http://euglenadb.org/”.
Draft genome assembly statistics
Parameter*                     Euglena gracilis genome (draft)
# contigs                      257242
# contigs (>= 1000 bp)         97509
Total length (bp)              639399673
Total length (>= 1000 bp)      564374092
Largest contig (bp)            246170
GC (%)                         50.15
*All statistics are based on contigs of size >= 1 bp, unless otherwise noted (e.g., “# contigs (>= 0 bp)” and “Total length (>= 0 bp)” include all contigs).
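As an illustration of how the parameters in the table above are defined, the following is a minimal Python sketch that computes the same kind of summary from a FASTA file of contigs. It is not the pipeline used to produce the figures reported here; the function names (`parse_fasta`, `assembly_stats`), the `min_len` parameter, and the toy sequences are all assumptions made for the example.

```python
from io import StringIO


def parse_fasta(handle):
    """Yield one upper-cased sequence string per FASTA record (headers are discarded)."""
    chunks = []
    started = False
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if started:
                yield "".join(chunks)
            chunks = []
            started = True
        elif started:
            chunks.append(line.upper())
    if started:
        yield "".join(chunks)


def assembly_stats(handle, min_len=1000):
    """Summarise contigs: counts, total lengths, largest contig and GC content."""
    n_all = n_long = total_all = total_long = largest = gc = 0
    for seq in parse_fasta(handle):
        length = len(seq)
        if length == 0:
            continue  # only contigs of >= 1 bp are counted (assumption mirroring the footnote)
        n_all += 1
        total_all += length
        largest = max(largest, length)
        gc += seq.count("G") + seq.count("C")
        if length >= min_len:
            n_long += 1
            total_long += length
    return {
        "# contigs": n_all,
        f"# contigs (>= {min_len} bp)": n_long,
        "Total length": total_all,
        f"Total length (>= {min_len} bp)": total_long,
        "Largest contig": largest,
        # GC is taken here as G+C over all counted bases, N included (an assumption).
        "GC (%)": round(100 * gc / total_all, 2) if total_all else 0.0,
    }


# Toy input (hypothetical sequences, not project data), with a 4 bp threshold.
toy = StringIO(">c1\nACGT\nACGT\n>c2\nGGCC\n>c3\nAT\n")
for name, value in assembly_stats(toy, min_len=4).items():
    print(name, value)
```

For a real assembly, the same function could be called on an open file handle, for example `assembly_stats(open("contigs.fasta"))`, where the file name is a placeholder.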
Draft transcriptome assembly statistics
Parameter                      Euglena gracilis transcriptome (draft)
n seqs                         176638
n bases                        84279051
mean len                       441.48
n over 1k                      17780
mean orf percent               75.76