High-Throughput Sequencing and Analysis of Chromosome 1 of Eimeria tenella
Ling, King Hwa (2005) High-Throughput Sequencing and Analysis of Chromosome 1 of Eimeria tenella. Masters thesis, Universiti Putra Malaysia.
Eimeria tenella is one of seven Eimeria species that cause avian coccidiosis. It is highly pathogenic and is one of the three species most commonly found in the field. High-throughput sequencing of the 1.05 Mb chromosome one of Eimeria tenella (Houghton strain) revealed genomic information that may be useful in the discovery of genes such as those involved in drug resistance, cellular regulation and integration, and mechanisms of invasion. High-throughput, random chromosomal shotgun sequencing produced 61 unfinished contigs representing the full-shotgun state of the chromosome one nucleotide sequence, which is ready to enter the finishing phase. Of these contigs, 57 were arranged into 11 scaffolds, whereas 4 contigs remain unordered. Together, the contigs represent 86.9% of chromosome one at 9.29-fold coverage. The quality of the assembly is supported by 94.1% consistent paired reads, with only 5824.3 (0.5%) errors expected. In addition, the contiguity of the assembly improved vastly once BAC-end sequences and HAPPY map markers were integrated. Consensus-level assessment showed that 99.2% of the unfinished chromosomal sequence has an expected error rate of less than 1 per 10,000 bases (PHRAP score > 40) and that only 7.6% needs further polishing. The GC content of chromosome one is 49.35%. Long-range excursions from this mean are prominent in three regions, and chromosome-wide GC fluctuations ranged from 35% to 60% in an analysis with a 12 kb window. GC skews were found to correlate with the repeat-rich regions of the chromosome. The telomeric repeat at both ends of the chromosome was determined to be TTTAGGG / CCCTAAA, although its true length remains undefined. A centromere-like region of approximately 1,453 bp with 81.3% AT composition was found in chromosome one. Chromosome one is estimated to comprise at least 25.3% repetitive elements, the most prominent being the tandem repeat TGC, which is distributed throughout the chromosome. 
The longest minisatellite, mstl, is 3,624 bp in length and occurs as a single stretch in the chromosome. In addition, a few under-characterized interspersed repeats, such as LINE and DNA transposons, were found in the chromosome, and a preliminary homology-based gene survey suggested the possible presence of LTR elements in SCl1. Both the GC skews and the distribution of repetitive elements divide the chromosome into 7 prominent regions. Alignments against non-redundant and EST databases during the gene survey gave a coarse estimate of the coding density of chromosome one of 1 CDS per 1,000 bp, corresponding to 12.6% coding and 87.4% non-coding sequence. Careful inspection of the distribution revealed that the coding sequences are centrally arranged within the chromosome. GC composition is higher in coding sequences (53.9%) than in non-coding sequences (48.6%). The number of genes on chromosome one will remain unknown until further laboratory investigations are carried out. Some of the significant hits may reflect the presence on chromosome one of genes such as the previously characterized LPMC-61 antigen, elongation factor Tu, proteophosphoglycan, proteases, and AAA ATPase family proteins, which are involved in the parasite's mobility, parasite-host interaction and possibly invasion. However, in silico gene prediction using a homology-based technique identified three full-length genes: phosphatidylinositol-4-phosphate 5-kinase (PIP5K), glucose-6-phosphate isomerase (PGI) and malate:quinone oxidoreductase (MQO). These genes served as gene models and provided early information on introns, exons and splice sites. The average exon and intron sizes were predicted to be 118.5 bp and 535.3 bp, respectively. The most commonly used splice pair is AG ... GT. The chromosome one nucleotide sequences have been deposited in the data depository of the Interim Laboratory of the National Institute for Genomics and Molecular Biology, BIOVALLEY-UKM, Bangi, Malaysia.
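Two of the quantities reported in the abstract can be illustrated with a minimal Python sketch. It is not the pipeline used in the thesis. The function names `phred_error_probability` and `gc_windows` are hypothetical, the quality conversion assumes the standard Phred-scale definition, and the skew formula (G - C) / (G + C) is one common definition that may differ from how "GC skews" were measured in the thesis.

```python
def phred_error_probability(quality):
    """Convert a Phred-scaled quality score to an expected per-base error rate.

    Assumes the standard Phred definition Q = -10 * log10(P), so a score
    of 40 corresponds to 1 expected error per 10,000 bases.
    """
    return 10 ** (-quality / 10)


def gc_windows(sequence, window=12_000, step=12_000):
    """Yield (start, gc_fraction, gc_skew) for each full window.

    The 12 kb default mirrors the window length quoted in the abstract;
    the step size is an assumption (non-overlapping windows).
    gc_skew is computed as (G - C) / (G + C), an assumed definition.
    """
    sequence = sequence.upper()
    for start in range(0, len(sequence) - window + 1, step):
        chunk = sequence[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Count only unambiguous bases so that N or gap characters
        # in unfinished contigs do not dilute the GC fraction.
        acgt = g + c + chunk.count("A") + chunk.count("T")
        gc_fraction = (g + c) / acgt if acgt else 0.0
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        yield start, gc_fraction, gc_skew
```

Under these assumptions, windows whose `gc_fraction` departs strongly from the chromosome-wide mean would correspond to the kind of long-range excursions described above. A trailing partial window is skipped so that every reported value is based on the same number of bases.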