Gene Cthe_1795 details

Gene Information

Locus tag: Cthe_1795
Symbol:
ID: 4810040
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2119801
End bp: 2120826
Gene Length: 1026 bp
Protein Length: 341 aa
Translation table: 11
GC content: 46%
IMG OID: 640107209
Product: 3-deoxy-D-arabinoheptulosonate-7-phosphate synthase
Protein accession: YP_001038209
Protein GI: 125974299
COG category: [E] Amino acid transport and metabolism
COG ID: [COG2876] 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase
TIGRFAM ID: [TIGR01361] phospho-2-dehydro-3-deoxyheptonate aldolase
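
The coordinates and lengths listed above are mutually consistent, which can be checked with a little arithmetic. The following is a minimal sketch in Python, assuming the start and end positions are 1-based and inclusive and that the gene length includes the stop codon. The function names are illustrative and not part of any database API.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive start and end coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_length: int) -> int:
    """Residues encoded by an unspliced CDS whose final codon is a stop codon."""
    if gene_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per residue, minus the terminating stop codon.
    return gene_length // 3 - 1


length = gene_length_bp(2119801, 2120826)
print(length, protein_length_aa(length))  # 1026 341
```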


Plasmid Coverage information

Num covering plasmid clones: 21
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGGTTATTG TAATGAAACC CAACTCAACG GAAAACGACA TCAACGAGGT AGCAAAGGTA 
TTGACTTCTT TGGGACTTGG GGTGCATATT TCAAAGGGTT CCGAAAGGAC TATTATCGGT
GTAATCGGCG ACAAAAGGAA GCTTTCCGAC GTACCTTTAG AGCTTATGAA CGGCGTCGAA
AAGCTGATTC CCATTGTGGA GTCATACAAG CTCGCAAGCA AAACTTTTAA GCCGGAACCC
AGTATAATCG ACGTCGGCGG TGTAAAAATC GGAGGCAAGG AAATTGTTGT CATGGCAGGT
CCCTGTGCCG TTGAAAGCAG GGAGCAGATT ATGGCTGCCG CCCAGGCAGT AAAAAAAGCC
GGTGCGCAGT TTTTAAGGGG CGGAGCTTTC AAGCCGAGGA CTTCCCCTTA TTCATTCCAG
GGGCTTGAAG AAGAAGGATT AAAACTTTTA AAAGAAGCAA AAGAAGCAAC CGGACTTCTG
ATTATAACCG AGGTTACCAG CGAAAGAGCC ATAGAAATAG CCGACAGTTA TGTTGACATG
TTCCAGGTGG GAGCGAGAAA TGTTCAGAAT TTCCAGCTTC TGCGCGAGAT TGGTCGCTCC
AAGAAACCTG TTTTGCTGAA AAGAGGTCCT TCAACCACTA TAGACGAATG GTTGAATGCG
GCTGAATATA TAATGAGTGA AGGCAATTAC AACGTTGTTC TTTGTGAAAG AGGCATAAGA
ACCTTTGAAA CGGCTACCAG AAACACACTG GATATCAGTG CGGTGCCTGT TGTAAAAAGC
TTGAGTCATC TTCCGATAAT TGTCGACCCA AGCCATGCGG CAGGAAAAGC CCAGTATATT
CTTCCTCTTT CAAAGGCGGC AATTGCCGCG GGCGCAGACG GACTTATCGT AGAAGTCCAT
CCGAATCCAA AATGTGCATT GTCCGACGCT GCCCAACAAC TTCCGCCGGA AGATTTCTGT
GAACTGTGTA AAGATATAAG TAAAATTGCC GAAATATTAG GAAGAGAGTT TCACTATGCA
GGTTGA
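
The GC content reported in the gene information is the share of G and C bases in the nucleotide sequence. A minimal sketch of that calculation in Python follows, assuming the sequence is supplied as plain text with whitespace between blocks, as it is printed here. `gc_percent` is a hypothetical helper name, and how the database rounds its reported value is not known.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a sequence, ignoring whitespace."""
    bases = "".join(sequence.split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


# First two 10-base blocks of the gene sequence: 6 of 20 bases are G or C.
print(gc_percent("ATGGTTATTG TAATGAAACC"))  # 30.0
```

Passing the full gene sequence to the same function gives the value to compare against the reported GC content.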
 
Protein sequence
MVIVMKPNST ENDINEVAKV LTSLGLGVHI SKGSERTIIG VIGDKRKLSD VPLELMNGVE 
KLIPIVESYK LASKTFKPEP SIIDVGGVKI GGKEIVVMAG PCAVESREQI MAAAQAVKKA
GAQFLRGGAF KPRTSPYSFQ GLEEEGLKLL KEAKEATGLL IITEVTSERA IEIADSYVDM
FQVGARNVQN FQLLREIGRS KKPVLLKRGP STTIDEWLNA AEYIMSEGNY NVVLCERGIR
TFETATRNTL DISAVPVVKS LSHLPIIVDP SHAAGKAQYI LPLSKAAIAA GADGLIVEVH
PNPKCALSDA AQQLPPEDFC ELCKDISKIA EILGREFHYA G
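
The protein sequence can be cross-checked against the gene sequence by translating it codon by codon. The sketch below is written under stated assumptions: the input starts in frame at the first base, and translation table 11 assigns the same amino acid to each codon as the standard code (the two tables differ in which start codons are permitted, which does not matter here because the gene begins with ATG). `translate` and `CODON_TABLE` are illustrative names.

```python
from itertools import product

BASES = "TCAG"
# Amino acids in TCAG codon order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence: str) -> str:
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    bases = "".join(sequence.split()).upper()
    residues = []
    for i in range(0, len(bases) - 2, 3):
        aa = CODON_TABLE[bases[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First 30 bases of the gene sequence give the first 10 residues of the protein.
print(translate("ATGGTTATTG TAATGAAACC CAACTCAACG"))  # MVIVMKPNST
```

Translating the complete gene sequence this way and comparing the result with the protein sequence (after removing the spaces between blocks) is a quick consistency check on the record.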