Gene Cthe_2623 details


Gene Information

Locus tag: Cthe_2623
Symbol:
ID: 4809045
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 3100750
End bp: 3101898
Gene Length: 1149 bp
Protein Length: 382 aa
Translation table: 11
GC content: 46%
IMG OID: 640108037
Product: exopolysaccharide biosynthesis protein
Protein accession: YP_001039016
Protein GI: 125975106
COG category: [G] Carbohydrate transport and metabolism
COG ID: [COG4632] Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase
TIGRFAM ID:
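
The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below assumes 1-based, inclusive coordinates and a single terminal stop codon that is not counted in the protein length; the page does not state either convention, and the function names are hypothetical.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    if end_bp < start_bp:
        raise ValueError("end_bp must not be smaller than start_bp")
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count of an unspliced CDS, assuming one terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # drop the stop codon


length_bp = gene_length(3100750, 3101898)
print(length_bp)                  # 1149
print(protein_length(length_bp))  # 382
```

Under these assumptions both values agree with the Gene Length and Protein Length fields listed for Cthe_2623.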


Plasmid Coverage information

Num covering plasmid clones: 13
Plasmid unclonability p-value: 0.102775
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
TTGATGAAAA ACAAGGTTAA TATCGATACA ATTATATACT TGGTTTTAAT AATTACGTCC 
CTCATATTTA TTTATGCAGG GAGTTTTTTA TTTGAAGAGG GAAGTTTCAT TGGAAACAGA
GGGAATAACA GGACAGTAGA TGCTGATACT GCCACTGATA GGAACAAACC GTTCCCTGTG
GTGCACAAAG CTGTCAGTAC GGAAATTAAC GGAATAAAGC AGAAAATAAA CATCTTGGAG
ATTGATTTGT CTTCAGGCGG TGTCAAAATA AAGCCGGCAT TGGCTTTTGA CACGATATAT
GGTTTTCAGA GCCTTAAAGA TATTGCGATT AACAACAATG CTTATGCCGC CGTCAATGCA
GGTTTTTTCT ATTCATACGG AGAGCCTTCG GGAATGGTGG TTATTGACGG GAAAGTTTAT
ACGAAGTCCA CGGGAAAATA CCCGGTTTTT GTTGTACAGG GTAAAAATGC TTTCTTAAGC
GAAATTAAGA GCAATATATG GATATTGCAC GGCAACAGAA GGATTGCGGC GGATGATATA
AACAGAGAGG GAAAACCCGG TGAAACGGTG GTGTACACGC CGGTTTTCGG CCCTACAAAC
CGGGCGGACA AGCTCCATAC ATCATATATT GTGGAGAATA ACAGAGTAGC AAGGAAATTT
CGTGGGGATA CGGAGTGCAA AATTCCTTCG GACGGAATGG TAATTACCTT TTATGAGCCC
ATTTCATCGG AAGAAAAATT TGAAGTGGGG GACTGGATCG GAATCGATAT TGATCCGGAT
TTTGGGCCGG GATTTCAGGC GTATGAATGC GGAAGCTGGC TTGTAAGGGA TGGACAGGTG
GTGGCTGTTG ACAGGGACGA CTGGGTCGGA CTTTTGACCA ACCGGGACCC GAGGACGGCC
ATAGGAGTAA AGCATGACGG CAAGGTAGTG CTTGTTACCG TGGACGGGCG GCAGCCGGGA
TACAGTGTGG GACTATCGTC GAGGGAGCTT GCAGGCTATT TGCTGACTTT GGGGATTAAA
GACGCTGCCA TGCTGGACGG CGGAGCCTCA ACCCAGATGA TTGTACAAAA CAAGACGGTA
AACAGGCTTC CTGCCAGGGA AAGAATGCTG GGCGGAGGAA TCGTGGTTGT TGTGGATGAG
GATTTATAA
 
Protein sequence
MMKNKVNIDT IIYLVLIITS LIFIYAGSFL FEEGSFIGNR GNNRTVDADT ATDRNKPFPV 
VHKAVSTEIN GIKQKINILE IDLSSGGVKI KPALAFDTIY GFQSLKDIAI NNNAYAAVNA
GFFYSYGEPS GMVVIDGKVY TKSTGKYPVF VVQGKNAFLS EIKSNIWILH GNRRIAADDI
NREGKPGETV VYTPVFGPTN RADKLHTSYI VENNRVARKF RGDTECKIPS DGMVITFYEP
ISSEEKFEVG DWIGIDIDPD FGPGFQAYEC GSWLVRDGQV VAVDRDDWVG LLTNRDPRTA
IGVKHDGKVV LVTVDGRQPG YSVGLSSREL AGYLLTLGIK DAAMLDGGAS TQMIVQNKTV
NRLPARERML GGGIVVVVDE DL
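
The gene sequence begins with TTG while the protein begins with M. This is consistent with translation table 11, which uses the standard amino acid assignments but accepts alternative start codons that are read as methionine at the initiation position. The following is a minimal sketch of such a translation in plain Python, together with a GC-content helper. The function names are hypothetical, and the code is an illustration rather than the method the database used to derive the protein sequence or the GC content field.

```python
# Codon table in the usual NCBI ordering (first, second, third base over TCAG).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
# Start codons permitted by translation table 11.
START_CODONS = {"TTG", "CTG", "ATT", "ATC", "ATA", "ATG", "GTG"}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used in the listing; uppercase."""
    return "".join(seq.split()).upper()


def translate_cds(seq: str) -> str:
    """Translate a CDS, reading the start codon as M and stopping at '*'."""
    seq = clean(seq)
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] not in START_CODONS:
        raise ValueError("first codon is not a table 11 start codon")
    protein = ["M"]
    for codon in codons[1:]:
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = clean(seq)
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# First three blocks of the gene sequence listed above.
print(translate_cds("TTGATGAAAA ACAAGGTTAA TATCGATACA"))  # MMKNKVNIDT
```

The printed fragment matches the first ten residues of the protein sequence. Passing the full gene sequence to `translate_cds` and `gc_percent` allows the same comparison against the complete protein and the GC content field.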