Gene Information |
Locus tag | Cthe_2623 |
Symbol | |
ID | 4809045 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Clostridium thermocellum ATCC 27405 |
Kingdom | Bacteria |
Replicon accession | NC_009012 |
Strand | + |
Start bp | 3100750 |
End bp | 3101898 |
Gene Length | 1149 bp |
Protein Length | 382 aa |
Translation table | 11 |
GC content | 46% |
IMG OID | 640108037 |
Product | exopolysaccharide biosynthesis protein |
Protein accession | YP_001039016 |
Protein GI | 125975106 |
COG category | [G] Carbohydrate transport and metabolism |
COG ID | [COG4632] Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase |
TIGRFAM ID | |
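
The coordinate and length fields above are mutually consistent, which a short check can confirm. The sketch below assumes the Start bp and End bp values are 1-based and inclusive, and that the CDS ends with a stop codon that is not counted in the protein length; the function names are hypothetical and not part of the database.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Feature length, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_length: int) -> int:
    """Residues encoded by an unspliced CDS whose final codon is a stop codon."""
    if gene_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per residue, minus the terminal stop codon.
    return gene_length // 3 - 1


length = gene_length_bp(3100750, 3101898)
print(length, protein_length_aa(length))  # prints "1149 382"
```

Both values agree with the Gene Length and Protein Length fields of this record.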
| |
Plasmid Coverage information |
Num covering plasmid clones | 13 |
Plasmid unclonability p-value | 0.102775 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | TTGATGAAAA ACAAGGTTAA TATCGATACA ATTATATACT TGGTTTTAAT AATTACGTCC CTCATATTTA TTTATGCAGG GAGTTTTTTA TTTGAAGAGG GAAGTTTCAT TGGAAACAGA GGGAATAACA GGACAGTAGA TGCTGATACT GCCACTGATA GGAACAAACC GTTCCCTGTG GTGCACAAAG CTGTCAGTAC GGAAATTAAC GGAATAAAGC AGAAAATAAA CATCTTGGAG ATTGATTTGT CTTCAGGCGG TGTCAAAATA AAGCCGGCAT TGGCTTTTGA CACGATATAT GGTTTTCAGA GCCTTAAAGA TATTGCGATT AACAACAATG CTTATGCCGC CGTCAATGCA GGTTTTTTCT ATTCATACGG AGAGCCTTCG GGAATGGTGG TTATTGACGG GAAAGTTTAT ACGAAGTCCA CGGGAAAATA CCCGGTTTTT GTTGTACAGG GTAAAAATGC TTTCTTAAGC GAAATTAAGA GCAATATATG GATATTGCAC GGCAACAGAA GGATTGCGGC GGATGATATA AACAGAGAGG GAAAACCCGG TGAAACGGTG GTGTACACGC CGGTTTTCGG CCCTACAAAC CGGGCGGACA AGCTCCATAC ATCATATATT GTGGAGAATA ACAGAGTAGC AAGGAAATTT CGTGGGGATA CGGAGTGCAA AATTCCTTCG GACGGAATGG TAATTACCTT TTATGAGCCC ATTTCATCGG AAGAAAAATT TGAAGTGGGG GACTGGATCG GAATCGATAT TGATCCGGAT TTTGGGCCGG GATTTCAGGC GTATGAATGC GGAAGCTGGC TTGTAAGGGA TGGACAGGTG GTGGCTGTTG ACAGGGACGA CTGGGTCGGA CTTTTGACCA ACCGGGACCC GAGGACGGCC ATAGGAGTAA AGCATGACGG CAAGGTAGTG CTTGTTACCG TGGACGGGCG GCAGCCGGGA TACAGTGTGG GACTATCGTC GAGGGAGCTT GCAGGCTATT TGCTGACTTT GGGGATTAAA GACGCTGCCA TGCTGGACGG CGGAGCCTCA ACCCAGATGA TTGTACAAAA CAAGACGGTA AACAGGCTTC CTGCCAGGGA AAGAATGCTG GGCGGAGGAA TCGTGGTTGT TGTGGATGAG GATTTATAA
Protein sequence | MMKNKVNIDT IIYLVLIITS LIFIYAGSFL FEEGSFIGNR GNNRTVDADT ATDRNKPFPV VHKAVSTEIN GIKQKINILE IDLSSGGVKI KPALAFDTIY GFQSLKDIAI NNNAYAAVNA GFFYSYGEPS GMVVIDGKVY TKSTGKYPVF VVQGKNAFLS EIKSNIWILH GNRRIAADDI NREGKPGETV VYTPVFGPTN RADKLHTSYI VENNRVARKF RGDTECKIPS DGMVITFYEP ISSEEKFEVG DWIGIDIDPD FGPGFQAYEC GSWLVRDGQV VAVDRDDWVG LLTNRDPRTA IGVKHDGKVV LVTVDGRQPG YSVGLSSREL AGYLLTLGIK DAAMLDGGAS TQMIVQNKTV NRLPARERML GGGIVVVVDE DL
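
The gene sequence begins with TTG while the protein begins with M, which follows from translation table 11 treating TTG as a valid start codon. The sketch below is a minimal illustration of that rule, not the database's own pipeline: it builds the codon table from the standard NCBI amino-acid string (table 11 shares its codon assignments with the standard code), uses the start-codon set I understand table 11 to define, and translates only the first and last few codons of the sequence shown above. The name `translate_cds` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for all 64 codons in NCBI order (third base varies fastest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# Assumed start codons for translation table 11 (bacterial, archaeal, plant plastid).
START_CODONS = {"TTG", "CTG", "ATT", "ATC", "ATA", "ATG", "GTG"}


def translate_cds(sequence: str) -> str:
    """Translate a CDS; whitespace is ignored and translation stops at the first stop codon."""
    seq = "".join(sequence.split()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if i == 0 and codon in START_CODONS:
            # An alternative start codon is still read as methionine.
            protein.append("M")
            continue
        amino_acid = CODON_TABLE[codon]  # raises KeyError on ambiguous bases such as N
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 30 and last 15 nucleotides of the gene sequence in this record.
print(translate_cds("TTGATGAAAA ACAAGGTTAA TATCGATACA"))  # prints "MMKNKVNIDT"
print(translate_cds("GATGAG GATTTATAA"))  # prints "DEDL"
```

The two outputs match the first ten and last four residues of the protein sequence listed in this record.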