Gene Cthe_2047 details

Gene Information

Locus tag: Cthe_2047
Symbol:
ID: 4811016
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2436871
End bp: 2438571
Gene Length: 1701 bp
Protein Length: 566 aa
Translation table: 11
GC content: 41%
IMG OID: 640107452
Product: hypothetical protein
Protein accession: YP_001038447
Protein GI: 125974537
COG category:
COG ID:
TIGRFAM ID: [TIGR01445] intein N-terminal splicing region
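
The coordinates and lengths in this record are mutually consistent, which a little arithmetic confirms. The sketch below assumes that the start and end positions are inclusive and that the final codon is a stop codon not counted in the protein length. The variable names are illustrative only.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 2436871
end_bp = 2438571

# Assumption: coordinates are inclusive on both ends.
gene_length = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and is not translated.
codons, remainder = divmod(gene_length, 3)
protein_length = codons - 1

print(gene_length, remainder, protein_length)  # prints: 1701 0 566
```

The zero remainder shows the gene length is a whole number of codons, and 567 codons minus the stop codon gives the listed 566 aa.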


Plasmid Coverage information

Num covering plasmid clones: 14
Plasmid unclonability p-value: 0.632657
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGGCAACTA TAAAATTATA TGCCGGTAAA ATTAACCAAA CACCTGAACT GATAAGGGAT 
GTAAAAAAGT CTGTAATTGA TTTTAAATCA GAGTTATCAG CATTAAAGAA GAAAACTCTA
AATATCAATA GAAGTGTATG CAATCTGGAC GATGTAACAA GCTCCATACA GGCGTCTTCC
CAGACCCAGG ACAGGAAAGT CACTTCTCTT GAAACAGTTT GTAAAGAAAC CGAAGAATTC
ATCTCGGAAG TAGTCAGTAT CGACGGCGAA GTGGCTCAGC TTATTAATGA ACGAAAAGAA
AATTTTTATA AAGAGTACTA TTACTTGAAA CCGGAAAATG AAAAAAGCGG CTGGGAAAAA
ATCAAGGACG GCTTAAAGTC GGTTGCGGAG TGGTGTAAAG AGAATTGGAA ATCAATTGTC
AAGATAGTGG CTGCCGCGGT AATTATTACG GGGTTAGGGA TAGGGGCAGC ATTGACAGGC
GGAGTATTGG GAGTTGTACT GGCAGGAGCA TTCTGGGGAG CATTGGCCGG TGGATTGATA
GGAGGAGCGG TCGGAGGAAT AGCTGCGGCG ATAAATGGTG GTTCATTTTT AGAAGGATTT
GCTGACGGGG CATTAAGCGG AGCAATTTCC GGAGCTGTGA CAGGATCGGC ATGTGCCGGG
CTTGGTGCTT TGGGAGCAGC GGTAGGAAAA GGCATCCAAT GCTTGAGTAC AGTGGGAAAG
GCGATAAATG TTACATCAAA AGTGACTGCA GCACTTTCCT TAGGTATGGA TGGATTTGAC
ATATTGGCGA TGGGGGTATC GTTATTTGAT CCGTCCAACG TGTTGGTTGA ATTTAACCAG
AAGCTACATT CCAATACACT TTACAACGGA TTTCAGATTA TAACTAATGC GCTGGCTGTT
TTCACTGCCG GGGCGGCATC AACAATGAAG TGCTTTGTTG CAGGCACGAT GATATTGACT
GCGACAGGTT TGGTTGCGAT AGAGAATATC AAGGCAGGAG ACAAGGTAAT TGCAACGAAT
CCAGAGACTT TTGAAGTAGC CGAGAAGACG GTGCTTGAGA CATATGTGAG AGAGACAACG
GAGCTTTTGC ATTTGACAAT TGGTGGAGAG GTAATCAAGA CAACCTTTGA TCATCCGTTT
TATGTAAAAG ATGTAGGCTT TGTCGAAGCA GGAAAACTGC AGGTAGGAGA TAAACTGCTT
GATTCAAGAG GCAATGTTTT AGTGGTGGAA GAGAAAAAGC TAGAGATTGC AGATAAACCT
GTTAAAGTTT ATAATTTTAA AGTAGATGAC TTCCATACTT ATCATGTTGG CGATAATGAA
GTATTGGTGC ATAATGCAAA TTATGTTGAA GGAGACTTAG ACGGTATTAC TATTATTAAT
AAGAAGTATG CAGGGCAAAC ATATAAGTTA AGTGGTGATT TAGCATTAAA GTATCCAGAT
GGTGTTAAAT TTACGAATGA AGGTTTTCCA GATTTTAGTC CCTATAGTAA GAAGACAGTC
AAAGTTGAAG GATTACAAGG TGACACATAC TATGATTTTA TTAAAGCTAA TCAAGCAGCA
GGATATAAAT CAACACCAAA AGGGTATACT TGGCATCATG TCGAAGATGG AATTACTATG
ATGCTTGTAC CATCTGATTT ACATGGAGCA GTGAAACATA CGGGTGGCGC TGCATTAATA
AGGAAGGGAA TAAGGCCATA A
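
The gene sequence is printed in groups of ten bases, so the spacing has to be removed before the sequence is used. The sketch below is one possible way to do that and to compute a GC percentage. The helper names `clean_sequence` and `gc_content` are hypothetical, and only the first line of the sequence is used as sample input.

```python
def clean_sequence(block: str) -> str:
    """Remove spaces and line breaks from a 10-base-grouped sequence block."""
    return "".join(block.split()).upper()


def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in seq."""
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# First line of the gene sequence, copied as displayed.
first_line = "ATGGCAACTA TAAAATTATA TGCCGGTAAA ATTAACCAAA CACCTGAACT GATAAGGGAT"
seq = clean_sequence(first_line)
print(len(seq), round(gc_content(seq), 1))
```

Passing the complete sequence block to `clean_sequence` should give a string whose length matches the gene length listed in the record, assuming the block was copied intact.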
 
Protein sequence
MATIKLYAGK INQTPELIRD VKKSVIDFKS ELSALKKKTL NINRSVCNLD DVTSSIQASS 
QTQDRKVTSL ETVCKETEEF ISEVVSIDGE VAQLINERKE NFYKEYYYLK PENEKSGWEK
IKDGLKSVAE WCKENWKSIV KIVAAAVIIT GLGIGAALTG GVLGVVLAGA FWGALAGGLI
GGAVGGIAAA INGGSFLEGF ADGALSGAIS GAVTGSACAG LGALGAAVGK GIQCLSTVGK
AINVTSKVTA ALSLGMDGFD ILAMGVSLFD PSNVLVEFNQ KLHSNTLYNG FQIITNALAV
FTAGAASTMK CFVAGTMILT ATGLVAIENI KAGDKVIATN PETFEVAEKT VLETYVRETT
ELLHLTIGGE VIKTTFDHPF YVKDVGFVEA GKLQVGDKLL DSRGNVLVVE EKKLEIADKP
VKVYNFKVDD FHTYHVGDNE VLVHNANYVE GDLDGITIIN KKYAGQTYKL SGDLALKYPD
GVKFTNEGFP DFSPYSKKTV KVEGLQGDTY YDFIKANQAA GYKSTPKGYT WHHVEDGITM
MLVPSDLHGA VKHTGGAALI RKGIRP
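
The protein sequence can be reproduced from the gene sequence by codon-wise translation. The sketch below is a minimal illustration, not the database's own method. It relies on the fact that translation table 11 uses the same amino acid assignments as the standard genetic code (the two differ in which codons may act as start codons), and it assumes an ATG start, as is the case for this gene. The names `CODON_TABLE` and `translate` are hypothetical.

```python
BASES = "TCAG"
# Amino acids for all 64 codons, ordered by first, second, third base in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop the 10-base grouping
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 30 bases of the gene sequence.
print(translate("ATGGCAACTA TAAAATTATA TGCCGGTAAA"))  # prints: MATIKLYAGK
```

The output matches the first ten residues of the protein sequence shown above. The final bases of the gene, CCA TAA, translate to P followed by a stop, consistent with the protein ending in P.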