Gene Cthe_2854 details

Gene Information

Locus tag: Cthe_2854
Symbol:
ID: 4809134
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 3371390
End bp: 3373237
Gene Length: 1848 bp
Protein Length: 615 aa
Translation table: 11
GC content: 40%
IMG OID: 640108274
Product: hypothetical protein
Protein accession: YP_001039246
Protein GI: 125975336
COG category:
COG ID:
TIGRFAM ID: [TIGR01445] intein N-terminal splicing region
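
The start and end coordinates, gene length, and protein length listed above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the coordinates are 1-based and inclusive and that the final codon of the CDS is a stop codon.

```python
# Coordinates copied from the record above.
# Assumption: both are 1-based and inclusive.
START_BP = 3371390
END_BP = 3373237


def gene_length(start: int, end: int) -> int:
    """Length in bp of a feature with 1-based inclusive coordinates."""
    return end - start + 1


def protein_length(length_bp: int) -> int:
    """Residues encoded by an unspliced CDS whose last codon is a stop."""
    codons, remainder = divmod(length_bp, 3)
    if remainder:
        raise ValueError("CDS length is not a multiple of 3")
    return codons - 1  # the stop codon does not encode a residue


length_bp = gene_length(START_BP, END_BP)
print(length_bp)                  # 1848
print(protein_length(length_bp))  # 615
```

Both values agree with the Gene Length and Protein Length fields of the record.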


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.00000844037
Plasmid hitchhiking: No
Plasmid clonability: decreased coverage
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGGCAACTA TAACGTTATA TGCCGGAAAA ATCAACCAAA TGCCCGGATT GATAAATGAA 
GTCAAGAAAT CTGTGGTGGA TTACAAGTCA GAATTATCCG CATTGAGAAA GAAAACTTTG
AACATCAACA GAAGTGTATG TAATTTGGAT GAAGTAATAA GTTCCATACA GGCATCTTCC
CAGACTCAGG ATAGAAAAAT TGATTCACTT GAGAAATTCT GCAGTGAAAG CGAGAAGTTT
ATATCGGAAG TAGTACGTAT CGATGAAGAA GTGGCTGAGC TTATCAATAA ACGGAAAGAA
AATTTTTACA AAGAATATTA TTATTTAAAA CCGGAAAGCG AGAAAAGCGG CTGGGAAAAA
ATCAAGGACG GCTTAAAGTC GGTTGCGGAG TGGTGTAAAG AGAATTGGAA ATCCATTGCC
AAGATAGTGG CTGCCGCAGT AGTTATTACC GGGTTGGGGA TAGCGGCGGC ATTGACAGGC
GGTATATTGG GAGTCATACT GGCAGGAGCA TTCTGGGGAG CATTGGCCGG AGGATTGATA
GGGGGAGCGG TTGGAGGAAT AGCCGCTGCG ATAAATGGAG GATCGTTTCT GGAAGGATTT
GCGGACGGCG CTTTAAGCGG AGCAATTTCC GGAGCGGTGA CAGGAGCGGC ATGTGCCGGG
CTTGGTGCTT TAGGAGCTCT AGCAGGGAAA AGCATCCAAT GTATGAGCAC AGTGGGAAAA
GCGATAAATG TTACATCAAA GGTTACGGCA ACACTTTCTT TTGGTATGGA TGGATTTGAC
ATGCTGGCAA TGGGAATATC ATTGTTTGAT CCATCCAATG CATTGGTTGA ATTTAACCGG
AAGCTGCATT CCAATGCACT TTATAACGGA TTCCAGATTG CTGTAAACGC GCTGGCTGTT
TTCACTGCCG GGGCGGCATC GACAATGAAG TGCTTTGTTG CAGGTACAAT GATATTGACT
GTGGCAGGCT TGGTTGCGAT AGAGAATATC AAGGCAGGGG ACAAGGTAAT TGCGACGAAT
CCTGAAACTT TTGAAGTAGC GGAAAAGACG GTGCTTGAGA CATATGTGAG AGAGACAACG
GAGCTTTTGC ATTTGACAAT CAATGGAGAG GTAATCAAGA CAACCTTTGA GCATCCGTTT
TATGTAAAAG ATGTGGGTTT TGTTGAAGCG GGAAAACTGC AAGTAGGAGA TAAGTTGGTT
GATTCAAAAG GCAATCTTTT GGTGGTGGAA GAGAAAAAGC TTGAGATAAC AGATGAACCT
GTTAAGGTTT ATAACTTCAA AGTGGATGAT TTTCATACTT ATCATGTTGG GAAAAAAGGG
ATATTGGTAC ATAATGCAGA CTATAACCCC AAAATGGGAT TTGATGATTT GGACCTTGAG
AAAGCTACGA ACAAACAAAA AGGCAATTAT GGAGAGTATC TGGCAGATGA TAATCTTATT
AATAATCCAA AATTGAAAGA AGCAGGGTAT GATTTGGAGC GGATAGGAGG TAAGGTTCCG
ACCTCACCGG ATGATAAAAT TACAAAAGGG ATAGACGGTA TATATATAAA CAAGAATCCT
AATTCAAATA TTAAATATGT GATTGATGAA GCCAAATTTG GAAAAGCAGG ACTGAGTGCA
AAGACAAGAG ATGGAAAACA AATGTCAGAT TCTTGGTTAG TGGGTTCTCG CTCAAGAAAT
AATAGAATTT TAAAAGCAGT AAGTAATAAT GAAGATTTAG CATTTGACAT AGTGAAAGCA
TTAAGAAATA ACCAAGTAGA AAGAGTATTA TCAAAGATAG ATGTAAATGG AAAAATAATA
ACATATAGAC TGGATAGCAA TGGTAATATA ATTGGACTTT GGCCTTAG
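
The gene sequence above is printed in space-separated blocks of ten bases, so the whitespace has to be stripped before the sequence can be measured. The following is a minimal sketch of that cleanup together with a GC-content calculation. The helper names are hypothetical, and only the first line of the sequence is used as input, so the result is not expected to match the whole-gene GC content reported in the record.

```python
def clean_sequence(text: str) -> str:
    """Remove spaces and line breaks from a blocked sequence and upper-case it."""
    return "".join(text.split()).upper()


def gc_percent(text: str) -> float:
    """Percentage of G and C bases in a (possibly blocked) DNA sequence."""
    seq = clean_sequence(text)
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# First line of the gene sequence, copied from the record above.
FIRST_LINE = (
    "ATGGCAACTA TAACGTTATA TGCCGGAAAA ATCAACCAAA TGCCCGGATT GATAAATGAA"
)

print(len(clean_sequence(FIRST_LINE)))
print(round(gc_percent(FIRST_LINE), 1))
```

Passing the full 1848 bp sequence to `gc_percent` instead of `FIRST_LINE` would give the figure to compare against the GC content field.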
 
Protein sequence
MATITLYAGK INQMPGLINE VKKSVVDYKS ELSALRKKTL NINRSVCNLD EVISSIQASS 
QTQDRKIDSL EKFCSESEKF ISEVVRIDEE VAELINKRKE NFYKEYYYLK PESEKSGWEK
IKDGLKSVAE WCKENWKSIA KIVAAAVVIT GLGIAAALTG GILGVILAGA FWGALAGGLI
GGAVGGIAAA INGGSFLEGF ADGALSGAIS GAVTGAACAG LGALGALAGK SIQCMSTVGK
AINVTSKVTA TLSFGMDGFD MLAMGISLFD PSNALVEFNR KLHSNALYNG FQIAVNALAV
FTAGAASTMK CFVAGTMILT VAGLVAIENI KAGDKVIATN PETFEVAEKT VLETYVRETT
ELLHLTINGE VIKTTFEHPF YVKDVGFVEA GKLQVGDKLV DSKGNLLVVE EKKLEITDEP
VKVYNFKVDD FHTYHVGKKG ILVHNADYNP KMGFDDLDLE KATNKQKGNY GEYLADDNLI
NNPKLKEAGY DLERIGGKVP TSPDDKITKG IDGIYINKNP NSNIKYVIDE AKFGKAGLSA
KTRDGKQMSD SWLVGSRSRN NRILKAVSNN EDLAFDIVKA LRNNQVERVL SKIDVNGKII
TYRLDSNGNI IGLWP
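
The protein sequence can be reproduced from the gene sequence by translating it codon by codon. The sketch below is an illustration, not the database's own procedure. It relies on the amino-acid assignments of NCBI translation table 11 being the same as those of the standard code, and it does not handle the alternative start codons that table 11 allows (this gene starts with ATG, so that does not matter here). The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Codon order follows the NCBI convention: bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()  # drop the 10-base block spacing
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First three blocks and the last line of the gene sequence from the record.
print(translate("ATGGCAACTA TAACGTTATA TGCCGGAAAA"))  # MATITLYAGK
print(translate("ACATATAGAC TGGATAGCAA TGGTAATATA ATTGGACTTT GGCCTTAG"))  # TYRLDSNGNIIGLWP
```

Each printed line of the gene sequence holds 60 bases, a multiple of three, so every line starts in frame and can be translated on its own as shown.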