Gene Cthe_2019 details


Gene Information

Locus tag: Cthe_2019
Symbol:
ID: 4810989
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2397727
End bp: 2399574
Gene Length: 1848 bp
Protein Length: 615 aa
Translation table: 11
GC content: 41%
IMG OID: 640107429
Product: hypothetical protein
Protein accession: YP_001038424
Protein GI: 125974514
COG category:
COG ID:
TIGRFAM ID: [TIGR01445] intein N-terminal splicing region
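
The coordinates and lengths in this record can be cross-checked with a little arithmetic. The sketch below assumes that Start bp and End bp are 1-based and inclusive (the record does not say so, but the numbers are consistent with that reading) and that the final codon is a stop codon that is not counted in the protein length. The variable names are illustrative only.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 2397727
end_bp = 2399574

# Assumption: both coordinates are 1-based and inclusive.
gene_length = end_bp - start_bp + 1

# A CDS should be a whole number of codons; the last one is the stop codon.
codons, remainder = divmod(gene_length, 3)
protein_length = codons - 1

print(gene_length, remainder, protein_length)  # 1848 0 615
```

Both results agree with the Gene Length (1848 bp) and Protein Length (615 aa) fields.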


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.0000003618
Plasmid hitchhiking: No
Plasmid clonability: unclonable
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGGCAACTC TAACGTTATA TGCCGGGAAA ATCAACCAAA TGCCCGGATT GATAAATGAA 
GTCAAGAAAT CTGTGGTGGA TTACAAGTCG GAGTTATCAG CATTAAAAAG GAAAACGCTC
AATATCAACA GAAGTGTATG CAATTTGGAT GAAGTAATAA GTTCCATACA GGCATCTTCC
CAGACTCAGG ATAGAAAAAT TGATTCACTT GAGAAATTCT GCAGTGAAAG CGAGAAGTTT
ATATCGGAAG TAGTACGTAT CGATGAAGAA GTTGCTGAGC TTATCAATAA ACGGAAAGAA
AATTTTTACA AAGAATATTA TTATTTAAAA CCGGAAAGCG AGAAAAGCGG CTGGGAAAAA
ATCAAGGACG GCTTAAAGTC GGTTGCGGAG TGGTGTAAAG AGAATTGGAA ATCCATTGCT
AAAATAGTAG TTGCGGCGGT AGTTATTGCA GGATTGGGGA TAGCGGCGGC TTTGACAGGC
GGGATATTGG GAGTCGTACT GGCAGGAGCA TTCTGGGGAG CATTGGCCGG AGGATTGATA
GGGGGAGCGG TTGGAGGAAT AGCCGCTGCG ATAAATGGAG GATCGTTTCT GGAAGGATTT
GCGGACGGGG CATTAAGCGG AGCAATTTCC GGGGCTGTGA CAGGAGCGGC ATGTGCCGGG
CTTGGTGCTT TGGGAGCTCT AGCAGGGAAA AGTATCCAAT GTATGAGCAC AGTGGGAAAA
GCGATAAATG TTACATCAAA GGTTACGGCA GCACTCTCGT TTGGTATGGA TGGATTTGAC
ATGCTGGCAA TGGGAGTATC ATTGTTTGAT CCATCCAATG CATTGGTTGA ATTTAACCGG
AAGCTGCATT CCAATGCATT TTATAACGGA TTCCAGATTG CTGTAAACGC GCTGGCTGTT
TTCACTGCCG GGGCGGCATC TACAATGAAG TGCTTTGTTG CAGGTACAAT GATATTGACT
GTGGCAGGCT TGGTTGCGAT AGAGAATATC AAGGCAGGGG ACAAGGTAAT TGCGACGAAT
CCGGAGACTT TTGAAGTAGC GGAAAAGACA GTGCTTGAGA CATATGTGAG AGATACGACG
GAGCTTTTGC ATTTGGCAAT CAATGGAGAG GTAATCAAGA CAACCTTTGA GCATCCGTTT
TATGTAAAAG ATGTGGGTTT TGTTGAAGCG GGAAAACTGC AAGTAGGAGA TAAGTTGGTT
GATTCAAAAG GCAATCTTTT GGTGGTGGAA GAGAAAAAGC TTGAGATAAC AGATGAACCT
GTTAAGGTTT ATAACTTCAA AGTGGATGAT TTTCATACTT ATCATGTTGG GAAAAAAGGG
ATATTGGTAC ATAATGCAGA CTATAACCCC AAAATGGGAT TTGATGATTT GGACCTTGAG
AAAGCTACGA ACAAACAAAA AGGCAATTAT GGAGAGTATC TGGCAGATGA TAATCTTATT
AATAATCCAA AATTGAAAGA AGCAGGGTAT GATTTGGAGC GGATAGGAGG TAAGGTTCCG
ACTTCACCGG ATGATAAAAT TACAAAAGGG ATAGACGGTA TATATATAAA CAAGAATCCT
AATTCAAATA TTAAATATGT GATTGATGAA GCAAAATTTG GAAAAGCGGG ACTGAGTGCA
AAGACAAGAG ATGGAAAACA AATGTCAGAT TCTTGGTTAA TGGGTTCTCG CTCAAGAGAT
AACAGAATTT TAAAAGCATT GAATAATAAT GAAGAGTTAG CAGATGATAT ATTGGAGGCA
TTGGCAAATA ATCAAGTAGA AAGAATATTG TCAAAAGTAG ATATAAATGG AGAAGTAACA
ACATACAGGC TGGATAGTGA GGGTAATATA ATTGGACTTT GGCCGTAA
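
The gene sequence is printed in space-separated blocks of 10 bases, so the whitespace has to be removed before any computation. As a minimal sketch, the hypothetical helper below (`gc_fraction` is not part of any database tooling) strips the grouping and reports the fraction of G and C bases. It is applied only to the first line of the sequence here; running it on the full sequence would be the way to compare against the GC content field.

```python
def gc_fraction(grouped_dna: str) -> float:
    """Return the G+C fraction of a sequence printed in whitespace-separated blocks."""
    dna = "".join(grouped_dna.split()).upper()
    if not dna:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in dna) / len(dna)

# First line of the gene sequence shown above (60 bases).
first_line = "ATGGCAACTC TAACGTTATA TGCCGGGAAA ATCAACCAAA TGCCCGGATT GATAAATGAA"
print(f"{gc_fraction(first_line):.0%}")  # 40%
```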
 
Protein sequence
MATLTLYAGK INQMPGLINE VKKSVVDYKS ELSALKRKTL NINRSVCNLD EVISSIQASS 
QTQDRKIDSL EKFCSESEKF ISEVVRIDEE VAELINKRKE NFYKEYYYLK PESEKSGWEK
IKDGLKSVAE WCKENWKSIA KIVVAAVVIA GLGIAAALTG GILGVVLAGA FWGALAGGLI
GGAVGGIAAA INGGSFLEGF ADGALSGAIS GAVTGAACAG LGALGALAGK SIQCMSTVGK
AINVTSKVTA ALSFGMDGFD MLAMGVSLFD PSNALVEFNR KLHSNAFYNG FQIAVNALAV
FTAGAASTMK CFVAGTMILT VAGLVAIENI KAGDKVIATN PETFEVAEKT VLETYVRDTT
ELLHLAINGE VIKTTFEHPF YVKDVGFVEA GKLQVGDKLV DSKGNLLVVE EKKLEITDEP
VKVYNFKVDD FHTYHVGKKG ILVHNADYNP KMGFDDLDLE KATNKQKGNY GEYLADDNLI
NNPKLKEAGY DLERIGGKVP TSPDDKITKG IDGIYINKNP NSNIKYVIDE AKFGKAGLSA
KTRDGKQMSD SWLMGSRSRD NRILKALNNN EELADDILEA LANNQVERIL SKVDINGEVT
TYRLDSEGNI IGLWP
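
The protein sequence can be reproduced from the gene sequence by translating it codon by codon. The sketch below is a hand-rolled illustration, not the pipeline that produced this record. It relies on the fact that translation table 11 assigns the same amino acids to codons as the standard code (the two differ only in which start codons are permitted). `translate` is a hypothetical helper, and only the first and last lines of the gene sequence are used as input.

```python
from itertools import product

# Codon table in the conventional TCAG order; amino-acid assignments are
# the same for translation table 11 and the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(grouped_dna: str) -> str:
    """Translate a block-formatted DNA string; '*' marks a stop codon."""
    dna = "".join(grouped_dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

first_line = "ATGGCAACTC TAACGTTATA TGCCGGGAAA ATCAACCAAA TGCCCGGATT GATAAATGAA"
last_line = "ACATACAGGC TGGATAGTGA GGGTAATATA ATTGGACTTT GGCCGTAA"

print(translate(first_line))  # MATLTLYAGKINQMPGLINE
print(translate(last_line))   # TYRLDSEGNIIGLWP*
```

The two outputs match the start and end of the protein sequence above. The final `*` corresponds to the TAA stop codon, which is why the protein is one residue shorter than the codon count. The helper raises `KeyError` on ambiguous bases such as N; a library such as Biopython would be a more robust choice for real work.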