Gene Cthe_0004 details

Gene Information

Locus tag: Cthe_0004
Symbol:
ID: 4808817
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 3874
End bp: 9702
Gene Length: 5829 bp
Protein Length: 1942 aa
Translation table: 11
GC content: 39%
IMG OID: 640105414
Product: YD repeat-containing protein
Protein accession: YP_001036439
Protein GI: 125972529
COG category: [M] Cell wall/membrane/envelope biogenesis
COG ID: [COG3209] Rhs family protein
TIGRFAM ID: [TIGR01643] YD repeat (two copies)
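
The start, end, gene length, and protein length listed above are mutually consistent. A minimal sketch of that arithmetic follows, assuming the coordinates are 1-based and inclusive and that the unspliced CDS includes its stop codon; the function names are illustrative and not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS whose final codon is a stop codon (assumed)."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per residue, minus the terminal stop codon.
    return gene_len_bp // 3 - 1


length = gene_length(3874, 9702)
print(length)                  # 5829
print(protein_length(length))  # 1942
```

Both results match the Gene Length and Protein Length fields of this record.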


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGCAAATAA AAATAACTCC TGAAGAAATG ACAAGAATAG CAAGTAATAT AAAAGAAGTG 
TCAAAGAAAT TTGAAGATAT AACTCGTGAG GTAAAAAATA TAGTAAATTA CATTGATTGG
GAACTTAGAA GTAAAGAGGG AATAGAACAA AAACTATTAA TAGCAGATAC AGCAGCAAAA
AATATAGCCC AGGACTTAGA CAGAATGTCA CGGGATCTTA TAAGTGCAAG GGACAAAATG
ATGGAAGCAG AAGACAAGGC TGCAATTGCA GCAAGAAAAA AGAGAAATAT AAACTTCGAA
AATTTAATTC ATGATGCTCT TGAGGTACTT TTTAGGCCAC CAGTCGGAGC AATATTCAGG
TTATTTAACT ATTTGCCGTG GGATAGACTG GTAGGCTCCG GAACCGCAAA TTGTCCCAAC
ACTTTTGCGG GGGACCCTGT AAACGTGGTA TCGGGAAATT TTTATTTAAC AAGAAGAGAT
ATAACCATAC CTTCAAGAGG TATGGCGCTT GAGATAACCA GATATTACAA CTCCATGGAT
AATACGCCAG GTATGTTCGG TAAAGGATGG AAAACAGATT ATGAAACATG CCTGAAGAAA
AAGGAAGACA GTGAGGACAT AATAGTGATG TATCCGGGAG GGAGCATAAG GATATTCGAA
CATACGGGGT CAGGAACTTT CAAATCTCCC AAAGGTGTGT ACGATACACT TTTTAAAACA
GAGGATGAAA TGTACATATT GAAGGTTCAA AAGGGGATTA CCTACAAATA TGACCAGGCT
GGAAGCCTTG TATCAATCTC GGATTCAAAC AATAACGAGA TAAGATTTAA ATATAACCGG
GAGGGATTGC TGTCTGCCAT AATGTCACCG AGGGGAAAAC TTTTGATGTT TTCCTATGAA
AGCGACAGGG TTGGCAGCGT AACTGACCAC ACGGGAAGGA AGCTGAGATA TAAATACGAT
GAAAAAGGAA ATCTGATACA GGTAATATAC CCTGACGGAG GAAAAATTAC CTATGCGTAC
GATAACATAG GGCTAATTTC AATAACCGAC CAGAATGGCA ACACCTATGT TCAAAATACA
TATGATGAAA AAGGCAGAGT AGTAAAACAG CTTGACCATG AGAATAATGA GTTGATTATA
GAATATGACG AAGAAAATCG TGAAAATACT TTCAAATGGA CGAAAAGCGG TATAACCCGT
GTGTATAAGT ACAATGAGAA GATGCTTCTG ACCGAAATAA GGTATGATGA CGGAAGTGTG
CAAAAATATA CCTATGATGA GAATTTCAAC AGAAACAGTG AGACGGACAG AAATGGCAAC
ACGACGTACA GGAAATATGA TGACAAGGGC AATTTAATAG AAGTAATTTC ACCGGAGCCT
TTCTGCTATA AGACAAAATA CAGCTATGAC GAGGAAGGCA GACTGATAAA GGTAGTGTCA
CCGGGCGGGG GAGAAGTGTC TTTTGAATAT GACGAAAGGG GAAACCTTTT AAAACGCATT
GTAAAGACCG GAAGCAGAAG TTATTCGGAG TGGGCATATA CGTATGATCA ATATGGAAGA
ATGACAACAT CAAAAGATGC GGAAAACAAC ACGAAGACCT TTGAGTATGG AGAAGAAGAT
GTAAACAAAC CGACATTGAT AAAAGATGCG GTAGGGAACA TATTTAAATA TGAATTTGAC
AAAGTGGGTC GGGTAGTGGC CACAACCACA CATTACGGAA CAGTAAGGTT AAAATATGAT
GAGTGTGACC GGATAACCCA CATAACCGAT ACAGAAGGGA ACACAACAAG AATCTGCTAT
GACAAGGCAG GAAACATGAC AAAGGTTATA GCGCCGAAGC AGTATGGGGA GAAAGGCGAA
AACGGGTCAG GATATGCATT TGAATATAAT GCAATGGACA AGCTTATAAG GACAATTGAC
CCGTTGGGCA ATGTTTTTGC GGTAAAATAT GATGAGAACG GCAACAAGAT AAAAGAGATC
AACCCGAACT ACTATAGTTC TGAGAAAGAT GACGGTATAG GAATAGAATA CAAATATGAC
ACCAACCACC GCAGGATAAG CACGATATTC CCGGACGGAA GCATGTCGAG GATAAAGTAT
GACGCGGAAG GCAATATAAT AAAGACGATA TCCTGGAAGG ATTATAACAA GGATTTGGAT
GACGGGCCGG GGATGGAGTA TACCTATGAT GAAATGAACA GGCTTACGCA AATAATAGAC
CCGAAAGGGA ATGTAATAAA GAAATACATA TACGATGAAG ACGGAAGAAT CGTGAAAGAA
ATAGATGCAA AAGGATATAG CAGTGCAGAT AACGATGAAG AACGTTGGGG TACAATATAT
AAATACAACC TTGCCGGATG GCTTGTTGAG AAGAGGACAC CGTTACAGCA GAAAAATGGT
GAAATATATT ACAACATAAT AGAATATGTG TATGACAGAA ACGGAAGGGT AGTACAGGAG
AAAAGATCTC CGGAATATGT GACCAGGACA GGATATCCAA AGAAATGGAA CATAATAAAC
TATAAATATG ATCCCAACGG AAATCTGATA GAGGTAACCG ACAGCCTTGG AGCAGTGATA
ACCTATGAAT ATGACTGCTT TGGCAAGAGA ACACTGGAGA GAATGAAAAT AAACGACAGG
AAGCAAAGAG TAACCAGATA TGAATATAAT GGAGTGGGAA AATTAACGAG AGTAATACGT
GAATTGGACG GAGAAGACCT TTCAGGATAC AGTGAAGATA AGGTTTTGGC GGAGACAATA
TACAATTACG ACCCAAACGG CAATCTTATA GAGGTGATTT CTCCTGAAGG GTATTTGACT
GTATTTAAAT ACGATGATGC AAACCGTAGG ATAAAGAGTA TATTGTATCA ACCGCAGAAC
GGTGTGAAAC TGAGCGGCAG TGCGTATTGT GCCCTTTTAA ATACAAAGTC GAGGAGCATA
AGTTATGAGT ATGACCGGGC AGGGAACCTT GTAAGAGAGA TATTGCCGAA CGGTGGCGTC
ATAATAAACG AATATGATGA AATGAACAGA AGAATAAGAG TTACCGACCC TGACGGAAAC
ACCAGAAGGA TTTTCTATGA CAATTCAGGG AACGTCGTAA AATATGTTAA TCCGGAGAAT
TACGATCCGG AGAAAGATGA CGGAACAGGT ACCACATACC TTTATGACTC AATGAACCGT
CTTATAGAAA TAGTAAATGC AGCGGGTATA GTAGTGGAAA GGAATATATA CAACACAGCG
GGAGAGATAA TCAAGAGGAT AGACTCAGTT GGTTATAGTT CCGCAGATAA TGACAATGAC
AGGCATGGAG TTGAATTTAG TTATGACCTG GCGGGACGTT TGGAGGAGAT AACAACGCCG
GAAGCGAAGA TTCATGGCCG AAAGAGTCAG AAATACACAT ATGACGCAGA AGGAAACATA
ACAGGAGTAG TTGACGGAAA CGGAAACAGC ACAAGGTACA GTTTGGACTT ATGGGGTAAG
ATAATAAACA TAACGGAACC TGACGGAACC AATATAAAAT ACGATTATGA CTATGCGGGA
AATCTTGTAT CCACTACTGA CGGTAACGGA AACACTACCC GTTATACATA CAACAGCTTT
AACCTTCTGT CGGAGATAAT AGATCCTGAC GGAAGGAAAA TAACCTTCAA GTATGACAGA
CAGGGAAGAA TGGTGCAAAG GATAGGGAAA GACGGACGCA GCACATATTA TAATTACAAT
GCGGATAACA ATATAACCGG GCGTTGGGAA GAAGAAGGGC AGATGGAAAA ATACGAGTAT
AATGTAGACG GAAGCCTGGC TGCGTCAATA AGCGGTACTA CTATACATAC TTATGCCTAT
ACCTTGGCAG GAAGGCTGAA AAGTAAGACA ACCAATGGAC AGAAGGTATT GGAGTATGAT
TACAATAAGA ATGGGCTTGT ATCGAAAGTT ACCGATATAA GTGGAACACC GGTGGAGTAT
ACATATGACG TACTGGGGAG ATTAACAACG GTAACAAACG GAGGCAAAGT TTCTGCGAGG
TATGAATACA ATATTGACAA CACAATAGCG CAGGTGTTGT ATGGGAGTGG AGTATGCGCT
AGATATGAAT ACGACATGGA TAAGAAGATA AAAGAGCTTT TAAACATAGA CCCGACAGGG
AAAGAAATGT TTGTATACAG GTATGCGTAT GATGGGAACG GCAATCCAAT TTTGAAAGAA
GAGAACGAAA AAGTAACGGC TTACAGTTAT GATACGCTGA ACCGTTTGAA AGAAGTAGTA
TACCCTAGGA ATATAAGAGA GAGGTTTGAA TATGATGCGA ACGGCAACAG GATCAAAAGA
GAATGTGGAG ATATACTGGA ACAGTATGAA TATGACAGTT GCAATAGATT GGTTCAAAGA
ATAAAGAACG GGCTGTTAAC GGAATATGAG TATGATGCGA GGGGAAATTT GATAAAAGAA
AAAGAGGGTG AGTTGACTAA ATTATACAGC TATGACGGAT TTGACAGACT GATACGTGTA
CAAAATCCGG ACGGAACATA TATGGAAAAT ATATACGATG CCGAGAATTT GAGAACGGTC
TCGATAGAAA ACGGTAGGTA CAACAGGTAT GTGTACAACG GAAGAAATAT AGCGTGTGAA
GTAGACGAGG ATTGGAGTCT AAAAGACAGA ATAGTCTTTG GGCATACGAT ATTACAAAGA
GAAGACAGTG ACAAGAATGA GTATTATTAT ATTCACAATG CCCATGGGGA TATTACAGCT
CTTACCGATG GGAAAGGAGA AGTAATAAAC AGCTACAGTT ACGATGCTTT TGGAAATATA
TTGGACAGTG TTGAGAAGAT AGAGAACAGA TTCAAGTATT CGGGAGAAGT GCTTGATCCT
GTGACGGGAC AGTACTACCT GAGAGCGAGA TATTATAACC CAAGCATAGG AAGGTTTATG
CAGGAAGATA CGTTTAGGGG TGACGGACTA AACTTATATA CCTATGTTGC CAACAATCCA
CTAAAGTACG TTGACCCAAC CGGTCATTGT AAAGAGAGTG TTGATTTTAG TGATGTATAT
GACAACATTT TAAGTGAAAA TCCTAATGAC ATTTTGAAAG ATATGGTATT AAGAAAAATA
ATGGGACCGG ACTGGACTCC AAACTATTTG AGAAATTCAA ATGAAATAGA GGATTATAAG
TTTAAGGCAG TAATATATCT AAATAGAAGT GACGGTGCTT TTTTGCAGGG ACATTCAGCG
ATAATGCTTG TAACCGATGA TAACCAAGGG TTGTTCTATA GCTTTGTGGG TGATGCAAGT
AAAACACTTC AGCTTATAGC AGGATTTAAT TCGCCGGGGA AAATATTAAA ACCATTTGAT
AAACGAAAAA AACAATATGT TACAGTCGAC GTTATTAGTT TTTTAGAAAA AGGAAGAATT
GAAAATGTGG AAAAGTTTAA AGGGAATGAC ACATTTACTG ATCAATATGA TAGATATATT
TATATACCTA TAACAAATGA GCAGGGGCAG GCTATGTACA GGAAAGCGGA GATGTTATAT
AAACAGCCTC CGGAGTATAA TCTCTATGCA AATAATTGTA ATCATGTAGC ACAACAAATA
TTGGAAGCAG GAGGTTTGAA TTTTGCACCG ACAAAAGGGA ATGCATTAGA TGAACGTATA
AATTTATATA CATCTTTTAA TCCTCTTGTT CATTTACCGC CAAGAGCAAT GCTGAATTAT
ATTTTTGACA GGGTAGACAA GACGATACCT AATGCTGCAT ATAATTACGG CGCTTTTATT
GCGAACAAGC AAGGTTGGAT TGCGGGAAAT ACGGATGATA GTGCATTTTT TTGGAATCCG
TTTAAGTAA
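
The GC content field can be recomputed from a sequence laid out in this format. The sketch below assumes GC content means the fraction of G and C bases over all bases, and that whitespace (the spaces between 10-base groups and the line breaks) is purely formatting; `gc_fraction` is a hypothetical helper name.

```python
def gc_fraction(seq_block: str) -> float:
    """Fraction of G/C bases in a sequence block, ignoring whitespace and case."""
    seq = "".join(seq_block.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in seq if base in "GC")
    return gc_count / len(seq)


# First line of the gene sequence above, used here only as sample input.
first_line = "ATGCAAATAA AAATAACTCC TGAAGAAATG ACAAGAATAG CAAGTAATAT AAAAGAAGTG"
print(f"{gc_fraction(first_line):.1%}")
```

Passing the full gene sequence block instead of a single line gives the value to compare against the GC content field.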
 
Protein sequence
MQIKITPEEM TRIASNIKEV SKKFEDITRE VKNIVNYIDW ELRSKEGIEQ KLLIADTAAK 
NIAQDLDRMS RDLISARDKM MEAEDKAAIA ARKKRNINFE NLIHDALEVL FRPPVGAIFR
LFNYLPWDRL VGSGTANCPN TFAGDPVNVV SGNFYLTRRD ITIPSRGMAL EITRYYNSMD
NTPGMFGKGW KTDYETCLKK KEDSEDIIVM YPGGSIRIFE HTGSGTFKSP KGVYDTLFKT
EDEMYILKVQ KGITYKYDQA GSLVSISDSN NNEIRFKYNR EGLLSAIMSP RGKLLMFSYE
SDRVGSVTDH TGRKLRYKYD EKGNLIQVIY PDGGKITYAY DNIGLISITD QNGNTYVQNT
YDEKGRVVKQ LDHENNELII EYDEENRENT FKWTKSGITR VYKYNEKMLL TEIRYDDGSV
QKYTYDENFN RNSETDRNGN TTYRKYDDKG NLIEVISPEP FCYKTKYSYD EEGRLIKVVS
PGGGEVSFEY DERGNLLKRI VKTGSRSYSE WAYTYDQYGR MTTSKDAENN TKTFEYGEED
VNKPTLIKDA VGNIFKYEFD KVGRVVATTT HYGTVRLKYD ECDRITHITD TEGNTTRICY
DKAGNMTKVI APKQYGEKGE NGSGYAFEYN AMDKLIRTID PLGNVFAVKY DENGNKIKEI
NPNYYSSEKD DGIGIEYKYD TNHRRISTIF PDGSMSRIKY DAEGNIIKTI SWKDYNKDLD
DGPGMEYTYD EMNRLTQIID PKGNVIKKYI YDEDGRIVKE IDAKGYSSAD NDEERWGTIY
KYNLAGWLVE KRTPLQQKNG EIYYNIIEYV YDRNGRVVQE KRSPEYVTRT GYPKKWNIIN
YKYDPNGNLI EVTDSLGAVI TYEYDCFGKR TLERMKINDR KQRVTRYEYN GVGKLTRVIR
ELDGEDLSGY SEDKVLAETI YNYDPNGNLI EVISPEGYLT VFKYDDANRR IKSILYQPQN
GVKLSGSAYC ALLNTKSRSI SYEYDRAGNL VREILPNGGV IINEYDEMNR RIRVTDPDGN
TRRIFYDNSG NVVKYVNPEN YDPEKDDGTG TTYLYDSMNR LIEIVNAAGI VVERNIYNTA
GEIIKRIDSV GYSSADNDND RHGVEFSYDL AGRLEEITTP EAKIHGRKSQ KYTYDAEGNI
TGVVDGNGNS TRYSLDLWGK IINITEPDGT NIKYDYDYAG NLVSTTDGNG NTTRYTYNSF
NLLSEIIDPD GRKITFKYDR QGRMVQRIGK DGRSTYYNYN ADNNITGRWE EEGQMEKYEY
NVDGSLAASI SGTTIHTYAY TLAGRLKSKT TNGQKVLEYD YNKNGLVSKV TDISGTPVEY
TYDVLGRLTT VTNGGKVSAR YEYNIDNTIA QVLYGSGVCA RYEYDMDKKI KELLNIDPTG
KEMFVYRYAY DGNGNPILKE ENEKVTAYSY DTLNRLKEVV YPRNIRERFE YDANGNRIKR
ECGDILEQYE YDSCNRLVQR IKNGLLTEYE YDARGNLIKE KEGELTKLYS YDGFDRLIRV
QNPDGTYMEN IYDAENLRTV SIENGRYNRY VYNGRNIACE VDEDWSLKDR IVFGHTILQR
EDSDKNEYYY IHNAHGDITA LTDGKGEVIN SYSYDAFGNI LDSVEKIENR FKYSGEVLDP
VTGQYYLRAR YYNPSIGRFM QEDTFRGDGL NLYTYVANNP LKYVDPTGHC KESVDFSDVY
DNILSENPND ILKDMVLRKI MGPDWTPNYL RNSNEIEDYK FKAVIYLNRS DGAFLQGHSA
IMLVTDDNQG LFYSFVGDAS KTLQLIAGFN SPGKILKPFD KRKKQYVTVD VISFLEKGRI
ENVEKFKGND TFTDQYDRYI YIPITNEQGQ AMYRKAEMLY KQPPEYNLYA NNCNHVAQQI
LEAGGLNFAP TKGNALDERI NLYTSFNPLV HLPPRAMLNY IFDRVDKTIP NAAYNYGAFI
ANKQGWIAGN TDDSAFFWNP FK