Gene Cthe_2139 details


Gene Information

Locus tag: Cthe_2139
Symbol:
ID: 4811186
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2539645
End bp: 2542593
Gene Length: 2949 bp
Protein Length: 982 aa
Translation table: 11
GC content: 44%
IMG OID: 640107543
Product: alpha-L-arabinofuranosidase B
Protein accession: YP_001038536
Protein GI: 125974626
COG category: [M] Cell wall/membrane/envelope biogenesis
COG ID: [COG5520] O-Glycosyl hydrolase
TIGRFAM ID:
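
The start, end, gene length and protein length fields are mutually consistent, which can be checked with a short sketch. This is an illustration rather than anything the source database provides: the function names are hypothetical, and the code assumes inclusive 1-based coordinates and an unspliced CDS whose final codon is a stop codon.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Feature length, assuming inclusive 1-based coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming the last codon is a stop."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


# Coordinates taken from the record above.
length = gene_length_bp(2539645, 2542593)
print(length, protein_length_aa(length))  # 2949 982
```

Both values match the Gene Length and Protein Length fields listed in the record.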


Plasmid Coverage information

Num covering plasmid clones: 13
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAATAATT TAAAAAAGTA TACATTGGTC GCTGTATTTG TTTTCCTGAC GGCAGTATGC 
TTTCAGCATC CTGGAATAAC CAGTGCTGCA ACAACAATTA CCATAGACCC TGATGCAACC
TATCAAACCA TCGAAGGCTG GGGTGCAAGC ATATGCTGGT GGGGAAATCA GATTGGCCGG
TGGTCTCCCG ATAACAGGAA CAGACTGATT GAGAAAATTG TCAGCCCGAC CGATGGACTG
GGATATAATA TATTCAGGTA CAATATCGGC GGCGGAGATA ATCCCGGTCA TAATCATATG
AGGGACTATG CTGATATTCA AGGATATCAG AATGCGGACC GTTCGTGGAA CTGGAATGCT
GATGCCGCCC AAAGGGCCGT TCTTACCCGG CTGATAGAAA GAGGGAGATA TTACGGATCG
GAAATTATAC TTGAAGCGTT TTCCAACTCT CCACCCTACT GGATGACGAA AAGCGGCTGC
GCTTCCGGAA CTTCCGACGG TTCCAACAAC TTAAGAGATG ACTGTTATGA CGATTTTGCC
GATTATCTTA CCGAGGTTGT AAAACATTTC AGGGATGCAT GGGGGATTAC CTTCAGGACT
CTGGAACCCA TGAATGAACC CAATTCAGAC TGGTGGAAGG CCGGTGGCAG GCAGGAAGGC
TGCTCCTTTT CCTACGCCAA TCAACAGAGA ATTATTAAAG AGGTTGGAGA AAAATTAAAA
GCAAAAGGGT TGACCGGGAC AACTGTCAGT GCTGCCGATG AAGCCAGTAT TGACACGGCC
CTGGAAGGGC TTCAGTCCTA CGACGCAACC ACTTTGTCCT ATATGTCCCA GCTTAATGTG
CATTCTTATT TCGGCTCGAA AAGAGCCCAG CTCAGAGACC TTGCAAAATC CAAAGGGCTC
AGAATCTGGC AGTCGGAATC CGGTCCGTTG AGTTTTAACG GCGATATGGC AGATTCATGT
ATAATGCTGT CAAAACGTAT TGTTACCGAC TTGAAAGAAT TACAGTGTGT GGCGTGGCTG
GACTGGCAAA TAATAGACGG GGGCAATTGG GGCAGTATAT ATGTGGATGA TGCGTCCCAG
ACCTTTACCC TTACCGAAAA ATTCTATATG CATGCAAATT ACAGCAGGTT TATAAGGCCC
GGTTATACAA TCATAGGTGC CAATAATGAA AAAACAATTG CGGCAATAAG CCCCGATAAA
AAGAAGCTTG TTATCGTCGC AACCAACGAC AATAAATCCT CCAGCGCAAA TTATACATTT
AACCTGACAA GGTTTTCAGG CGTAAACAGC ACTGTGGAGG TTTACAGAAC ATCTCCCAGC
CTAAGCCTTG CAAAAAGCAT TATAACAGCG TCGAATAAAA TTGTTTCCGA CACTTTGCCT
CCGTACTCGA TAAATACATA TGTCATAACC CTTGACGGTG GTGTTGAATC AGTACCGGCG
GTCGGGCTTC AATCCTATAA TTATCCCAAC AGATATGTAA GACACGCAGA CTTTGATGCA
AGGATAGATG AAAACGTAAC TCCTCTGGAA GATTCACAAT GGAGGCTGGT TCCGGGCCTT
GCCAACAGCA GCGAAGGGTA TGTTTCCATC CAGTCGGTAA ACTATCCTGG ATATTACTTA
AGACACTGGG ATTATGATTT CAGGCTTGAT AAGAACGACG GAACCACAAT TTTTGCTGAG
GATGCAACCT TTAAACTGGT TCCGGGCCTT GCAGACCCTT CCTGTGTTTC TTTTCAGTCA
TATAATTATC CCGACAGATA TATAAGGCAC TATGGATACT TGTTGAAACT CGAGAGAATT
TCAACCGATC TGGACAGGCA GGATGCAACC TTTTTAATAA TCAGCGATGA TTCTCCCGGA
CCTATAACAG ATTCAGGGTA TATAATGGCT TATTTCAAAC AGGCGCCGGG CGAGTATGGG
CTCAATCTTT GCTACAGTAC GGATGGGCTC CACTGGAGAA ATATAAACGA CGGAAAGCCG
GTTTTGTATG CGCAAATGGG AACAAAGGGA ATTAGGGATC CGTATATTTT CAGAAAGCAG
GACGGAAAGT TCGGAATAGT TGCAACCGAC ATGCTGGGGA CTAATTGGGG AGATACGAGC
CAGTATATTC ATTATTGGGA ATCCGAGGAT TTGATAAACT TTACCGAACG TTTAATAAAA
GTGCATAATA AAAGCAACAT GCATGCCTGG GCTCCGGAGG TATTCTATGA CGAGAACAGG
AAGCAGTACG GGATATACTG GGCAGGCAAT ACGGATTATA ACCGAATATA TGTAAACTAT
ACAACGGATT TTGATACAGT AAGTGACTGT CAAGTGTTTT TTGACCCCGG CTATGATGTA
ATTGATGCTC ACATTGTCAG TGACAAAGGA ATGTATTATT TGTTTTTCAA AGACGAGAGA
GCCTCGGGAA AAGCCATTAA GGTAGCCAGA TCAAGCTCCT TGACTCCCAA CAGCTTCACT
GTATTTACGC CGAATTTTAT AACCTCGCCC AATACGGAGG GACCCTTTGT TTTCAAAGAC
AATAATTCCG ACTCATGGTA TATGTATGTT GACATTTATT CCAACAACGG TATTTTTGAA
TGCTGGAAGA CAAACGATTT AAATGCCTTA AGCTGGACGA AGGTCACCGG AATCAGTGTA
CCGCCGGGAG TAAGACATGG AAGCGTTGTA AAGGTTAATC GATGGGAACT GGAAACTGCC
ATTTCCCGAA AAGTAGTTAC TCCACCTGCT CCAACACCGC CACCGGTGCT TAAAGGTGAT
GTAAATGCCG ACGGAGTGAT TAACTCATCG GACATAATGG TATTAAAAAG GTTTCTGTTA
AGAACAATAA CGCTTACGGA AGAAATGCTT TTAAATGCGG ACACCAATGG AGACGGTGCG
GTAAACTCTT CAGACTTCAC ATTGCTAAAA CGGTATATAT TGCGCAGCAT AGATTCTTTC
CCGGTTTAA
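
The gene sequence is printed in blocks of 10 bases, so the whitespace has to be removed before any computation. The sketch below shows one way to do that and to compute a GC fraction. The helper names are hypothetical, and the example runs only on the first line of the sequence; the 44% figure for the full gene comes from the record and is not recomputed here.

```python
def clean_sequence(text: str) -> str:
    """Remove spaces and line breaks from a block-formatted sequence."""
    return "".join(text.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


# First line of the gene sequence above, used as a small example.
first_line = "ATGAATAATT TAAAAAAGTA TACATTGGTC GCTGTATTTG TTTTCCTGAC GGCAGTATGC"
seq = clean_sequence(first_line)
print(len(seq), round(gc_fraction(seq), 2))  # 60 0.33
```

Pasting the full gene sequence into `clean_sequence` would give the input needed to compare against the GC content field.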
 
Protein sequence
MNNLKKYTLV AVFVFLTAVC FQHPGITSAA TTITIDPDAT YQTIEGWGAS ICWWGNQIGR 
WSPDNRNRLI EKIVSPTDGL GYNIFRYNIG GGDNPGHNHM RDYADIQGYQ NADRSWNWNA
DAAQRAVLTR LIERGRYYGS EIILEAFSNS PPYWMTKSGC ASGTSDGSNN LRDDCYDDFA
DYLTEVVKHF RDAWGITFRT LEPMNEPNSD WWKAGGRQEG CSFSYANQQR IIKEVGEKLK
AKGLTGTTVS AADEASIDTA LEGLQSYDAT TLSYMSQLNV HSYFGSKRAQ LRDLAKSKGL
RIWQSESGPL SFNGDMADSC IMLSKRIVTD LKELQCVAWL DWQIIDGGNW GSIYVDDASQ
TFTLTEKFYM HANYSRFIRP GYTIIGANNE KTIAAISPDK KKLVIVATND NKSSSANYTF
NLTRFSGVNS TVEVYRTSPS LSLAKSIITA SNKIVSDTLP PYSINTYVIT LDGGVESVPA
VGLQSYNYPN RYVRHADFDA RIDENVTPLE DSQWRLVPGL ANSSEGYVSI QSVNYPGYYL
RHWDYDFRLD KNDGTTIFAE DATFKLVPGL ADPSCVSFQS YNYPDRYIRH YGYLLKLERI
STDLDRQDAT FLIISDDSPG PITDSGYIMA YFKQAPGEYG LNLCYSTDGL HWRNINDGKP
VLYAQMGTKG IRDPYIFRKQ DGKFGIVATD MLGTNWGDTS QYIHYWESED LINFTERLIK
VHNKSNMHAW APEVFYDENR KQYGIYWAGN TDYNRIYVNY TTDFDTVSDC QVFFDPGYDV
IDAHIVSDKG MYYLFFKDER ASGKAIKVAR SSSLTPNSFT VFTPNFITSP NTEGPFVFKD
NNSDSWYMYV DIYSNNGIFE CWKTNDLNAL SWTKVTGISV PPGVRHGSVV KVNRWELETA
ISRKVVTPPA PTPPPVLKGD VNADGVINSS DIMVLKRFLL RTITLTEEML LNADTNGDGA
VNSSDFTLLK RYILRSIDSF PV