Gene Information |
Locus tag | Cthe_2139 |
Symbol | |
ID | 4811186 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Clostridium thermocellum ATCC 27405 |
Kingdom | Bacteria |
Replicon accession | NC_009012 |
Strand | + |
Start bp | 2539645 |
End bp | 2542593 |
Gene Length | 2949 bp |
Protein Length | 982 aa |
Translation table | 11 |
GC content | 44% |
IMG OID | 640107543 |
Product | alpha-L-arabinofuranosidase B |
Protein accession | YP_001038536 |
Protein GI | 125974626 |
COG category | [M] Cell wall/membrane/envelope biogenesis |
COG ID | [COG5520] O-Glycosyl hydrolase |
TIGRFAM ID | |
|
|
Plasmid Coverage information |
Num covering plasmid clones | 13 |
Plasmid unclonability p-value | 1 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid hitchhiking | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGAATAATT TAAAAAAGTA TACATTGGTC GCTGTATTTG TTTTCCTGAC GGCAGTATGC TTTCAGCATC CTGGAATAAC CAGTGCTGCA ACAACAATTA CCATAGACCC TGATGCAACC TATCAAACCA TCGAAGGCTG GGGTGCAAGC ATATGCTGGT GGGGAAATCA GATTGGCCGG TGGTCTCCCG ATAACAGGAA CAGACTGATT GAGAAAATTG TCAGCCCGAC CGATGGACTG GGATATAATA TATTCAGGTA CAATATCGGC GGCGGAGATA ATCCCGGTCA TAATCATATG AGGGACTATG CTGATATTCA AGGATATCAG AATGCGGACC GTTCGTGGAA CTGGAATGCT GATGCCGCCC AAAGGGCCGT TCTTACCCGG CTGATAGAAA GAGGGAGATA TTACGGATCG GAAATTATAC TTGAAGCGTT TTCCAACTCT CCACCCTACT GGATGACGAA AAGCGGCTGC GCTTCCGGAA CTTCCGACGG TTCCAACAAC TTAAGAGATG ACTGTTATGA CGATTTTGCC GATTATCTTA CCGAGGTTGT AAAACATTTC AGGGATGCAT GGGGGATTAC CTTCAGGACT CTGGAACCCA TGAATGAACC CAATTCAGAC TGGTGGAAGG CCGGTGGCAG GCAGGAAGGC TGCTCCTTTT CCTACGCCAA TCAACAGAGA ATTATTAAAG AGGTTGGAGA AAAATTAAAA GCAAAAGGGT TGACCGGGAC AACTGTCAGT GCTGCCGATG AAGCCAGTAT TGACACGGCC CTGGAAGGGC TTCAGTCCTA CGACGCAACC ACTTTGTCCT ATATGTCCCA GCTTAATGTG CATTCTTATT TCGGCTCGAA AAGAGCCCAG CTCAGAGACC TTGCAAAATC CAAAGGGCTC AGAATCTGGC AGTCGGAATC CGGTCCGTTG AGTTTTAACG GCGATATGGC AGATTCATGT ATAATGCTGT CAAAACGTAT TGTTACCGAC TTGAAAGAAT TACAGTGTGT GGCGTGGCTG GACTGGCAAA TAATAGACGG GGGCAATTGG GGCAGTATAT ATGTGGATGA TGCGTCCCAG ACCTTTACCC TTACCGAAAA ATTCTATATG CATGCAAATT ACAGCAGGTT TATAAGGCCC GGTTATACAA TCATAGGTGC CAATAATGAA AAAACAATTG CGGCAATAAG CCCCGATAAA AAGAAGCTTG TTATCGTCGC AACCAACGAC AATAAATCCT CCAGCGCAAA TTATACATTT AACCTGACAA GGTTTTCAGG CGTAAACAGC ACTGTGGAGG TTTACAGAAC ATCTCCCAGC CTAAGCCTTG CAAAAAGCAT TATAACAGCG TCGAATAAAA TTGTTTCCGA CACTTTGCCT CCGTACTCGA TAAATACATA TGTCATAACC CTTGACGGTG GTGTTGAATC AGTACCGGCG GTCGGGCTTC AATCCTATAA TTATCCCAAC AGATATGTAA GACACGCAGA CTTTGATGCA AGGATAGATG AAAACGTAAC TCCTCTGGAA GATTCACAAT GGAGGCTGGT TCCGGGCCTT GCCAACAGCA GCGAAGGGTA TGTTTCCATC CAGTCGGTAA ACTATCCTGG ATATTACTTA AGACACTGGG ATTATGATTT CAGGCTTGAT AAGAACGACG GAACCACAAT TTTTGCTGAG GATGCAACCT TTAAACTGGT TCCGGGCCTT GCAGACCCTT CCTGTGTTTC TTTTCAGTCA TATAATTATC CCGACAGATA TATAAGGCAC TATGGATACT TGTTGAAACT CGAGAGAATT 
TCAACCGATC TGGACAGGCA GGATGCAACC TTTTTAATAA TCAGCGATGA TTCTCCCGGA CCTATAACAG ATTCAGGGTA TATAATGGCT TATTTCAAAC AGGCGCCGGG CGAGTATGGG CTCAATCTTT GCTACAGTAC GGATGGGCTC CACTGGAGAA ATATAAACGA CGGAAAGCCG GTTTTGTATG CGCAAATGGG AACAAAGGGA ATTAGGGATC CGTATATTTT CAGAAAGCAG GACGGAAAGT TCGGAATAGT TGCAACCGAC ATGCTGGGGA CTAATTGGGG AGATACGAGC CAGTATATTC ATTATTGGGA ATCCGAGGAT TTGATAAACT TTACCGAACG TTTAATAAAA GTGCATAATA AAAGCAACAT GCATGCCTGG GCTCCGGAGG TATTCTATGA CGAGAACAGG AAGCAGTACG GGATATACTG GGCAGGCAAT ACGGATTATA ACCGAATATA TGTAAACTAT ACAACGGATT TTGATACAGT AAGTGACTGT CAAGTGTTTT TTGACCCCGG CTATGATGTA ATTGATGCTC ACATTGTCAG TGACAAAGGA ATGTATTATT TGTTTTTCAA AGACGAGAGA GCCTCGGGAA AAGCCATTAA GGTAGCCAGA TCAAGCTCCT TGACTCCCAA CAGCTTCACT GTATTTACGC CGAATTTTAT AACCTCGCCC AATACGGAGG GACCCTTTGT TTTCAAAGAC AATAATTCCG ACTCATGGTA TATGTATGTT GACATTTATT CCAACAACGG TATTTTTGAA TGCTGGAAGA CAAACGATTT AAATGCCTTA AGCTGGACGA AGGTCACCGG AATCAGTGTA CCGCCGGGAG TAAGACATGG AAGCGTTGTA AAGGTTAATC GATGGGAACT GGAAACTGCC ATTTCCCGAA AAGTAGTTAC TCCACCTGCT CCAACACCGC CACCGGTGCT TAAAGGTGAT GTAAATGCCG ACGGAGTGAT TAACTCATCG GACATAATGG TATTAAAAAG GTTTCTGTTA AGAACAATAA CGCTTACGGA AGAAATGCTT TTAAATGCGG ACACCAATGG AGACGGTGCG GTAAACTCTT CAGACTTCAC ATTGCTAAAA CGGTATATAT TGCGCAGCAT AGATTCTTTC CCGGTTTAA
|
Protein sequence | MNNLKKYTLV AVFVFLTAVC FQHPGITSAA TTITIDPDAT YQTIEGWGAS ICWWGNQIGR WSPDNRNRLI EKIVSPTDGL GYNIFRYNIG GGDNPGHNHM RDYADIQGYQ NADRSWNWNA DAAQRAVLTR LIERGRYYGS EIILEAFSNS PPYWMTKSGC ASGTSDGSNN LRDDCYDDFA DYLTEVVKHF RDAWGITFRT LEPMNEPNSD WWKAGGRQEG CSFSYANQQR IIKEVGEKLK AKGLTGTTVS AADEASIDTA LEGLQSYDAT TLSYMSQLNV HSYFGSKRAQ LRDLAKSKGL RIWQSESGPL SFNGDMADSC IMLSKRIVTD LKELQCVAWL DWQIIDGGNW GSIYVDDASQ TFTLTEKFYM HANYSRFIRP GYTIIGANNE KTIAAISPDK KKLVIVATND NKSSSANYTF NLTRFSGVNS TVEVYRTSPS LSLAKSIITA SNKIVSDTLP PYSINTYVIT LDGGVESVPA VGLQSYNYPN RYVRHADFDA RIDENVTPLE DSQWRLVPGL ANSSEGYVSI QSVNYPGYYL RHWDYDFRLD KNDGTTIFAE DATFKLVPGL ADPSCVSFQS YNYPDRYIRH YGYLLKLERI STDLDRQDAT FLIISDDSPG PITDSGYIMA YFKQAPGEYG LNLCYSTDGL HWRNINDGKP VLYAQMGTKG IRDPYIFRKQ DGKFGIVATD MLGTNWGDTS QYIHYWESED LINFTERLIK VHNKSNMHAW APEVFYDENR KQYGIYWAGN TDYNRIYVNY TTDFDTVSDC QVFFDPGYDV IDAHIVSDKG MYYLFFKDER ASGKAIKVAR SSSLTPNSFT VFTPNFITSP NTEGPFVFKD NNSDSWYMYV DIYSNNGIFE CWKTNDLNAL SWTKVTGISV PPGVRHGSVV KVNRWELETA ISRKVVTPPA PTPPPVLKGD VNADGVINSS DIMVLKRFLL RTITLTEEML LNADTNGDGA VNSSDFTLLK RYILRSIDSF PV
|
| |
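The coordinate and length fields in the record above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive, and that the protein length excludes a single terminal stop codon (the gene sequence shown ends in TAA). The function names are hypothetical and not part of any database API.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming one terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # drop the stop codon


# Values taken from the Cthe_2139 record above.
length = gene_length_bp(2539645, 2542593)
print(length)                     # gene length in bp
print(protein_length_aa(length))  # protein length in aa
```

With the values listed for Cthe_2139, both results agree with the Gene Length (2949 bp) and Protein Length (982 aa) fields.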