Gene Information |
Locus tag | Cthe_2182 |
Symbol | |
ID | 4810898 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Clostridium thermocellum ATCC 27405 |
Kingdom | Bacteria |
Replicon accession | NC_009012 |
Strand | + |
Start bp | 2599446 |
End bp | 2602235 |
Gene Length | 2790 bp |
Protein Length | 929 aa |
Translation table | 11 |
GC content | 43% |
IMG OID | 640107588 |
Product | Ig-like, group 2 |
Protein accession | YP_001038577 |
Protein GI | 125974667 |
COG category | [G] Carbohydrate transport and metabolism |
COG ID | [COG4632] Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase |
TIGRFAM ID | |
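The coordinate fields above are related by simple arithmetic, which the following minimal sketch checks. It assumes that Start bp and End bp are 1-based inclusive coordinates and that the CDS is complete, unspliced, and ends in a single stop codon. The function names are hypothetical and do not come from the database.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp, assuming 1-based inclusive coordinates (an assumption)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Length in aa, assuming a complete CDS that ends in one stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # the stop codon encodes no residue


# Values copied from the record for Cthe_2182.
length_bp = gene_length(2599446, 2602235)
print(length_bp, protein_length(length_bp))
```

With the values in this record, the result agrees with the listed Gene Length (2790 bp) and Protein Length (929 aa).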
Plasmid Coverage information |
Num covering plasmid clones | 7 |
Plasmid unclonability p-value | 0.951381 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
Sequence |
Gene sequence | ATGAAGTACA TAAGAAAAAG CATTTCCATC TTTGCAGTTT TGGCTCTTGT TTTTATTCTT TCATTTGGTA ATGTCAGCGC CGGTACTATT TATGAATCAA AAACAAAAGA AACAGTCACT TCCGGTGTGA CTTTGGAAAC CATCACAAGA TTTACCGATG ACGGATGGCA GAAAATCAAC GTACTCAGGG TTGACCTTGA GAACCCGAAC GTCAAAGTTG ACACTTTGAT TGATTCTGAG TCCATAAAAA AGCTTACCAA TGTAAAAAAT CTGGCACAAT CAGCCGGTGC AGTGGCTGCT GTAAACGCCG GCTTTTTCAA CTGGCTGAAA GGTGAATCCG GAAAGGGTTA TCCCGACGGA CCCGTAATAA AATCAGGGGA ATTTCTGTCT GTGGATTCCG AATACAACAG ATACAACAAT TCCATGGCAA CCATAGCCAT CGACAAAGAC AACAATGTGT ATTTTGATTT TTGGAAAACG GACATAACTC TGCATGCTCC GGACGGAAGC ACAACCAGAG TTTTTCAGTA CAATAAACCC AGTCCTTTCC AATATACCGA CATAACCATC TGGACCAGTG CCTGGGACAA ATATTCTTTA GGTGTATCCC AGCAATATCC TGACTTGGTG GAAGTAGTGG TCGACAACGG CACTGTTGTG GAAATCCGTC AGGGACTTCC CGCCGTTGAA ATTCCGCAGA ACGGTTATGT AATTATATCA AGAGGTGCAA ATGCGCAGTT TCTTCTGCAG CACTTTAAAG TGGGAGATCC CGTGGAAATC TCATTTTCCA CCGTATTGGA CTGGCAAAAG ATTGAAATGG CTGTAACAGG CAGCGCCATT CTCGTTAAAG ACGGCCAAAT ACCTGAAAAA TTCTCCTATG AAATCTCCGG AGTTCATCCC CGTACTGCCG CAGGAACTTC AAAATCCGGC AAGGAATTAA TACTTGTAAC TGTTGACGGT CGTCAGGCAG CAAGCAAGGG TATGACCCAG CGGGAGCTGG CAAATTTAAT GTTGAGCCTC GGAGCCTATA ATGCCATAAA TCTCGACGGC GGCGGATCCA CTTCAATGGT TTCAAGAATT CCGGGAACCA ATGACTTAAA AGTTGTAAAT ACTCCGTCCG ACGGGGCATT AAGAAGCATT TCCACCGCCA TAGGCGTGTT CTCCGTTGCG CCTCCCTCCG AACTTGCAGG CATGATAATC GAAACACAGC CCAATGTGTT TGTAAATTCG TCAACCACAT TAACCGTCAA AGGTTATGAT AAATACTTCA ATCCTGTATC AGTCGACCCA AACAGCATCA ACTGGAGCGT TTCCGGTATC AAGGGAAGCT TTGCGGGAAA TGTTTTCCGT GCTGAGTCTT CGGGTATCGG CACCATAACG GCAACAGTAA ACGGAATAAA GGCAAGTACA ACCATACGGG CTTTGGAAGC ACCAAACAAA CTTATTTTAA GCACCACAAA GCTCAATCTT TTGAAAGGAA GTTCATATTC TTTTACCGTC AAGGGCGTTG ACAACAACGG CTACAGTTCA TATATAAATT TTGCTGACAT CAACTGGACA GTCAATGGAG ATATAGCAAC CATTGAGAAA AATAAAATTA CTGCCGTAAA ACCCGGCACC GGATATATAG AAGCTGCTTT CGGCGATGCC CGGGCCTATT GTGCCGTATC GGTTGCATCG GAGTCTACGT ACCTTGTAGA CGATTTCGAA AAGAAAAACG GTTCATTTGA AACCGTGCCT GCGAATTTGC CGGGAAGCTA TGAACTTTCT TCAGAAGTAA AAAAATCAGG CAATTACTCC 
GGGAAACTCA GTTATGACTT TTCATATCTC GAGGGTACCA GGGCTGCATA CCTGGTGTTC CCCAACGGAG GCATTGACCT TGACGGCAAC ACTGTCGAAA TTACCATGAT GGTAAACAAT CCCCAGCAAA ACCCCAATTG GCTTAGAGCG GAAGTAATTG ACTCCGGCGG TCAAAAGCAT CTTGTTGATT TCACCAGGGA CCTTACCTGG ACCGGATGGG GCCGGGTATT TGCTTCACTG AGAGACATTA AGTCTCCGGC CAAGCTTACA AAAATATATG TTGTTCAGGT CAATCCTGTA CCTTCTTCCG GTTGTATTTA CATTGACGAC CTCAGTTTAG TCAAGGCAAC CTTCCCGGAA ATTAACGAAA GCACCATTCC AAAAGACGTT GTTCTTTCCG ACAGGGACAA CAAAGAGGCA GCTTTGAACA AAGATTCAAT CAAAATCTCC GTTTTTTCCG GAAAAAGCAA CCCCGAAAAC ATGTTGCAAA AGCTTTTAAA CACAAAATTC TACAACAAAG TTAAGGCTGA CGGTTACATG AAAAGCATTC AGGACATCTC AAACATAAAC ACCAATACCC ACGTGGATAT AGGCGGGACC CGACTGATTA CGCTCAACAC TACTGATAAA AGCATAAGAA CAAGTGCTTC AGGCCAATGG CAGTGGTTCT TTGACAAGCT TAATTCCCAC ACGGGAGATA ATATATTCAT ATTTATGAAA AACTCTCCGG ATACATTCGT TGACCCGCTG GAAGCAAAAT TGTTCAAGGA TATCCTGATT GAACACAAAG AAAAAACCGG AAAGAATATT TGGGTCTTCT ATGGCAACAG CTCTGAAACA TACTATGCCG ACAACGGAAT TAAATACTTT GGTGTTGCAG GATTAAACAT TGGCGGGCTT ACTCCCGACA ATGCGAAAAA CGTAAAATAC ATTGAAATAA CTGTAAATGG AAAAGAAGTA AGCTACCAAT ACAAATCTCC GCTTAAATAA
Protein sequence | MKYIRKSISI FAVLALVFIL SFGNVSAGTI YESKTKETVT SGVTLETITR FTDDGWQKIN VLRVDLENPN VKVDTLIDSE SIKKLTNVKN LAQSAGAVAA VNAGFFNWLK GESGKGYPDG PVIKSGEFLS VDSEYNRYNN SMATIAIDKD NNVYFDFWKT DITLHAPDGS TTRVFQYNKP SPFQYTDITI WTSAWDKYSL GVSQQYPDLV EVVVDNGTVV EIRQGLPAVE IPQNGYVIIS RGANAQFLLQ HFKVGDPVEI SFSTVLDWQK IEMAVTGSAI LVKDGQIPEK FSYEISGVHP RTAAGTSKSG KELILVTVDG RQAASKGMTQ RELANLMLSL GAYNAINLDG GGSTSMVSRI PGTNDLKVVN TPSDGALRSI STAIGVFSVA PPSELAGMII ETQPNVFVNS STTLTVKGYD KYFNPVSVDP NSINWSVSGI KGSFAGNVFR AESSGIGTIT ATVNGIKAST TIRALEAPNK LILSTTKLNL LKGSSYSFTV KGVDNNGYSS YINFADINWT VNGDIATIEK NKITAVKPGT GYIEAAFGDA RAYCAVSVAS ESTYLVDDFE KKNGSFETVP ANLPGSYELS SEVKKSGNYS GKLSYDFSYL EGTRAAYLVF PNGGIDLDGN TVEITMMVNN PQQNPNWLRA EVIDSGGQKH LVDFTRDLTW TGWGRVFASL RDIKSPAKLT KIYVVQVNPV PSSGCIYIDD LSLVKATFPE INESTIPKDV VLSDRDNKEA ALNKDSIKIS VFSGKSNPEN MLQKLLNTKF YNKVKADGYM KSIQDISNIN TNTHVDIGGT RLITLNTTDK SIRTSASGQW QWFFDKLNSH TGDNIFIFMK NSPDTFVDPL EAKLFKDILI EHKEKTGKNI WVFYGNSSET YYADNGIKYF GVAGLNIGGL TPDNAKNVKY IEITVNGKEV SYQYKSPLK
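The sequences above are printed in space-separated blocks of ten characters, so they need cleaning before any computation. The sketch below strips the whitespace, computes a GC fraction, and translates codons with the standard amino acid assignments. Translation table 11 uses those same assignments and differs from the standard code only in which codons may start translation. All function names are hypothetical, and the input is only the first three blocks of the gene sequence, not the full record.

```python
from itertools import product

# Standard codon-to-amino-acid assignments in NCBI order (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean_sequence(text: str) -> str:
    """Remove the block spacing and line breaks used in the record."""
    return "".join(text.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def translate(seq: str) -> str:
    """Translate complete codons from position 0, stopping at a stop codon."""
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First three blocks of the gene sequence from the record.
dna = clean_sequence("ATGAAGTACA TAAGAAAAAG CATTTCCATC")
print(len(dna), gc_fraction(dna), translate(dna))
```

For these 30 bases the translation is MKYIRKSISI, which matches the first block of the listed protein sequence. The GC fraction of such a short window is not expected to match the 43% reported for the whole gene. Running the same functions on the full gene sequence would be the way to check that figure.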