Gene Cthe_0174 details


Gene Information

Locus tag: Cthe_0174
Symbol:
ID: 4808662
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 209822
End bp: 211714
Gene Length: 1893 bp
Protein Length: 630 aa
Translation table: 11
GC content: 38%
IMG OID: 640105585
Product: sulfatase
Protein accession: YP_001036608
Protein GI: 125972698
COG category: [M] Cell wall/membrane/envelope biogenesis
COG ID: [COG1368] Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily
TIGRFAM ID:
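
The Start bp, End bp, Gene Length, and Protein Length fields above are related by simple arithmetic. The following is a minimal sketch of that relationship, assuming the coordinates are 1-based and inclusive and that the gene length includes the stop codon; the record does not state either convention, and the function names are hypothetical.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Residues encoded by an unspliced CDS, assuming the final codon is a stop."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length_bp(209822, 211714)
print(length, protein_length_aa(length))  # prints: 1893 630
```

Under these assumptions the computed values agree with the 1893 bp and 630 aa reported in the record.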


Plasmid Coverage information

Num covering plasmid clones: 11
Plasmid unclonability p-value: 0.326693
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAAGATCA GTATAGAAAG AGCAAGAAGC AATATTTTTA CCAATACAAG ACCGCAGCTG 
GATGTGTTTG GAATAATTTA TACTGTGCTT TTTGTATGTT CCATTGTATT TAAAGGGGTG
TTTCTCCAAT TTCAGAACCA AATTAATTTC AAACCTCTTT TTTCAACCAC AAATATTTTC
ATGTTTGTTG CTTCAATGTC TTTTACATTG GTACTGGCGG CTTTGTTGAC CGTTTTTCAC
ACAAAGAGAA GAGTGTTGTT TTTCATATCC AACATTTTAA TGTCGGTTTT GCTTCTTTCT
GATGCTCTGT ATTTGCGCTA TTACAACACG ATAATAACAA TACCGGTAAT TTATAATGCC
CGATATTTGG GGCCGGTCAG AGAGAGTATC ATGAGTCTTT TCAGGTTTAG CGACATATTC
TATTTTTTGG ATATTCCTGT TTTTGCAGTA ATGTCGTTTA TATTTTCCAA ACGGGCTGAA
CAGAACAAGC TTCCGTTGCT GAAAAGATGC GTAGTGGCCG CAGTGCTGAT GGTAGTAGCT
TTTGGCTCTT TTAAAATAGC ATACAGCAAA AATGACATGT CCGAGTACGA CAACAATTAT
ATTGTGAAGA ACTTTGGCAT AGGTTATTTC CACTATTATG ATGTGAAAAA ATATTTAAAG
GAAAATTATC TTAGGGATAA AAAACTTAGA ACTGAGGAGA AAAATGAACT GACATCCTTC
TTTGAAGAAA AAAACAAGGA AAAAGCCGCA CTTTCCAATA GATTTAAGGG AATAGCAAAA
GGGAAAAACC TTATTATTGT TCAGATGGAG GCTCTTCAGC ATTTTGTTAT CAACAGCAAA
ATGAACGGCA GGGAAATAAC TCCCAACTTA AACAAGCTTG TAAAGGAAAG TCTGTATTTT
GACAATATCT ATGTGCAGGT GGCAGGGGGT AATACGTCGG ATGCCGAGTT TATGACCAAT
ACTTCATTGT ACCCTGCAAA AGAAGGTGCT GCCTATTTTA GATTTGCAAC AAACGAGTAT
AACACCATTC CCAAGGAATT AAAGAAAGAA GGCTATAATT CCTACGCTTT GCATGCATAC
GGACCTGCAT TCTGGAACAG AACCGAAATG TACAAAGCTA TAGGATTTGA TACTTTTATA
AGCTCTAACG ACTATGTTAT GGACGAATAT ATAGGCTGGG GAGGCTGGGC GTTAAGTGAC
GATTCGTTTT TCAGACAGTC TCTGGAGAAA ATTGATGTCA CCAAACCGTT CTATTCATTT
TTCATAACTC TTTCCGGTCA TCATCCTTAT TCCTATTTTG AGGATAAACA AACCTTTGAT
GTCGGAAAAT ATGACAGGAC TTATTTCGGC AACTATATTA AGGCTCAGAA CTATGCTGAT
GCCGCTCTTG GCCGTTTTAT AGAAAGGCTT AAAGAAATGG GTCTTTATGA GAACAGCCTT
ATTGTCATCT ACGGCGACCA TACAGGTCTT CCCAAGACTC AGGCAAAAGA ACTTCTGGAA
TTTTTGGGAG TGGACGACAA CAAGGTTGAC TGGATAAAGC TTCAAAAGAT ACCTTTGCTG
ATACATTGTC CGGGGGTGAA AGGAGAAACC ATTAGCACCA CCGGCGGACA GGTGGATATA
TTCCCGATGA TTGCCAATAT GATGGGATTT GAAAACTATT ATGCGTTGGG CAAAGACCTG
CTGAACACCG AAAAAGGTTA TGCGGTGCTG AGAAACGGTT CAGTGCTCAC GGATGACTAT
TACTACTGCA GTGAGGATGA TACCGTTTAT GATTTGAGAA GCGGTGAGGT TCTTGACAAG
AAGGACTATG AAGATGAGAT ACAAAAATAT CAAAAAGAAC TTCAAATATC CGACATAATT
CTGGAAAAAG ATGCACTGCG GAAGTTGAAA TAA
 
Protein sequence
MKISIERARS NIFTNTRPQL DVFGIIYTVL FVCSIVFKGV FLQFQNQINF KPLFSTTNIF 
MFVASMSFTL VLAALLTVFH TKRRVLFFIS NILMSVLLLS DALYLRYYNT IITIPVIYNA
RYLGPVRESI MSLFRFSDIF YFLDIPVFAV MSFIFSKRAE QNKLPLLKRC VVAAVLMVVA
FGSFKIAYSK NDMSEYDNNY IVKNFGIGYF HYYDVKKYLK ENYLRDKKLR TEEKNELTSF
FEEKNKEKAA LSNRFKGIAK GKNLIIVQME ALQHFVINSK MNGREITPNL NKLVKESLYF
DNIYVQVAGG NTSDAEFMTN TSLYPAKEGA AYFRFATNEY NTIPKELKKE GYNSYALHAY
GPAFWNRTEM YKAIGFDTFI SSNDYVMDEY IGWGGWALSD DSFFRQSLEK IDVTKPFYSF
FITLSGHHPY SYFEDKQTFD VGKYDRTYFG NYIKAQNYAD AALGRFIERL KEMGLYENSL
IVIYGDHTGL PKTQAKELLE FLGVDDNKVD WIKLQKIPLL IHCPGVKGET ISTTGGQVDI
FPMIANMMGF ENYYALGKDL LNTEKGYAVL RNGSVLTDDY YYCSEDDTVY DLRSGEVLDK
KDYEDEIQKY QKELQISDII LEKDALRKLK
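
The protein sequence above corresponds to the gene sequence translated with translation table 11, as listed in the record. The following is a minimal sketch of such a translation and of a GC-content calculation, not the database's own pipeline. The helper names `translate` and `gc_percent` are hypothetical, and the amino-acid string is the standard assignment for NCBI table 11 in TCAG codon order. Whitespace in the 10-base display blocks is stripped before processing.

```python
from itertools import product

BASES = "TCAG"
# Amino-acid assignments for NCBI translation table 11, codons ordered TTT, TTC, TTA, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(dna: str) -> str:
    """Remove spaces and line breaks and upper-case the sequence."""
    return "".join(dna.split()).upper()


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon.

    Codons containing characters other than A, C, G, T raise KeyError.
    """
    seq = clean(dna)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def gc_percent(dna: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = clean(dna)
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


# First display line of the gene sequence from this record
first_line = "ATGAAGATCA GTATAGAAAG AGCAAGAAGC AATATTTTTA CCAATACAAG ACCGCAGCTG"
print(translate(first_line))  # prints: MKISIERARSNIFTNTRPQL
```

The 20 residues produced from the first line match the first two blocks of the protein sequence shown above. Passing the full gene sequence to `gc_percent` gives a value that can be compared with the GC content field.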