Gene EcSMS35_0375 details


Gene Information

Locus tag: EcSMS35_0375
Symbol: lacZ
ID: 6144942
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 387449
End bp: 390523
Gene Length: 3075 bp
Protein Length: 1024 aa
Translation table: 11
GC content: 56%
IMG OID: 641615271
Product: beta-D-galactosidase
Protein accession: YP_001742478
Protein GI: 170681400
COG category: [G] Carbohydrate transport and metabolism
COG ID: [COG3250] Beta-galactosidase/beta-glucuronidase
TIGRFAM ID:
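
The coordinate and length fields above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the gene length counts the terminal stop codon; neither convention is stated in the record, and the function names are hypothetical.

```python
# Values copied from the record above.
START_BP = 387449
END_BP = 390523
GENE_LENGTH_BP = 3075
PROTEIN_LENGTH_AA = 1024


def span_length(start: int, end: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(cds_length_bp: int) -> int:
    """Residues encoded by an unspliced CDS, assuming its last codon is a stop."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


# Both comparisons hold for the values in this record.
print(span_length(START_BP, END_BP) == GENE_LENGTH_BP)
print(expected_protein_length(GENE_LENGTH_BP) == PROTEIN_LENGTH_AA)
```

Under these assumptions, 390523 - 387449 + 1 gives 3075 bp, and 3075 / 3 gives 1025 codons, one of which is the stop codon, leaving 1024 residues.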


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.00000167859
Plasmid hitchhiking: No
Plasmid clonability: unclonable
 

Fosmid Coverage information

Num covering fosmid clones: 49
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGACTATGA TTACGGATTC ACTGGCCGTC GTATTACAAC GTCGTGACTG GGAAAACCCT 
GGCGTTACCC AACTTAATCG CCTTGCGGCA CATCCCCATT TCGCCAGCTG GCGAAATAGC
GAAGAGGCCC GCACCGATCG CCCTTCCCAA CAGTTGCGCA GCCTGAATGG CGAATGGCGC
TTTGCCTGGT TTCCGGCACC AGAAGCGGTG CCGGAAAGCT GGCTGGATTG CGATCTTCCT
GACGCCGATA CTGTCGTCGT CCCCTCAAAC TGGCAGATGC ACGGTTACGA TGCGCCCATC
TACACCAACG TGACCTATCC CATTACGGTC AATCCGCCTT TTGTTCCCGC GGAGAATCCG
ACGGGTTGTT ACTCGCTCAC ATTTAATATT GATGAATGCT GGCTACAGAA AGGCCAGACG
CGAATTATTT TTGATGGTGT TAACTCGGCG TTTCATCTGT GGTGCAACGG GCGCTGGGTC
GGTTACGGCC AGGACAGCCG TTTGCCGTCT GAATTTGACC TGAGCGCATT TTTACGCGCC
GGAAAAAACC GCCTCGCGGT GATGGTGCTG CGCTGGAGTG ACGGCAGTTA TCTGGAAGAT
CAGGATATGT GGCGGATGAG CGGCATTTTC CGTGACGTCT CGTTGCTGCA CAAACCGACC
ACACAAATCA GCGATTTCCA TGTTGCCACT CGCTTTAATG ATGATTTCAG CCGCGCGGTA
CTGGAGGCAG AAGTTCAGAT GTGCGGCGAA CTGCGCGATG AGCTGCGGGT GACGGTTTCT
TTGTGGCAGG GTGAAACGCA GGTCGCCAGC GGCACCACGC CTTTCGGCGG TGAAATTATC
GATGAGCGTG GTGGTTATGC CGATCGCGTC ACGCTACGTC TGAACGTCGA AAACCCGGCG
CTGTGGAGCG CCGAAATCCC GAATCTCTAT CGTGCGGTGG TTGAACTGCA CACCGCCGAC
GGCACGCTGA TTGAAGCAGA AGCCTGCGAT GTCGGTTTCC GCGAAGTGCG GATTGAAAAT
GGTCTCCTGC TGCTGAACGG CAAGCCGGTG CTGATTCGCG GCGTTAACCG TCACGAGCAT
CATCCTCTGC ATGGTCAGGT CATGGATGAG CAGACGATGG TGCAGGATAT CCTGCTGATG
AAGCAGAACA ACTTTAACGC CGTGCGCTGT TCGCATTATC CGAATCATCC GCTGTGGTAC
ACGTTGTGCG ACCGCTACGG CCTGTATGTG GTGGATGAAG CCAATATTGA AACCCACGGC
ATGGTGCCAA TGAATCGTCT GACCGATGAT CCGCGCTGGC TACCAGCGAT GAGCGAACGC
GTAACGCGAA TGGTGCAGCG CGATCGTAAT CACCCGAGTG TGATCATCTG GTCGCTGGGG
AATGAATCAG GCCACGGCGC TAATCACGAC GCGCTGTATC GCTGGATCAA ATCTGTCGAT
CCTTCCCGCC CGGTGCAGTA TGAAGGCGGC GGAGCCGACA CCACGGCCAC CGATATTATT
TGCCCGATGT ACGCGCGCGT GGATGAAGAC CAGCCCTTCC CGGCTGTGCC GAAATGGTCC
ATCAAAAAAT GGCTTTCGCT GCCTGGAGAA CTGCGCCCGC TGATCCTTTG CGAATATGCC
CACGCGATGG GTAACAGTCT TGGCGGCTTC GCTAAATACT GGCAGGCATT TCGTCAGTAC
CCCCGTTTAC AGGGCGGCTT CGTCTGGGAC TGGGTGGATC AGTCGCTGAT TAAATATGAT
GAAAACGGCA ACCCGTGGTC GGCTTACGGC GGTGATTTTG GCGATACGCC GAACGATCGC
CAGTTCTGCA TGAACGGTCT GGTATTTGCC GACCGCACGC CGCATCCGGC GCTGACGGAA
GCAAAACACC AGCAGCAGTT TTTCCAGTTC CGTTTATCCG GGCGAACCAT CGAAGTGACC
AGCGAATACC TGTTCCGTCA TAGCGATAAC GAGCTCCTGC ACTGGTCGGT GGCACTGGAT
GGCAAGCCGC TGGCAAGCGG TGAAATGCCT CTGGATGTTG CTCCACAAGA TAAACAGTTG
ATTGAACTGC CTGAACTACC GCAGCCGGAA AGCACCGGAC AACTCTGGCT TACGGTACAC
GTAGTGCAAC CGAACGCGAC CGCATGGTCA GAAGCCGGAC ACATTAGCGC CTGGCAGCAG
TGGCGTCTGG CGGAAAACCT CAGTGTGGCA CTCCCCTCCG CGCCCCACGC CATCCCGCAA
CTGACCACCA GCGAAATGGA TTTTTGCATC GAGCTGGGTA ATAAGCGTTG GCAATTTAAC
CGCCAGTCAG GCTTTCTTTC ACAGATGTGG ATTGGCGATG AAAAACAACT GCTGACGCCG
CTGCGCGATC AGTTCATCCG CGCACCGCTG GATAACGACA TTGGCGTAAG TGAAGCGACC
CGCATTGACC CTAACGCCTG GGTCGAACGC TGGAAGGCGG CGGGCCATTA CCAGGCCGAA
GTGGCGTTGT TGCAGTGCAC GGCAGATATA CTTGCCGACG CGGTGCTGAT TACGACCGCT
CACGCGTGGC AGCATCAGGG GAAAACCTTA TTTATCAGCC GGAAAACCTA CCGGATTGAT
GGTAGTGGTC AAATGGCGAT TACCGTTGAT GTTGAAGTGG CGAGCGATAC ACCGCATCCG
GCGCGGATTG GCCTGACCTG CCAGCTGGCG CAGGTCGCAG AGCGGGTAAA CTGGCTCGGA
TTAGGGCCGC AAGAAAACTA TCCCGACCGC CTTACTGCGG CCTGTTTTGA CCGCTGGGAT
CTGCCATTGT CAGACATGTA TACCCCTTAC GTCTTCCCGA GCGAAAACGG TCTGCGCTGC
GGGACGCGCG AATTGAATTA TGGCCCACAC CAGTGGCGCG GCGACTTCCA GTTCAATATC
AGTCGCTACA GCCAACAACA ACTGATGGAA ACCAGCCATC GCCATCTGCT GCACGCGGAA
GAAGGCACAT GGCTGAATAT CGACGGTTTC CATATGGGGA TTGGTGGCGA CGACTCCTGG
AGCCCGTCAG TGTCGGCGGA ATTCCAGCTG AGCGCCGGTC GCTACCATTA CCAGTTGGTC
TGGTGTCAAA AATAA
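
The gene sequence is listed in blocks of ten bases, so the spaces and line breaks have to be removed before any calculation. A minimal sketch of how a GC percentage such as the one reported in the record could be computed from this listing; the helper names are hypothetical, and how the database rounds its figure is not stated.

```python
def clean_sequence(text: str) -> str:
    """Remove spaces and line breaks from a blocked sequence listing."""
    return "".join(text.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in an already cleaned sequence."""
    if not seq:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in seq if base in "GC")
    return 100.0 * gc_count / len(seq)


# First line of the listing above; paste the full listing to check the whole gene.
first_line = "ATGACTATGA TTACGGATTC ACTGGCCGTC GTATTACAAC GTCGTGACTG GGAAAACCCT"
seq = clean_sequence(first_line)
print(len(seq), round(gc_percent(seq), 1))
```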
 
Protein sequence
MTMITDSLAV VLQRRDWENP GVTQLNRLAA HPHFASWRNS EEARTDRPSQ QLRSLNGEWR 
FAWFPAPEAV PESWLDCDLP DADTVVVPSN WQMHGYDAPI YTNVTYPITV NPPFVPAENP
TGCYSLTFNI DECWLQKGQT RIIFDGVNSA FHLWCNGRWV GYGQDSRLPS EFDLSAFLRA
GKNRLAVMVL RWSDGSYLED QDMWRMSGIF RDVSLLHKPT TQISDFHVAT RFNDDFSRAV
LEAEVQMCGE LRDELRVTVS LWQGETQVAS GTTPFGGEII DERGGYADRV TLRLNVENPA
LWSAEIPNLY RAVVELHTAD GTLIEAEACD VGFREVRIEN GLLLLNGKPV LIRGVNRHEH
HPLHGQVMDE QTMVQDILLM KQNNFNAVRC SHYPNHPLWY TLCDRYGLYV VDEANIETHG
MVPMNRLTDD PRWLPAMSER VTRMVQRDRN HPSVIIWSLG NESGHGANHD ALYRWIKSVD
PSRPVQYEGG GADTTATDII CPMYARVDED QPFPAVPKWS IKKWLSLPGE LRPLILCEYA
HAMGNSLGGF AKYWQAFRQY PRLQGGFVWD WVDQSLIKYD ENGNPWSAYG GDFGDTPNDR
QFCMNGLVFA DRTPHPALTE AKHQQQFFQF RLSGRTIEVT SEYLFRHSDN ELLHWSVALD
GKPLASGEMP LDVAPQDKQL IELPELPQPE STGQLWLTVH VVQPNATAWS EAGHISAWQQ
WRLAENLSVA LPSAPHAIPQ LTTSEMDFCI ELGNKRWQFN RQSGFLSQMW IGDEKQLLTP
LRDQFIRAPL DNDIGVSEAT RIDPNAWVER WKAAGHYQAE VALLQCTADI LADAVLITTA
HAWQHQGKTL FISRKTYRID GSGQMAITVD VEVASDTPHP ARIGLTCQLA QVAERVNWLG
LGPQENYPDR LTAACFDRWD LPLSDMYTPY VFPSENGLRC GTRELNYGPH QWRGDFQFNI
SRYSQQQLME TSHRHLLHAE EGTWLNIDGF HMGIGGDDSW SPSVSAEFQL SAGRYHYQLV
WCQK
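
The protein sequence can be reproduced from the gene sequence with translation table 11. The following is a minimal sketch, assuming the standard NCBI codon assignments (which table 11 shares with table 1 for internal codons) and ignoring table 11's alternative start codons, which do not matter here because the gene begins with ATG. The `translate` helper is hypothetical.

```python
from itertools import product

# Codon table in the conventional T, C, A, G ordering (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(blocked_dna: str) -> str:
    """Translate a blocked DNA listing, dropping a single trailing stop codon."""
    seq = "".join(blocked_dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))
    return protein[:-1] if protein.endswith("*") else protein


# First and last lines of the gene sequence listed above.
print(translate("ATGACTATGA TTACGGATTC ACTGGCCGTC GTATTACAAC GTCGTGACTG GGAAAACCCT"))
# MTMITDSLAVVLQRRDWENP
print(translate("TGGTGTCAAA AATAA"))
# WCQK
```

The two outputs match the first twenty and last four residues of the protein sequence. The last line of the gene listing is only in frame because the lines before it contain a whole number of codons (3060 bases).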