Gene EcSMS35_2019 details

Gene Information

Locus tag: EcSMS35_2019
Symbol: nagZ
ID: 6143482
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 2041407
End bp: 2042432
Gene Length: 1026 bp
Protein Length: 341 aa
Translation table: 11
GC content: 53%
IMG OID: 641616895
Product: beta-hexosaminidase
Protein accession: YP_001744071
Protein GI: 170683662
COG category: [G] Carbohydrate transport and metabolism
COG ID: [COG1472] Beta-glucosidase-related glycosidases
TIGRFAM ID:
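
The coordinates and lengths listed above can be cross-checked against each other. The sketch below assumes that Start bp and End bp are 1-based and inclusive (the record does not state the convention) and that the gene length includes the stop codon. The variable names are illustrative only.

```python
# Values copied from the Gene Information fields above.
start_bp = 2041407
end_bp = 2042432
gene_length_bp = 1026
protein_length_aa = 341

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span_bp = end_bp - start_bp + 1

# A CDS is a whole number of codons; the final codon is the stop and encodes no residue.
total_codons = span_bp // 3
coding_codons = total_codons - 1

print(span_bp, total_codons, coding_codons)
```

Under these assumptions the span equals the stated gene length, and the codon count minus the stop codon equals the stated protein length.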


Plasmid Coverage information

Num covering plasmid clones: 19
Plasmid unclonability p-value: 0.760849
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 23
Fosmid unclonability p-value: 0.0000464707
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
GTGGGTCCAG TAATGTTGGA TGTCGAAGGT TACGAACTGG ACGCGGAAGA GCGTGAAATA 
CTGGCGCATC CGCTGGTGGG AGGGCTGATT CTCTTTACGC GTAACTATCA TGATCCTGCC
CAGTTACGTG AACTGGTGCG CCAGATCCGC GCAGCTTCGC GCAATCATCT GGTGGTGGCG
GTTGATCAGG AAGGTGGACG CGTGCAGCGT TTTCGTGAAG GTTTTACCCG CTTGCCAGCG
GCACAATCAT TTGCTGCGCT GTTGGGAATG GAAGAAGGCG GAAAACTGGC GCAAGAGGCA
GGTTGGTTGA TGGCCAGCGA AATGATTGCT ATGGATATTG ATATCAGCTT TGCGCCAGTG
CTGGACGTCG GGCATATCAG CGCGGCGATT GGCGAGCGTT CTTATCATGC CGATCCCGAA
AAAGCCCTGG CAATTGCCAG CCGGTTTATT GATGGTATGC ATGAAGCCGG AATGAAAACT
ACCGGGAAAC ACTTCCCAGG TCACGGTGCA GTAACGGCAG ACTCACACAA AGAAACACCG
TGCGACCCAC GTCCGCAAGC GGAGATTCGC GCTAAAGATA TGTCGGTCTT CAGTTCCTTA
ATCCGCGAAA ATAAACTCGA CGCCATTATG CCTGCGCATG TGATCTACAG TGATGTTGAT
CCGCGTCCGG CGAGCGGTTC TCCCTACTGG CTGAAAACCG TTTTGCGTCA GGAATTGGGT
TTTGACGGCG TGATTTTCTC TGACGATTTA TCGATGGAAG GCGCAGCGAT TATGGGCAGT
TATGCCGAGC GTGGACAGGC ATCACTGGAT GCAGGTTGCG ATATGATCCT GGTCTGCAAT
AATCGTAAAG GGGCCGTCAG CGTGTTAGAT AATCTGTCAC CGATCAAGGC AGAACGTGTT
ACACGTTTGT ATCATAAAGG TTCATTTTCG CGACAGGAAC TGATGGACTC GGCTCGCTGG
AAAGCGATCA GCGCCCGTCT GAATCAGTTA CATGAACGCT GGCAGGAAGA GAAGGCAGGT
CATTAA
 
Protein sequence
MGPVMLDVEG YELDAEEREI LAHPLVGGLI LFTRNYHDPA QLRELVRQIR AASRNHLVVA 
VDQEGGRVQR FREGFTRLPA AQSFAALLGM EEGGKLAQEA GWLMASEMIA MDIDISFAPV
LDVGHISAAI GERSYHADPE KALAIASRFI DGMHEAGMKT TGKHFPGHGA VTADSHKETP
CDPRPQAEIR AKDMSVFSSL IRENKLDAIM PAHVIYSDVD PRPASGSPYW LKTVLRQELG
FDGVIFSDDL SMEGAAIMGS YAERGQASLD AGCDMILVCN NRKGAVSVLD NLSPIKAERV
TRLYHKGSFS RQELMDSARW KAISARLNQL HERWQEEKAG H
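
The gene sequence begins with GTG while the protein begins with M. This is consistent with GTG acting as an alternative initiation codon under translation table 11, where the initiating codon is read as methionine. The sketch below illustrates this on the first 30 and last 18 nucleotides of the gene sequence above. It is a minimal illustration, not the pipeline that produced this record: the `translate` helper and the `START_CODONS` set are hypothetical, and `START_CODONS` holds only a subset of the start codons permitted by table 11.

```python
from itertools import product

# Standard codon-to-amino-acid assignments in the conventional TCAG ordering;
# table 11 uses the same assignments and differs in which codons may initiate.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: an illustrative subset of the table 11 initiation codons.
START_CODONS = {"ATG", "GTG", "TTG"}


def translate(dna, first_is_start=True):
    """Translate a codon-aligned DNA string, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if i == 0 and first_is_start and codon in START_CODONS:
            aa = "M"  # an initiating codon is read as methionine
        else:
            aa = CODON_TABLE[codon]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 nt and last 18 nt of the gene sequence listed above.
head = "GTGGGTCCAG TAATGTTGGA TGTCGAAGGT"
tail = "GAGAAGGCAGGTCATTAA"

print(translate(head))                        # start of the protein
print(translate(tail, first_is_start=False))  # end of the protein
```

The two outputs correspond to the first ten and last five residues of the protein sequence listed above. Read as an internal codon, GTG would instead give valine.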