Gene EcSMS35_3801 details

Gene Information

Locus tag: EcSMS35_3801
Symbol: hmuS
ID: 6145970
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 3866897
End bp: 3867925
Gene Length: 1029 bp
Protein Length: 342 aa
Translation table: 11
GC content: 50%
IMG OID: 641618627
Product: hemin transport protein HmuS
Protein accession: YP_001745767
Protein GI: 170682449
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3720] Putative heme degradation protein
TIGRFAM ID:
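
The start, end, gene length, and protein length fields are related by simple arithmetic. The following is a minimal sketch of that relationship, assuming the coordinates are 1-based and inclusive and that the CDS ends in a stop codon that is not counted in the protein length. The function names are hypothetical and are not part of the source database.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int, has_stop_codon: bool = True) -> int:
    """Number of amino acids encoded by a CDS of the given length.

    Assumes an unspliced CDS whose length is a multiple of 3; the final
    stop codon, if present, does not contribute an amino acid.
    """
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = gene_len_bp // 3
    return codons - 1 if has_stop_codon else codons


if __name__ == "__main__":
    n = gene_length(3866897, 3867925)
    print(n, protein_length(n))  # 1029 342
```

With the values listed for this record, the result agrees with the stated 1029 bp and 342 aa.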


Plasmid Coverage information

Num covering plasmid clones: 29
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 40
Fosmid unclonability p-value: 0.189206
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAACCACT ACACACGCTG GCTTGAGTTA AAAGAACAAA ATCCCGGAAA GTACGCGCGT 
GACATCGCAG GTTTAATGAA TATCAGCGAA GCAGAACTGG CATTTGCGCG CGTCACGCAC
GACGCGTGGC GGATGCGCGG CGATATCCGT GACATTCTGG CGGCGCTCGA AAGTGTTGGC
GAGACCAAAT GTATTTGCCG TAATGAATAT GCAGTCCATG AGCAAGTTGG TGCGTTCACA
AACCAGCATT TGAACGGACA TGCCGGATTG ATCCTCAATC CGCGCGCGCT GGATTTACGT
CTGTTTCTCA ATCAATGGGC CAGTGTTTTC CACATCAAAG AAAACACGGC TCGTGGCGAA
CGCCAGAGTA TTCAGTTCTT TGATCATCAG GGCGATGCAT TACTAAAAGT TTATGCCACC
GACAATACCG ATATGGCGGC ATGGAGTGAG CTTCTGGCAC GGTTTATCAC CGATGAGAAT
ACGCCGCTTG AGTTAAAAGC CGTTGATGCG CCAGTTGTTC AAACGCGAGC CGATGCCAGT
GTGGTCGAGC AAGAGTGGCG GGCGATGACC GACGTTCATC AGTTTTTTAC GTTGCTCAAG
CGCCACAACC TGACGCGCCA ACAGGCGTTC AATCTGGTGG CAGACGATTT GGCCTGCAAA
GTATCCAACA GTGCGTTGGC GCAAATTCTT GAATCTGCAC AGCAGGATGG CAATGAAATC
ATGGTGTTTG TTGGCAACCG TGGCTGCGTA CAGATTTTCA CTGGCGTGGT AGAAAAAGTG
GTGCCAATGA AAGGTTGGCT GAATATTTTT AACCCGACGT TTACTCTTCA TCTATTAGAA
GAGAGCATTG CTGAAACCTG GGTAACCCGT AAACCTGCTA GTGACGGTTA TGTGACCAGT
CTGGAATTGT TTGCCCATGA TGGTACGCAG ATAGCGCAAC TTTATGGTCA ACGTACAGAA
GGCGAACAGG AGCAAGCGCA ATGGCGTAAG CAAATTGCTT CGCTGATACC GGAAGGCGTT
ACTGCATAA
 
Protein sequence
MNHYTRWLEL KEQNPGKYAR DIAGLMNISE AELAFARVTH DAWRMRGDIR DILAALESVG 
ETKCICRNEY AVHEQVGAFT NQHLNGHAGL ILNPRALDLR LFLNQWASVF HIKENTARGE
RQSIQFFDHQ GDALLKVYAT DNTDMAAWSE LLARFITDEN TPLELKAVDA PVVQTRADAS
VVEQEWRAMT DVHQFFTLLK RHNLTRQQAF NLVADDLACK VSNSALAQIL ESAQQDGNEI
MVFVGNRGCV QIFTGVVEKV VPMKGWLNIF NPTFTLHLLE ESIAETWVTR KPASDGYVTS
LELFAHDGTQ IAQLYGQRTE GEQEQAQWRK QIASLIPEGV TA
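
The sequences above are printed in space-separated blocks of 10 characters. The following is a rough sketch of how such a listing could be cleaned up and checked against the record's GC content and protein sequence. It assumes the amino-acid assignments of NCBI translation table 11, which match the standard code; the alternative start codons that table 11 allows are not handled, which does not matter here because this gene starts with ATG. All function names are hypothetical.

```python
from itertools import product

# Codon table in the conventional T, C, A, G base order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean_sequence(text: str) -> str:
    """Remove the spaces and line breaks used to print the sequence in blocks."""
    return "".join(text.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First line of the gene sequence listing, shortened to 30 bases.
    first_blocks = clean_sequence("ATGAACCACT ACACACGCTG GCTTGAGTTA")
    print(translate(first_blocks))  # MNHYTRWLEL
```

Passing the full gene sequence through `clean_sequence` and `translate` should reproduce the listed protein sequence if the assumptions above hold.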