Gene EcSMS35_3650 details

Gene Information

Locus tag: EcSMS35_3650
Symbol: cysG
ID: 6143292
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 3709956
End bp: 3711329
Gene Length: 1374 bp
Protein Length: 457 aa
Translation table: 11
GC content: 54%
IMG OID: 641618477
Product: siroheme synthase
Protein accession: YP_001745617
Protein GI: 170680692
COG category: [H] Coenzyme transport and metabolism
COG ID: [COG0007] Uroporphyrinogen-III methylase
        [COG1648] Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain)
TIGRFAM ID: [TIGR01469] uroporphyrin-III C-methyltransferase
            [TIGR01470] siroheme synthase, N-terminal domain
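
The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive, and that the final codon is a stop codon that does not contribute a residue; both are assumptions about how the record was generated, and the variable names are illustrative only.

```python
# Values copied from the record above.
start_bp = 3709956
end_bp = 3711329

# Assumption: coordinates are 1-based and inclusive.
gene_length = end_bp - start_bp + 1

# A coding sequence should be a whole number of codons.
codons, remainder = divmod(gene_length, 3)

# Assumption: the last codon is a stop codon and is not translated.
protein_length = codons - 1

print(gene_length, remainder, protein_length)  # 1374 0 457
```

Under these assumptions the result agrees with the reported gene length (1374 bp) and protein length (457 aa).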


Plasmid Coverage information

Num covering plasmid clones: 20
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 23
Fosmid unclonability p-value: 0.0000355794
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
GTGGATCATT TGCCTATATT TTGCCAATTA CGCGATCGCG ACTGTCTGAT TGTCGGCGGT 
GGTGATGTCG CGGAACGCAA AGCAAGGTTG CTGTTAGACG CAGGCGCTCG CTTAACGGTG
AATGCATTAG CGTTTATTCC ACAGTTCACC GCATGGGCAG ATGCAGGCAT GTTAACCCTC
GTCGAAGGGC CATTTGATGA AAGCCTTCTC GACACCTGCT GGCTGGCGAT TGCAGCGACA
GATGATGACG CGCTTAACCA GCGCGTCAGT GAAGCCGCTG AAGCTCGTCG CATCTTCTGT
AACGTGGTCG ATGCGCCGAA AGCCGCCAGC TTTATTATGC CGTCAATTAT TGACCGCTCA
CCGCTCATGG TCGCGGTCTC CTCTGGCGGC ACCTCTCCGG TTCTGGCACG CCTGTTGCGT
GAAAAACTTG AATCACTGCT GCCGTTGCAT CTGGGCCAGG TAGCGAAATA CGCCGGGCAA
TTACGCGGTC GAGTGAAACA ACAGTTCGCC ACAATGGGTG AACGTCGCCG TTTCTGGGAA
AAACTGTTCG TTAATGACCG TCTGGCGCAG TCGCTGGCAA ACAACGATCA GAAAGCCATT
ACTGAAACGA CCGAACAATT AATCAACGAA CCGCTCGACC ATCGCGGTGA AGTGGTGCTG
GTTGGCGCAG GTCCGGGCGA TGCCGGGCTG CTGACACTGA AAGGACTGCA ACAAATTCAG
CAGGCAGATG TGGTGGTCTA CGACCGTCTG GTTTCTGACG ATATTATGAA TCTGGTACGC
CGCGATGCTG ATCGCGTTTT CGTCGGCAAA CGCGCGGGAT ACCACTGCGT ACCGCAGGAA
GAGATTAACC AGATCCTGCT GCGGGAAGCG CAAAAAGGCA AACGCGTGGT GCGACTGAAA
GGCGGCGATC CGTTTATTTT TGGCCGTGGT GGCGAAGAGC TGGAAACACT GTGCAATGCA
GGCATTCCGT TCTCGGTGGT TCCGGGTATT ACCGCAGCTT CTGGTTGCTC TGCCTATTCG
GGTATTCCGC TCACGCATCG CGATTATGCC CAGAGCGTAC GCTTAATTAC CGGACACTTA
AAAACCGGTG GCGAACTGGA CTGGGAAAAC CTGGCGGCAG AAAAACAGAC GCTGGTGTTC
TATATGGGGT TGAATCAGGC CGCGACTATT CAGCAAAAGC TGATTGAACA CGGTATGCCT
GGCGAAATGC CGGTGGCAAT TGTCGAAAAC GGAACGGCAG TCACGCAGCG CGTGATTGAC
GGTACGCTCA CGCAGCTGGG CGAACTGGCG CAGCAAATGA ACAGTCCATC GCTAATTATT
ATTGGTCGGG TTGTTGGCCT GCGCGATAAA TTGAACTGGT TCTCTAACCA TTAA
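
The gene sequence is displayed in space-separated blocks of 10 bases, so it has to be cleaned before any computation. The following is a minimal sketch of how a GC percentage such as the one reported above could be computed; the helper names `clean_sequence` and `gc_percent` are hypothetical, and the exact method used by the source database is not stated in the record.

```python
def clean_sequence(raw):
    """Strip the spaces and line breaks used for 10-base display blocks."""
    return "".join(raw.split()).upper()


def gc_percent(raw):
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = clean_sequence(raw)
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(seq.count(base) for base in "GC")
    return 100.0 * gc / len(seq)


# First display line of the gene sequence, used only as a small example.
first_line = "GTGGATCATT TGCCTATATT TTGCCAATTA CGCGATCGCG ACTGTCTGAT TGTCGGCGGT"
print(len(clean_sequence(first_line)))     # number of bases on that line
print(round(gc_percent(first_line), 1))    # GC percentage of that line only
```

Passing the full gene sequence to `gc_percent` gives the value for the whole gene, which can then be compared against the GC content field.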
 
Protein sequence
MDHLPIFCQL RDRDCLIVGG GDVAERKARL LLDAGARLTV NALAFIPQFT AWADAGMLTL 
VEGPFDESLL DTCWLAIAAT DDDALNQRVS EAAEARRIFC NVVDAPKAAS FIMPSIIDRS
PLMVAVSSGG TSPVLARLLR EKLESLLPLH LGQVAKYAGQ LRGRVKQQFA TMGERRRFWE
KLFVNDRLAQ SLANNDQKAI TETTEQLINE PLDHRGEVVL VGAGPGDAGL LTLKGLQQIQ
QADVVVYDRL VSDDIMNLVR RDADRVFVGK RAGYHCVPQE EINQILLREA QKGKRVVRLK
GGDPFIFGRG GEELETLCNA GIPFSVVPGI TAASGCSAYS GIPLTHRDYA QSVRLITGHL
KTGGELDWEN LAAEKQTLVF YMGLNQAATI QQKLIEHGMP GEMPVAIVEN GTAVTQRVID
GTLTQLGELA QQMNSPSLII IGRVVGLRDK LNWFSNH
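
The relationship between the gene and protein sequences can be illustrated with a small translation sketch. It assumes NCBI translation table 11 (as listed in the record), whose codon assignments match the standard code, and it assumes that an alternative start codon such as GTG is read as methionine at the first position. The `translate` function is illustrative and is not the tool used to produce this record.

```python
from itertools import product

# Codon table in NCBI order (bases T, C, A, G); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Start codons listed for translation table 11.
START_CODONS = {"TTG", "CTG", "ATT", "ATC", "ATA", "ATG", "GTG"}


def translate(raw):
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = "".join(raw.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        if i == 0 and codon in START_CODONS:
            aa = "M"  # assumption: an initiator codon is read as methionine
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the gene sequence above.
print(translate("GTGGATCATT TGCCTATATT TTGCCAATTA"))  # MDHLPIFCQL
```

The output matches the first ten residues of the protein sequence, including the initial M encoded by GTG.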