Gene EcSMS35_3230 details

Gene Information

Locus tag: EcSMS35_3230
Symbol: neuC
ID: 6147173
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 3302196
End bp: 3303371
Gene Length: 1176 bp
Protein Length: 391 aa
Translation table: 11
GC content: 32%
IMG OID: 641618060
Product: polysialic acid capsule biosynthesis protein NeuC
Protein accession: YP_001745210
Protein GI: 170683594
COG category: [M] Cell wall/membrane/envelope biogenesis
COG ID: [COG0381] UDP-N-acetylglucosamine 2-epimerase
TIGRFAM ID: [TIGR03568] UDP-N-acetyl-D-glucosamine 2-epimerase, UDP-hydrolysing
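
The start, end, gene length, and protein length fields above are arithmetically related, and the following minimal sketch shows one way to cross-check them. It assumes that the coordinates are 1-based and inclusive and that the CDS is unspliced and ends in a single stop codon; both assumptions are consistent with the values in this record, but the record does not state them. The function names `gene_length` and `protein_length` are hypothetical helpers introduced only for illustration.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Feature length in bp, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(cds_length_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming one terminal stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # Every codon encodes one residue except the final stop codon.
    return cds_length_bp // 3 - 1


# Values taken from the record above.
length = gene_length(3302196, 3303371)
print(length, protein_length(length))  # prints "1176 391"
```

Under these assumptions the computed values agree with the Gene Length (1176 bp) and Protein Length (391 aa) fields.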


Plasmid Coverage information

Num covering plasmid clones: 30
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 51
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAAAAAAA TATTATACGT AACTGGATCT AGAGCTGAAT ATGGAATAGT TCGGAGACTT 
TTGACAATGC TAAGAGAAAC TCCAGAAATA CAGCTTGATT TGGCAGTTAC AGGAATGCAT
TGTGATAATG CGTATGGAAA TACAATACAT ATTATAGAAC AAGATAATTT TAATATTATC
AAGGTTGTGG ATATAAATAT CAATACAACT TCACATACTC ACATTCTCCA TTCAATGAGT
GTTTGCCTCA ATTCGTTTGG TGATTTTTTT TCAAATAACA CATATGATGC GGTTATGGTT
TTAGGCGATA GATATGAAAT ATTTTCAGTC GCTATCGCAG CATCAATGCA TAATATTCCA
TTAATTCATA TTCATGGTGG TGAAAAGACA TTAGCTAATT ATGATGAGTT TATTAGGCAT
TCAATTACTA AAATGAGTAA ACTCCATCTT ACTTCTACAG AAGAGTATAA AAAACGAGTA
ATTCAACTAG GTGAAAAGCC TGGTAGTGTG TTTAATATTG GTTCTCTTGG TGCAGAAAAT
GCTCTTTCAT TGCATTTACC AAATAAGCAG GAGTTGGAAC TAAAATATGG TTCACTGTTA
AAACGGTACT TTGTTGTAGT ATTCCATCCT GAAACACTTT CCACGCAGTC GGTTAATGAT
CAAATAGATG AGTTATTGTC AGCGATTTCT TTTTTTAAAA ATACTCACGA CTTTATTTTT
ATTGGCAGTA ACGCTGACAC TGGTTCTGAT ATAATTCAGA GAAAAGTAAA ATATTTTTGC
AAAGAGTATA AGTTCAGATA TTTGATTTCT ATTCGTTCAG AAGATTATTT GGCAATGATT
AAATGCTCTT GTGGGCTAAT TGGGAACTCC TCCTCTGGTT TAATTGAGGT TCCATCTTTA
AAAGTTGCAA CAATTAACAT TGGTGATAGG CAGAAAGGCC GTGTTCGTGG AGCCAGTGTA
ATAGATGTAC CCGTTGAAAA AAATGCAATC GTCAGAGGGA TAAATATATC TCAAGATGAA
AAATTTATTA GTGTTGTACA GTCATCTAGT AATCCTTATT TTAAAGAAAA TGCTTTAATT
AATGCTGTTA GAATTATTAA GGATTTTATT AAATCAAAAA ATAAAGATTA CAAAGATTTT
TATGACATCC CGGAATGTAC CACCAGTTAT GACTAG
 
Protein sequence
MKKILYVTGS RAEYGIVRRL LTMLRETPEI QLDLAVTGMH CDNAYGNTIH IIEQDNFNII 
KVVDININTT SHTHILHSMS VCLNSFGDFF SNNTYDAVMV LGDRYEIFSV AIAASMHNIP
LIHIHGGEKT LANYDEFIRH SITKMSKLHL TSTEEYKKRV IQLGEKPGSV FNIGSLGAEN
ALSLHLPNKQ ELELKYGSLL KRYFVVVFHP ETLSTQSVND QIDELLSAIS FFKNTHDFIF
IGSNADTGSD IIQRKVKYFC KEYKFRYLIS IRSEDYLAMI KCSCGLIGNS SSGLIEVPSL
KVATINIGDR QKGRVRGASV IDVPVEKNAI VRGINISQDE KFISVVQSSS NPYFKENALI
NAVRIIKDFI KSKNKDYKDF YDIPECTTSY D