Gene EcSMS35_1018 details


Gene Information

Locus tag: EcSMS35_1018
Symbol: wcaL
ID: 6142712
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 1038233
End bp: 1039453
Gene Length: 1221 bp
Protein Length: 406 aa
Translation table: 11
GC content: 55%
IMG OID: 641615905
Product: colanic acid biosynthesis glycosyl transferase WcaL
Protein accession: YP_001743097
Protein GI: 170682752
COG category: [M] Cell wall/membrane/envelope biogenesis
COG ID: [COG0438] Glycosyltransferase
TIGRFAM ID:
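
The coordinates and lengths in the table above can be cross-checked with simple arithmetic. The sketch below assumes that the start and end positions are 1-based and inclusive, and that the reported gene length includes the terminal stop codon; neither convention is stated in the record, and the variable names are illustrative only.

```python
# Values copied from the Gene Information table.
start_bp = 1038233
end_bp = 1039453
gene_length_bp = 1221
protein_length_aa = 406

# Assumption: coordinates are 1-based and inclusive, so the span
# is end - start + 1.
span_bp = end_bp - start_bp + 1

# Assumption: the CDS includes its stop codon, so the protein has
# one residue fewer than the number of codons.
codon_count = gene_length_bp // 3
expected_protein_length = codon_count - 1

print("span matches gene length:", span_bp == gene_length_bp)
print("length is a multiple of 3:", gene_length_bp % 3 == 0)
print("protein length matches:", expected_protein_length == protein_length_aa)
```

Under these assumptions all three checks hold for this record: 1039453 - 1038233 + 1 = 1221 bp, and 1221 / 3 - 1 = 406 aa.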


Plasmid Coverage information

Num covering plasmid clones: 21
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 58
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAAGGTCG GCTTCTTTTT ACTGAAATTT CCGCTGTCGT CGGAAACCTT CGTCCTCAAT 
CAAATTACCG CGTTTATTGA TATGGGCTTT GAGGTGGAGA TTGTCGCGCT GCAAAAAGGC
GACACCCAGA ACACCCACGC GGCATGGACG AAATACAACC TTGCTGCCAG AACCCGCTGG
TTACAGGACG AACCTACGGG CAAAGTGGCG AAACTGCGCC ACCGCGCCAG CCAGACGTTA
CGCGGCATTC ATCGTAAAAA TACCTGGCAG GCGCTCAACC TCAAACGCTA TGGCGCCGAG
TCGCGGAACC TGATTTTGTC TGCCATTTGT GGTCAGGTCG CAACACCGTT TCGCGCCGAT
GTGTTCATCG CTCATTTTGG CCCTGCGGGG GTAACCGCAG CAAAACTACG CGAACTGGGT
GTCATTCGCG GCAAAATTGC CACTATCTTC CACGGTATTG ATATCTCCAG TCGGGACGTG
CTCAACCACT ACACTCCCGA ATATCAACAA CTGTTTCGCC GTGGCGACCT GATGTTACCG
ATAAGCGATT TGTGGGCCGG AAGGCTGCAA AAAATGGGCT GTCCGAGGGA AAAAATCGCC
GTATCGCGTA TGGGCGTGGA CATGACGCGC TTTAGCCCGC GTCCGGTGAA AGCGCCCGCA
ACGCCGCTGG AAATCATCTC CGTCGCACGC TTAACCGAGA AAAAAGGCCT GCATGTGGCG
ATCGAAGCCT GCCGGCAGTT GAAAGAGCAG GGCGTGGCAT TTCGCTATCG CATCCTCGGC
ATTGGCCCGT GGGAACGACG CCTGCGCACG CTCATCGAAC AATATCAACT GGAAGATGTG
GTAGAGATGC CGGGCTTTAA ACCGAGCCAT GAAGTGAAAG CGATGCTCGA CGACGCGGAT
GTCTTCCTGT TGCCATCGAT AACGGGTGCG GATGGCGATA TGGAAGGTAT TCCGGTGGCG
CTAATGGAAG CGATGGCGGT CGGCATTCCG GTTGTTTCAA CTCTGCATAG CGGAATACCA
GAACTGGTGG AGGCCGATAA ATCCGGCTGG CTGGTGCCTG AGAACGATGC TCGCGCACTG
GCGCAACGAC TGGCGGCGTT TAGCCAACTG AACACCGACG AACTGGCTCC GGTCGTCAAA
CGTGCGCGCG AAAAAGTCGA ACACGATTTT AACCAGCAGG TGATTAATCG AGAACTCGCC
AGCTTGCTGC AGGCTTTATA G
 
Protein sequence
MKVGFFLLKF PLSSETFVLN QITAFIDMGF EVEIVALQKG DTQNTHAAWT KYNLAARTRW 
LQDEPTGKVA KLRHRASQTL RGIHRKNTWQ ALNLKRYGAE SRNLILSAIC GQVATPFRAD
VFIAHFGPAG VTAAKLRELG VIRGKIATIF HGIDISSRDV LNHYTPEYQQ LFRRGDLMLP
ISDLWAGRLQ KMGCPREKIA VSRMGVDMTR FSPRPVKAPA TPLEIISVAR LTEKKGLHVA
IEACRQLKEQ GVAFRYRILG IGPWERRLRT LIEQYQLEDV VEMPGFKPSH EVKAMLDDAD
VFLLPSITGA DGDMEGIPVA LMEAMAVGIP VVSTLHSGIP ELVEADKSGW LVPENDARAL
AQRLAAFSQL NTDELAPVVK RAREKVEHDF NQQVINRELA SLLQAL
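
The relationship between the gene sequence and the protein sequence above can be illustrated with a minimal translation sketch. It uses the codon assignments of translation table 11, which are the same as those of the standard code (the two tables differ only in which codons may act as starts). The function names `translate` and `gc_content` are hypothetical helpers written for this example, not part of any database tool, and the input is only the first 60 bases of the gene sequence as printed in the record.

```python
from itertools import product

# Codon table in NCBI order (bases T, C, A, G; third position varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used in the record's layout."""
    return "".join(seq.split()).upper()


def translate(seq: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = clean(seq)
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a sequence."""
    dna = clean(seq)
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


# First 60 bases of the gene sequence, copied from the record.
first_line = "ATGAAGGTCG GCTTCTTTTT ACTGAAATTT CCGCTGTCGT CGGAAACCTT CGTCCTCAAT"

print(translate(first_line))  # MKVGFFLLKFPLSSETFVLN
```

The output matches the first 20 residues of the protein sequence. Passing the full gene sequence to `gc_content` gives a value that can be compared with the GC content reported in the Gene Information table.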