Gene EcSMS35_2084 details

Gene Information

Locus tag: EcSMS35_2084
Symbol: mdoC
ID: 6142973
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 2098356
End bp: 2099513
Gene Length: 1158 bp
Protein Length: 385 aa
Translation table: 11
GC content: 44%
IMG OID: 641616960
Product: glucans biosynthesis protein
Protein accession: YP_001744136
Protein GI: 170684209
COG category:
COG ID:
TIGRFAM ID:
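
The coordinates and lengths in this record are related by simple arithmetic: an inclusive start/end pair spans end - start + 1 bases, and a coding sequence of N bases encodes N/3 - 1 residues once the stop codon is excluded. A minimal Python sketch of that consistency check follows; the function names are illustrative and do not belong to any database API.

```python
def gene_length(start_bp, end_bp):
    """Length in bp of a feature given inclusive 1-based coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp):
    """Residue count for an unspliced CDS, excluding the stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


# Values taken from the Gene Information fields of this record.
length = gene_length(2098356, 2099513)
print(length, protein_length(length))
```

With the values in this record, the result matches the listed Gene Length (1158 bp) and Protein Length (385 aa).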


Plasmid Coverage information

Num covering plasmid clones: 18
Plasmid unclonability p-value: 0.72315
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 48
Fosmid unclonability p-value: 0.86966
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAACCCAG TACCCGCGCA ACGTGAATAT TTCCTCGACT CCATCCGCGC CTGGCTGATG 
TTGTTAGGGA TCCCTTTTCA TATTTCTTTA ATCTATTCGA GCCATACATG GCATGTGAAT
AGCGCCGAAC CGTCATTATG GCTGACCCTT TTTAATGACT TCATCCACTC GTTCCGCATG
CAGGTATTTT TCGTTATATC CGGGTACTTT TCCTACATGC TTTTTTTACG CTATCCCTTG
AAAAAATGGT GGAAAGTACG TGTCAAACGT GTAGGGATCC CGATGTTAAC AGCCATCCCC
CTACTGACAT TGCCGCAATT TATTATGCTG CAATACGTCA AAGGGAAAGC GGAAAGTTGG
CCTGGACTGT CATTGTATGA CAAATATAAT ACGTTGGCCT GGGAATTAAT ATCACACCTG
TGGTTTTTAC TGGTGTTAGT AGTCATGACG ACGCTGTGCG TATGGATATT TAAACGCATC
AGAAATAATT TAGAAAATTC TGATAAAACG AATAAAAAAT TCTCGATGGT AAAACTATCG
GTGATTTTTT TGTGCCTCGG CATCGGTTAT GCGGTAATAA GAAGAACGAT TTTTATTGTG
TATCCGCCCA TTCTGAGTAA CGGCATGTTC AATTTTATTG TCATGCAAAC GCTATTTTAT
TTGCCGTTCT TTATCCTCGG CGCACTGGCT TTCATTTTCC CTCATCTTAA AGCCTTGTTT
ACCACGCCGT CTCGTGGCTG TACCCTTGCA GCAGCATTGG CATTTGTCGC TTACTTACTC
AACCAGCGCT ATGGCAGTGG CGATGCCTGG ATGTACGAAA CCGAGTCTGT GATCACCATG
GTCCTCGGTC TGTGGATGGT GAATGTGGTC TTCTCCTTCG GCCACCGTTT GCTTAACTTC
CAGTCAGCGC GGGTGACTTA CTTTGTTAAT GCATCGCTGT TTATCTATCT GGTTCACCAC
CCGTTAACGC TGTTTTTCGG CGCGTACATT ACACCGCACA TCACCTCCAA CTGGCTTGGT
TTTCTCTGTG GCCTGATATT CGTAGTAGGG ATTGCGATAA TTCTGTATGA AATTCATTTA
CGCATCCCGT TACTGAAGTT TTTGTTCTCT GGTAAACCGG TTGTTAAGCG TGAGAACGAT
AAAGCACCAG CCCGTTAA
 
Protein sequence
MNPVPAQREY FLDSIRAWLM LLGIPFHISL IYSSHTWHVN SAEPSLWLTL FNDFIHSFRM 
QVFFVISGYF SYMLFLRYPL KKWWKVRVKR VGIPMLTAIP LLTLPQFIML QYVKGKAESW
PGLSLYDKYN TLAWELISHL WFLLVLVVMT TLCVWIFKRI RNNLENSDKT NKKFSMVKLS
VIFLCLGIGY AVIRRTIFIV YPPILSNGMF NFIVMQTLFY LPFFILGALA FIFPHLKALF
TTPSRGCTLA AALAFVAYLL NQRYGSGDAW MYETESVITM VLGLWMVNVV FSFGHRLLNF
QSARVTYFVN ASLFIYLVHH PLTLFFGAYI TPHITSNWLG FLCGLIFVVG IAIILYEIHL
RIPLLKFLFS GKPVVKREND KAPAR
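
The sequences above are laid out in space-separated blocks of ten characters, so the whitespace has to be stripped before any calculation. The sketch below shows one way to do that in Python, and then to compute a GC fraction and check for a terminal stop codon. The helper names are hypothetical, and the stop-codon set is the one used by translation table 11 (TAA, TAG, TGA). Only the last line of the gene sequence is used as sample input; the full sequence would be pasted in the same way.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons in translation table 11


def clean_sequence(text):
    """Remove spaces and line breaks from a block-formatted sequence."""
    return "".join(text.split()).upper()


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def ends_with_stop(seq):
    """True if seq is a whole number of codons and its last codon is a stop."""
    return len(seq) % 3 == 0 and seq[-3:] in STOP_CODONS


# Sample input: the final line of the gene sequence in this record.
tail = clean_sequence("AAAGCACCAG CCCGTTAA")
print(len(tail), ends_with_stop(tail), gc_fraction(tail))
```

Running the same helpers on the complete gene sequence gives a direct comparison against the Gene Length and GC content fields listed under Gene Information.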