Gene EcSMS35_1089 details

Gene Information

Locus tag: EcSMS35_1089
Symbol:
ID: 6146743
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 1104532
End bp: 1105830
Gene Length: 1299 bp
Protein Length: 432 aa
Translation table: 11
GC content: 46%
IMG OID: 641615974
Product: anaerobic C4-dicarboxylate transporter
Protein accession: YP_001743166
Protein GI: 170683393
COG category: [R] General function prediction only
COG ID: [COG2704] Anaerobic C4-dicarboxylate transporter
TIGRFAM ID: [TIGR00770] anaerobic c4-dicarboxylate membrane transporter family protein
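
The coordinate and length fields in this record can be cross-checked against each other. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the start and end positions are 1-based and inclusive and that the gene length includes the terminal stop codon (the gene sequence in this record ends in TGA).

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    if end_bp < start_bp:
        raise ValueError("end_bp must not be smaller than start_bp")
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Number of amino acids, assuming the last codon is a stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("gene length is not a multiple of 3")
    return gene_len_bp // 3 - 1


# Values taken from the record above.
length = gene_length(1104532, 1105830)
print(length, protein_length(length))  # 1299 432
```

Both results agree with the Gene Length and Protein Length fields listed above.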


Plasmid Coverage information

Num covering plasmid clones: 27
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 51
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGTTTGGC TGGAAATTAT CGTTGTACTT GGCGCAATAT TTTTTGGTAT TCGTCAGGGA 
GGAATCGGCA TTGGTTTATG CGGCGGTCTT GGCCTTGCGA TCCTGACTCT GGGATTTGGT
CTGCCTATGG GATCACCGCC AGTCGATGTG ATCCTGATTA TTATGACCGT AGTTGTGGCA
GCTTCAGCCT TACAGGCGGC AGGTGGGATG GATTATCTGG TACGTTTAGC CAGTAATTTC
ATGCGACGTA ACCCAAAATA TATCAACATA ATCGCACCGA TTATCACTTG GTTAATGACC
ATTATGGCCG GTACCGGGTT TATTGTTTTC TCCACACTGC CGGTAATTGC TGAAGTAGCA
AAAGAGTCCG GTATCCGGCC TTCCCGCACA CTCGCTGGCT CAGTTGTCGC CTCACAGGTT
GCAATTTCTG GCTCACCAAT CAGTGCCGCA ATGGCAGCTA TGCTCACTAT TATGGAGGTT
AATGGCGTCA GTTTTATTCA GGTGATGTCG GTATGCCTTC CCACCTCTTT TGTCGCCGCG
ATGGTTGCTG CTTTTATCGC CTCCCGACAG GGGTGTGAAT TACAAGATGA CGAAGTATAT
CTTGAACGTC TGCAAAAAGG TCTGGTGCAA AAATACGAGA ATAATAACAG CATCAAACCT
GGTGCAACGC TTTCCGTCGG ATTATTCATG CTGGCGACGA TCGCCATTGT TATTCTTGCT
GCATTCCCCC AATTGCGTCC GGGTTTTGAT ATCAGCAAAC CGATGGAAAC GCGCGATATT
ATTATTATTT GCATGCTCTC TGCTGCCTGC CTGATGGTAA TACTGTGTAA GATGTCTACT
GATGACATTA TTCTGACCTC CACATTTCGT GCTGGAATGA GTTCGCTGGC AGTGATTCTT
GGTATTGTAA CCTTAGGTAC CACCTTTATT GATGCGCATT TAACTGAGAT CAAAGATATC
GCTGGTGATA TTTTACAAAC TTATCCGATG CTTTTAGCAT TAGTTCTTTT TTTTACCTGC
GCTTTACTTT ACTCACAGGG AGCGACTACG CCGTTAATTA TTCCTCTGGC TGTAGCACTC
AACATACCAA CCTGGGCAAT TCTTGCATCT TATGTTGCCG TTACGGGAGT ATTCGTGCTT
CCAACGTATC CAACATCACT CGCCGCGATG GAATTTGATA CTACCGGCAC AACCCGTGTG
GGCAATTATT TATTAAATCA CCCATTTATG CTACCCGGGT TGGGTGGTGT TATTGCAGGA
GTGACATTTG GGTTTGTGAT TGCACCAATG ATGGTCTGA
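
The gene sequence above is displayed in space-separated groups of 10 bases, so it needs to be joined before any computation such as a GC-content check. The following is a minimal sketch under that assumption; `clean_sequence` and `gc_percent` are hypothetical helper names, and the example uses only the first two groups of the sequence rather than the full gene.

```python
def clean_sequence(raw: str) -> str:
    """Remove the spaces and line breaks used for display grouping."""
    return "".join(raw.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of bases in seq that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in seq if base in "GC")
    return 100.0 * gc_count / len(seq)


# First two 10-base groups of the gene sequence shown above.
fragment = clean_sequence("ATGGTTTGGC TGGAAATTAT")
print(len(fragment), gc_percent(fragment))
```

Passing the whole gene sequence through the same two functions gives a value that can be compared with the GC content field of the record.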
 
Protein sequence
MVWLEIIVVL GAIFFGIRQG GIGIGLCGGL GLAILTLGFG LPMGSPPVDV ILIIMTVVVA 
ASALQAAGGM DYLVRLASNF MRRNPKYINI IAPIITWLMT IMAGTGFIVF STLPVIAEVA
KESGIRPSRT LAGSVVASQV AISGSPISAA MAAMLTIMEV NGVSFIQVMS VCLPTSFVAA
MVAAFIASRQ GCELQDDEVY LERLQKGLVQ KYENNNSIKP GATLSVGLFM LATIAIVILA
AFPQLRPGFD ISKPMETRDI IIICMLSAAC LMVILCKMST DDIILTSTFR AGMSSLAVIL
GIVTLGTTFI DAHLTEIKDI AGDILQTYPM LLALVLFFTC ALLYSQGATT PLIIPLAVAL
NIPTWAILAS YVAVTGVFVL PTYPTSLAAM EFDTTGTTRV GNYLLNHPFM LPGLGGVIAG
VTFGFVIAPM MV