Gene EcSMS35_4589 details


Gene Information

Locus tag: EcSMS35_4589
Symbol: dcuB
ID: 6145910
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4692368
End bp: 4693708
Gene Length: 1341 bp
Protein Length: 446 aa
Translation table: 11
GC content: 52%
IMG OID: 641619405
Product: anaerobic C4-dicarboxylate transporter
Protein accession: YP_001746517
Protein GI: 170682748
COG category: [R] General function prediction only
COG ID: [COG2704] Anaerobic C4-dicarboxylate transporter
TIGRFAM ID: [TIGR00148] UbiD family decarboxylases
            [TIGR00770] anaerobic c4-dicarboxylate membrane transporter family protein
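
The coordinate and length fields above are mutually consistent, which the following minimal sketch checks. It assumes the Start bp and End bp values are 1-based and inclusive, and that the gene length counts the terminating stop codon; the function names are hypothetical and not part of the database.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Residue count for a CDS whose length includes one stop codon (assumed)."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per residue, minus the stop codon, which encodes no residue.
    return gene_len_bp // 3 - 1


length = gene_length_bp(4692368, 4693708)
print(length)                     # 1341
print(protein_length_aa(length))  # 446
```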


Plasmid Coverage information

Num covering plasmid clones: 26
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 58
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGTTATTTA CTATCCAACT TATCATAATA CTGATATGTC TGTTTTATGG TGCCAGAAAG 
GGGGGTATCG CGCTGGGTTT ATTAGGCGGT ATCGGTCTGG TCATTCTGGT CTTCGTCTTC
CACCTTCAGC CAGGTAAACC ACCGGTTGAT GTCATGCTGG TTATCATTGC GGTGGTGGCA
GCATCGGCGA CCTTGCAAGC TTCGGGCGGT CTTGATGTCA TGCTGCAAAT TGCCGAGAAG
CTGCTGCGCC GCAACCCGAA ATATGTCTCA ATTGTCGCGC CGTTTGTGAC CTGTACGCTG
ACCATTCTTT GCGGTACGGG TCATGTGGTT TACACCATTC TGCCGATCAT CTACGACGTT
GCTATTAAGA ACAACATCCG TCCGGAACGT CCGATGGCGG CAAGTTCTAT CGGTGCACAG
ATGGGGATTA TCGCCAGTCC GGTGTCGGTT GCGGTCGTAT CTTTGGTTGC GATGCTGGGT
AATGTCACCT TTGATGGTCG CCATCTTGAG TTCCTCGACC TGCTGGCAAT CACCATTCCA
TCGACGTTAA TCGGTATCCT GGCGATCGGT ATCTTCAGTT GGTTCCGCGG TAAAGATCTG
GATAAAGACG AAGAGTTCCA GAAATTCATC TCCGTACCGG AAAACCGTGA GTATGTTTAC
GGTGATACCG CGACGCTGCT CGATAAAAAA CTGCCGAAAA GCAACTGGCT GGCAATGTGG
ATCTTCCTCG GGGCAATCGC TGTAGTCGCC CTTCTTGGTG CTGATTCGGA CCTGCGTCCA
TCCTTCGGCG GCAAACCGCT GTCGATGGTA CTGGTTATTC AGATGTTTAT GCTGCTGACC
GGGGCGCTGA TTATTATCCT GACCAAAACC AATCCCGCGT CTATCTCAAA AAACGAAGTC
TTCCGTTCCG GTATGATCGC CATCGTGGCG GTGTACGGTA TCGCATGGAT GGCTGAAACC
ATGTTCGGTG CGCATATGTC TGAAATTCAG GGCGTACTGG GTGAAATGGT GAAAGAGTAT
CCGTGGGCCT ATGCCATTGT TCTGCTGCTG GTTTCCAAGT TTGTAAACTC TCAGGCTGCG
GCGCTGGCGG CGATTGTTCC GGTCGCGCTG GCGATCGGCG TTGATCCGGC ATACATCGTG
GCTTCAGCAC CGGCTTGCTA CGGTTATTAC ATCCTGCCGA CTTATCCGAG CGATCTGGCA
GCGATTCAGT TTGACCGTTC CGGCACCACC CACATCGGTC GCTTCGTCAT CAACCACAGC
TTTATTCTGC CGGGGTTGAT TGGTGTGAGC GTATCGTGCG TCTTCGGCTG GATCTTCGCC
GCGATGTACG GGTTCTTATA A
 
Protein sequence
MLFTIQLIII LICLFYGARK GGIALGLLGG IGLVILVFVF HLQPGKPPVD VMLVIIAVVA 
ASATLQASGG LDVMLQIAEK LLRRNPKYVS IVAPFVTCTL TILCGTGHVV YTILPIIYDV
AIKNNIRPER PMAASSIGAQ MGIIASPVSV AVVSLVAMLG NVTFDGRHLE FLDLLAITIP
STLIGILAIG IFSWFRGKDL DKDEEFQKFI SVPENREYVY GDTATLLDKK LPKSNWLAMW
IFLGAIAVVA LLGADSDLRP SFGGKPLSMV LVIQMFMLLT GALIIILTKT NPASISKNEV
FRSGMIAIVA VYGIAWMAET MFGAHMSEIQ GVLGEMVKEY PWAYAIVLLL VSKFVNSQAA
ALAAIVPVAL AIGVDPAYIV ASAPACYGYY ILPTYPSDLA AIQFDRSGTT HIGRFVINHS
FILPGLIGVS VSCVFGWIFA AMYGFL
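
The protein sequence can be reproduced from the gene sequence by codon-wise translation. The sketch below is an illustration only, not the pipeline the database used. It assumes the sequence is given on the coding strand starting at the first codon, and it relies on translation table 11 assigning the same amino acids to codons as the standard code (the two differ in which codons may act as start codons). The names `CODON_TABLE` and `translate` are hypothetical.

```python
# Build the codon table in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with T
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string, stopping at the first stop codon.

    Whitespace (such as the 10-base grouping used above) is ignored.
    Characters other than A, C, G, T raise KeyError.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# First seven codons of the gene sequence above.
print(translate("ATGTTATTTA CTATCCAACT T"))  # MLFTIQL
```

Passing the full gene sequence in the same way should give a string that can be compared against the protein sequence after removing its spaces and line breaks.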