Gene EcSMS35_3845 details


Gene Information

Locus tag: EcSMS35_3845
Symbol:
ID: 6142610
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 3921007
End bp: 3922566
Gene Length: 1560 bp
Protein Length: 519 aa
Translation table: 11
GC content: 49%
IMG OID: 641618671
Product: hypothetical protein
Protein accession: YP_001745811
Protein GI: 170682494
COG category:
COG ID:
TIGRFAM ID: [TIGR03369] cellulose biosynthesis protein BcsE
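
The coordinate and length fields above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the Start bp and End bp values are 1-based inclusive coordinates and that the coding sequence ends in a single stop codon that is not counted in the protein length. The function name `expected_lengths` is hypothetical and is not part of the source database.

```python
def expected_lengths(start_bp, end_bp):
    """Return (gene length in bp, protein length in aa) for a CDS.

    Assumes 1-based inclusive coordinates and one terminal stop codon
    that does not contribute an amino acid.
    """
    gene_length = end_bp - start_bp + 1
    if gene_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein_length = gene_length // 3 - 1  # drop the stop codon
    return gene_length, protein_length


# Values taken from the record above.
gene_length, protein_length = expected_lengths(3921007, 3922566)
print(gene_length, protein_length)  # 1560 519
```

Under these assumptions the result agrees with the Gene Length (1560 bp) and Protein Length (519 aa) fields of the record.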


Plasmid Coverage information

Num covering plasmid clones: 21
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 47
Fosmid unclonability p-value: 0.725838
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
GTGGACCCTG TATTCTCTAT CGGTATCTCA TCATTATGGG ATGAGCTGCG ACATATGCCA 
GCAGGCGGCG TCTGGTGGTT TAACGTCGAT CGCCATGAAG ATGCTATCAG TCTGGCGAAT
CAAACAATTG CATCCCAGGC TGAAACCGCA CACGTCGCGG TCATTAGCAT GGACAGCGAT
CCAGCGAAAA TCTTTCAATT AGATGATTCT CAAGGGCCGG GAAAAATAAC ATTATTTTCA
ATGCTAAATC ATGAAAAAGG TCTATACTAT TTGGCCCGTG ATTTGCAGTG TTCTATTGAT
CCCCATAATT ACCTTTTTAT TCTTGTTTGC GCAAATAACG CATGGCAAAA CATTCCTGCC
GAGCGGCTTC GCTCATGGTT GGATAAAATG AATAAATGGA GCCGGCTAAA CCATTGTTCG
CTTTTGGTAA TTAATTCCGG AAATAATAAC GATAAACAAT TTTCATTGTT ACTTGAGGAA
TACCGTTCAC TTTTTGGTCT TGCCAGTTTG CGTTTTCAGG GCGACCAACA TTTGCTGGAT
ATTGCTTTCT GGTGCAACGA AAAAGGGGTC AGCGCCCGTC AGCAGCTTAG CGTTCAGCAA
CAAAATGGTT GCTGGACATT AGTTCAACAC CAAGAGGCGG AAATCCAACC ACGCAGCGAC
GAAAAACGCA TTCTGAGTAA TGTTTCTGTA CTTGAAGGTG CGCCGCCGCT ATCGGAACAC
TGGCAACTGT TCAACAATAA CGAAGTCCTG TTTAATGAAG CCCGTACCGC TCAGGCGGCG
ACGGTGGTCT TTTCTTTACA ACAAAATGCG CAAATCGAGC CACTGGCCCG CAGCATTCAT
ACTCTGCGTC GCCAGCGCGG TAGTGCGATG AAAATCCTCG TACGGGAAAA TACCGCTAGC
CTGCGCGCCA CCGATGAACG TTTGTTATTG GCCTGCGGTG CAAATATGGT TATCCCATGG
AATGCCCCAC TCTCCCGCTG TCTGACGATG ATCGAAAGCG TGCAAGGGCA GAAGTTTAGT
CGCTATGTGC CGGAAGATAT CACTACCTTG CTGTCAATGA CCCAGCCGCT CAAACTGCGT
GGTTTCCAGA AGTGGGATGT GTTCTGTAAT GCCGTCAACA ACATGATGAA TAACCCTCTA
TTACCTGCCC ACGGTAAAGG CGTTCTGGTT GCCCTACGTC CGGTACCGGG TATCCGCGTT
GAGCAAGCCC TGACGCTATG TCGCCCTAAT CGCACTGGCG ATATCATGAC CATTGGCGGT
AATCGGCTGG TGCTGTTTCT CTCATTCTGT CGGATTAACG ATCTGGATAC CGCGTTGAAT
CATATTTTCC CATTGCCGAC TGGCGACATT TTCTCAAACC GTATGGTCTG GTTTGAAGAT
GATCAAATCA GTGCCGAGCT GGTGCAGATG CGCCTGCTTG CCCCAGAACA ATGGGGCATG
CCGCTGCCTT TAACGCAAAG TTCTAAACCG GTCATCAATG CCGAGCACGA TGGTCGCCAC
TGGCGACGAA TACCAGAACC AATGCGACTG TTAGATGATG CTGTGGAGCG CTCATCATGA
 
Protein sequence
MDPVFSIGIS SLWDELRHMP AGGVWWFNVD RHEDAISLAN QTIASQAETA HVAVISMDSD 
PAKIFQLDDS QGPGKITLFS MLNHEKGLYY LARDLQCSID PHNYLFILVC ANNAWQNIPA
ERLRSWLDKM NKWSRLNHCS LLVINSGNNN DKQFSLLLEE YRSLFGLASL RFQGDQHLLD
IAFWCNEKGV SARQQLSVQQ QNGCWTLVQH QEAEIQPRSD EKRILSNVSV LEGAPPLSEH
WQLFNNNEVL FNEARTAQAA TVVFSLQQNA QIEPLARSIH TLRRQRGSAM KILVRENTAS
LRATDERLLL ACGANMVIPW NAPLSRCLTM IESVQGQKFS RYVPEDITTL LSMTQPLKLR
GFQKWDVFCN AVNNMMNNPL LPAHGKGVLV ALRPVPGIRV EQALTLCRPN RTGDIMTIGG
NRLVLFLSFC RINDLDTALN HIFPLPTGDI FSNRMVWFED DQISAELVQM RLLAPEQWGM
PLPLTQSSKP VINAEHDGRH WRRIPEPMRL LDDAVERSS
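
The sequence blocks above are printed in groups of 10 characters, so they need to be cleaned before use. The following is a minimal sketch of how the gene sequence could be joined, its GC fraction computed, and its translation checked against the protein sequence. It assumes the amino acid assignments and alternative start codons of NCBI translation table 11, as named in the Translation table field. The helper names `clean`, `gc_fraction`, and `translate` are hypothetical and are not part of the source database.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for translation table 11, codons in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
# Codons that table 11 allows as translation starts (read as M in first position).
START_CODONS = {"TTG", "CTG", "ATT", "ATC", "ATA", "ATG", "GTG"}


def clean(block):
    """Join a space- and newline-separated sequence block into one string."""
    return "".join(block.split()).upper()


def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq):
    """Translate a CDS from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if i == 0 and codon in START_CODONS:
            aa = "M"  # an alternative start codon such as GTG is read as Met
        else:
            aa = CODON_TABLE[codon]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First line of the gene sequence above.
first_line = clean(
    "GTGGACCCTG TATTCTCTAT CGGTATCTCA TCATTATGGG ATGAGCTGCG ACATATGCCA"
)
print(translate(first_line))  # MDPVFSIGISSLWDELRHMP
```

The printed peptide matches the first 20 residues of the protein sequence, including the leading M that comes from the GTG start codon. Passing the full cleaned gene sequence to `gc_fraction` gives a value that can be compared with the GC content field of the record.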