Gene EcSMS35_3909 details


Gene Information

Locus tag: EcSMS35_3909
Symbol:
ID: 6145085
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 3979336
End bp: 3981306
Gene Length: 1971 bp
Protein Length: 656 aa
Translation table: 11
GC content: 54%
IMG OID: 641618735
Product: hypothetical protein
Protein accession: YP_001745874
Protein GI: 170681898
COG category: [S] Function unknown
COG ID: [COG3533] Uncharacterized protein conserved in bacteria
TIGRFAM ID:
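
The start, end, gene length, and protein length fields above are related by simple arithmetic. The sketch below checks that relationship. It assumes the coordinates are 1-based and inclusive (consistent with the recorded length) and that the CDS is complete and ends in a single stop codon. The function names are illustrative and are not part of any database API.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length_aa(length_bp: int) -> int:
    """Residue count for a complete, unspliced CDS with one stop codon (assumed)."""
    if length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return length_bp // 3 - 1  # the stop codon encodes no residue


length = gene_length_bp(3979336, 3981306)
print(length, protein_length_aa(length))  # 1971 656
```

Both values agree with the Gene Length and Protein Length fields in this record.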


Plasmid Coverage information

Num covering plasmid clones: 21
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 47
Fosmid unclonability p-value: 0.921384
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAACATTT CGGAAGTCGA TCTGCATAAA CTGACGGTCA GCGATCCGTT CCTCGGTCAG 
TACCAACAAC TGGTCCGCGA TGTGGTGATT CCTTATCAGT GGGACGCCTT GAACGATCGT
ATCCCAGAAG CGGAACCCAG CCATGCGATT GAAAACTTTC GCATTGCCGC AGGACTGCAA
GACGGTGAAT TTTACGGGAT GGTGTTTCAG GACAGCGACG TCGCCAAATG GCTGGAAGCG
GTAGCCTGGT CGCTGTGCCA GAAGCCGGAC GCCGAACTGG AAAAAACCGC CGACGAGGTG
ATTGAACTGA TCGCCTCCGC CCAGTGTGAA GATGGCTATC TCAATACTTA CTTTACGGTA
AAAGCGCCCG AAGAACGCTG GAGCAATCTG GCGGAGTGTC ATGAACTTTA CTGCGCGGGT
CATCTGATTG AAGCCGGAGT CGCCTTCTTC CAGGCTACGG GCAAGCGGCG CTTGCTGGAA
GTGGTTTGCC GTCTGACCGA TCATATCGAC AGCGTATTTG GTCCAGATGA AAGTAAGTTA
CACGGTTATC CTGGTCACCC GGAAATTGAA CTGGCACTAA TGCGCCTGTA TGAAGTGACC
GAAGAGCCGC GCTACCTGGC GCTGACGAAC TATTTTGTCG AACAGCGTGG TGCGCAACCG
CACTATTACG ACCAGGAATA TGAAAAGCGC GGGCAGACCT CGCACTGGCA CACCTACGGC
CCGGCGTGGA TGGTGAAAGA CAAAGCCTAC AGCCAGGCAC ATTTGCCCAT CGCACAGCAG
CAAACCGCCA TTGGTCACGC GGTACGTTTT GTCTACCTGA TGACCGGCGT CGCGCATCTC
GCGCGTTTAA GTCACGATGA AAGCAAGCGT CAGGATTGCT TGCGGCTGTG GAACAATATG
GCCCAGCGTC AGTTATATAT TACCGGCGGC ATTGGCTCAC AGAGCAGCGG TGAAGCGTTC
AGCAGCGATT ACGATCTGCC GAATGACACG GTATACGCCG AAAGCTGTGC TTCCATCGGC
CTGATGATGT TCGCCCGGCG AATGCTGGAA ATGGAAGGCG ACAGTCAATA TGCCGATGTG
ATGGAGCGCG CACTGTACAA CACCGTGCTC GGCGGCATGG CGCTGGATGG CAAACATTTC
TTCTATGTGA ATCCGCTGGA AGTACATCCA AAATCGCTGA AATTCAACCA TATCTACGAT
CACGTTAAGC CGATCCGCCA GCGTTGGTTT GGTTGCGCTT GTTGTCCGCC AAATATCGCC
CGCGTGCTAA CCTCGATTGG TCATTATCTC TACACGCCGC GTGAAGATGC GTTGTATATC
AACATATACG CAGGAAACAG CATGGAAGTG CCGGTAGAAA ATGGCACGTT GCGCCTGCGG
GTTAGCGGAA ACTATCCGTG GCAGGAACAG GTGACAATTG CGGTTGAATC GCCCCAGCCG
GTGCGTCATA CGCTGGCTTT ACGTCTGCCG GACTGGTGCA CACAGCCGCA GATTACATTG
AATGGGGAAG AGGTCGAGCA GGATATTCGT AAAGGGTATT TGCACATTAC CCGCGAATGG
CAGGAGGGCG ACACGCTGAA TCTGACGTTG CCAATGCCGG TACGTCGCGT TTACGGTAAC
CCGCTGGTGC GTCACGTCGC CGGAAAAGTG GCGATTCAGC GCGGCCCGCT GGTGTATTGC
CTGGAACAGG CCGACAACGG CGAGTCACTG CATAACCTGT GGCTGCCCGC CGATGCGCCA
TTTACGACAT TTGAAGGCAA GGGATTGTTC CGCCATAAGA TCTTAATCCA GGCACCGGGT
TACCGGTATG AACAGAGCAA TCCAGAGCAG CAACCGCTGT GGCATTACGA CTATGCCCCA
GCCAAACGCC AGCCGCAAAC TCTGACATTT ATCCCGTGGT TTAGCTGGGC TAACCGGGGT
GAAGGCGAAA TGCGGATCTG GGTGAATGAG GAAAAGCATT GCCATCCGTA G
 
Protein sequence
MNISEVDLHK LTVSDPFLGQ YQQLVRDVVI PYQWDALNDR IPEAEPSHAI ENFRIAAGLQ 
DGEFYGMVFQ DSDVAKWLEA VAWSLCQKPD AELEKTADEV IELIASAQCE DGYLNTYFTV
KAPEERWSNL AECHELYCAG HLIEAGVAFF QATGKRRLLE VVCRLTDHID SVFGPDESKL
HGYPGHPEIE LALMRLYEVT EEPRYLALTN YFVEQRGAQP HYYDQEYEKR GQTSHWHTYG
PAWMVKDKAY SQAHLPIAQQ QTAIGHAVRF VYLMTGVAHL ARLSHDESKR QDCLRLWNNM
AQRQLYITGG IGSQSSGEAF SSDYDLPNDT VYAESCASIG LMMFARRMLE MEGDSQYADV
MERALYNTVL GGMALDGKHF FYVNPLEVHP KSLKFNHIYD HVKPIRQRWF GCACCPPNIA
RVLTSIGHYL YTPREDALYI NIYAGNSMEV PVENGTLRLR VSGNYPWQEQ VTIAVESPQP
VRHTLALRLP DWCTQPQITL NGEEVEQDIR KGYLHITREW QEGDTLNLTL PMPVRRVYGN
PLVRHVAGKV AIQRGPLVYC LEQADNGESL HNLWLPADAP FTTFEGKGLF RHKILIQAPG
YRYEQSNPEQ QPLWHYDYAP AKRQPQTLTF IPWFSWANRG EGEMRIWVNE EKHCHP
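
The gene and protein sequences above are printed in blocks of ten characters, so the whitespace has to be removed before the GC content can be recomputed or the translation checked. The following is a minimal sketch of that process, not the method used by the source database. It assumes the codon assignments of NCBI translation table 11, which match the standard code (the tables differ only in the permitted start codons). The helper names `clean`, `gc_fraction`, and `translate` are hypothetical.

```python
from itertools import product

# Codon table in NCBI order (bases T, C, A, G). Table 11 uses the same
# codon-to-residue assignments as the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the block spacing and line breaks used in the listing."""
    return "".join(seq.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


def translate(seq: str) -> str:
    """Translate in frame 1, stopping at the first stop codon."""
    s = clean(seq)
    residues = []
    for i in range(0, len(s) - len(s) % 3, 3):
        aa = CODON_TABLE[s[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First 30 bases of the gene sequence listed above.
print(translate("ATGAACATTT CGGAAGTCGA TCTGCATAAA"))  # MNISEVDLHK
```

Passing the full gene sequence to `translate` and `gc_fraction` allows a comparison with the protein sequence and the GC content field of this record.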