Gene EcSMS35_1203 details


Gene Information

Locus tag: EcSMS35_1203
Symbol:
ID: 6145591
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 1205934
End bp: 1209413
Gene Length: 3480 bp
Protein Length: 1159 aa
Translation table: 11
GC content: 57%
IMG OID: 641616081
Product: fibronectin type III domain-containing protein
Protein accession: YP_001743264
Protein GI: 170683152
COG category: [S] Function unknown
COG ID: [COG4733] Phage-related protein, tail component
TIGRFAM ID:
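
The coordinate and length fields in this record can be cross-checked against each other. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the CDS ends in a single stop codon; the function names are hypothetical and are not part of the source database.

```python
# Values copied from the Gene Information fields of this record.
START_BP = 1205934
END_BP = 1209413
GENE_LENGTH_BP = 3480
PROTEIN_LENGTH_AA = 1159


def span_length(start: int, end: int) -> int:
    """Feature length, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(cds_length_bp: int) -> int:
    """Number of codons minus one terminal stop codon (assumes an unspliced, complete CDS)."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


# Both checks agree with the values listed in the record.
print(span_length(START_BP, END_BP) == GENE_LENGTH_BP)
print(expected_protein_length(GENE_LENGTH_BP) == PROTEIN_LENGTH_AA)
```

Under these assumptions, 1209413 - 1205934 + 1 gives 3480 bp, and 3480 / 3 - 1 gives 1159 aa, matching the listed lengths.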


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.272473
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 30
Fosmid unclonability p-value: 0.0174889
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGGTAAAG GCAGCAGTAA GGGGCATACC CCGCGCGAAG CGAAGGACAA CCTGAAGTCC 
ACGCAGTTGC TGAGTGTGAT CGATGCCATC AGCGAAGGGC CGATTGACGG TCCGGTGGAT
GGATTAAAAA GCGTGCTGCT GAACAGTACG CCAGTGCTGG ACAGTGAGGG GAATACCAAT
ATCTCCGGTG TCACGGTGGT GTTCCGGGCA GGTGAGCAGG AGCAGACACC GCCGGAGGGA
TTTGAATCCT CCGGCTCCGA GACGGTGCTG GGTACGGAAG TGAAATACGA CACGCCGATC
ACCCGGACCA TCACGTCGGC AAACATTGAC CGACTGCGTT TTACCTTCGG CGTGCAGGCA
CTGGTGGAAA CCACCTCAAA GGGGGACAGG AATCCGTCGG AAGTCCGCCT GCTGGTTCAG
ATACAACGTA ACGGTGGCTG GGTGACGGAA AAAGACATCA CCATTAAAGG CAAAACCACC
TCACAGTATC TGGCCTCGGT GGTGGTGGAT AACCTGCCGC CGCGCCCGTT TAATATCCGG
ATGCGCAGGA TGACGCCGGA CAGCACCACA GACCAGCTGC AGAACAAAAC GCTCTGGTCG
TCATACACCG AAATTATCGA TGTGAAACAG TGCTACCCGA ACACGGCACT GGTCGGCGTG
CAGGTGGATT CGGAGCAGTT CGGCAGCCAG CAGGTGAGCC GTAATTATCA TCTGCGCGGG
CGTATTCTGC AGGTGCCGTC GAATTATAAT CCGCAGACGC GGCAATACAG CGGTATCTGG
GACGGAACGT TTAAACCGGC ATACAGCAAC AACATGGCAT GGTGTCTGTG GGATATGCTG
ACCCACCCGC GCTACGGCAT GGGGAAACGT CTTGGTGCGG CGGATGTGGA CAAATGGGCG
CTGTATGTCA TCGGCCAGTA TTGCGACCAG TCGGTGCCGG ACGGTTTTGG CGGCACGGAG
CCGCGCATCA CCTGTAACGC TTACCTGACC ACACAGCGTA AGGCGTGGGA TGTGCTCAGT
GATTTCTGCT CGGCGATGCG CTGTATGCCG GTATGGAACG GGCAGACGCT GACGTTCGTG
CAGGACCGAC CGTCGGATAA GGTGTGGACC TATAACCGCA GTAATGTGGT GATGCCGGAT
GATGGCGCGC CGTTCCGCTA CAGTTTCAGC GCCCTGAAGG ACCGCCATAA TGCCGTTGAG
GTGAACTGGA TTGACCCGGA TAACGGCTGG GAAACGGCAA CAGAGCTTGT GGAGGACACG
CAGGCCATTG CCCGTTACGG TCGTAATGTC ACGAAGATGG ATGCCTTTGG CTGTACCAGT
CGGGGGCAGG CGCACCGCGC CGGGCTGTGG CTGATTAAAA CGGAACTGCT GGAAACGCAG
ACCGTGGACT TCAGCGTGGG TGCGGAAGGG CTTCGCCATG TACCGGGGGA TGTCATTGAA
ATCTGCGATG ATGACTATGC GGGTATCAGC ACCGGCGGGC GCGTGCTGGC GGTGAACAGC
CAGACCCGGA CGCTGACGCT CGACCGTGAA ATCACGCTGC CATCCTCCGG TATCACGCTG
ATAAGCCTGG TTGACGGAAG TGGCAATCCG GTCAGCGTGG AGGTTCAGTC CGTCACCGAC
GGCGTGAAGG TAAAAGTGAG CCGTGTTCCT GACGGTGTTG CTGAATACAG CGTATGGGGG
CTGAAGCTGC CGACGCTGCG CCAGCGACTG TTCCGCTGCG TGAGTATCCG TGAGAACGAC
GACGGCACGT ATGCCATCAC TGCCGTGCAG CATGTACCGG AGAAAGAAGC CATCGTGGAT
AACGGGGCGT ACTTTGACGG CGACCAGAGC GGAACGGTGA ACGGTGTCAC GCCGCCCGCG
GTGCAGCACC TGACTGCAGA AGTCACCGCA GACAGCGGGG AATACCAGGT GCTGGCGCGC
TGGGACACGC CGAAGGTGGT GAAGGGGGTG AGCTTTATGC TTCGCCTGAC CGTGGCAGCG
GATGACGGCA GTGAGCGGCT GGTCAGCACG GCCAGGACGA CGGAAACCAC ATACCGCTTC
AGGCAACTGG CGCTGGGGCG TTACACGCTG ACGGTCCGGG CGGTAAATGC GTGGGGACAG
CAGGGCGATC CGGCATCGGT ATCGTTCCGG ATTGCCGCAC CGGCAGCGCC GTCGCGGATT
GAGCTGACGC CGGGCTATTT TCAGATAACT GCCACGCCGC ATCTTGCGGT TTATGATCCG
ACGGTACAGT TTGAGTTCTG GTTCTCGGAA AAGCGGATTG CGGATATCAG GCAGGTTGAA
GCCAGCGCGC GTTATCTTGG TACGGCGCTG TACTGGATAG CCGCCAGTAT CAATATCAAA
CCGGGCCATG ATTATTATTT TTACATTCGC AGTGTGAACA CCGTTGGCAA ATCGGCATTC
GTGGAGGCTG TTGGCCAGCC GAGTGATGAT GCATCCGGCT ATCTGGATTT TTTCAAAGGC
GAGATAGGGA AAACCCATCT GGCTCAGGAG CTGTGGACGC AGATTGATAA CGGTCAGCTT
GCGCCTGACC TGGCTGAAAT CAGGACGTCC ATTACGGATG TCAGCAATGA AGTCACGCAG
ACCGTCAATA AGAAACTGGA AGACCAGAGT GCGGCAATTC AGCAGATACA GAAGGTTCAG
GTTGATACAA ATAATAACCT GAACAGCATG TGGGCTGTGA AGCTGCAGCA GATGCAGGAC
GGACGCCTTT ATATCGCGGG TATTGGTGCC GGTATTGAGA ATACCCCTGA CGGCATGCAG
AGTCAGGTGC TGCTGGCGGC GGACAGGATT GCGATGGTTA ATCCTGCGAA TGGCAACACA
AAGCCGATGT TTGTTGGTCA GGGCGATCAG ATATTCATGA ACGAAGTGTT CCTGAAATAT
CTGACGGCTC CCACCATTAC CAGCGGCGGT AATCCTCCGG CATTTTCCCT GACACCGGAC
GGGCGGCTGA CGGCGAAAAA TGCCGATATC AGCGGTAACG TGAATGCGAA CTCCGGGACG
CTCAACAACG TCACGATTAA CGAGAACTGC CGGGTTCTGG GAAAACTGTC CGCGAACCAG
ATTGAAGGCG ATCTCGTTAA AACAGTGGGC AAAGCTTTCC CCCGGGACTC CCGTGCACCG
GAACGCTGGC CATCAGGGAC CATTACCGTC AGGATTTATG ACGATCAGCC GTTTGACCGG
CAGATTGTTA TTCCGGCGGT GGCATTCAGC GGCGCTAAGC ATGAGAGAGA GCATACTGAT
ATTTACTCCT CATGCCGTCT GATAGTGCGG AAAAACGGTG CTGAAATTTA TAACCGTACC
GCGCTGGATA ATACGCTGAT TTACAGTGGC GTTATTGATA TGCCTGCCGG TCACGGTCAC
ATGACGCTGG AGTTTTCGGT GTCAGCATGG CTGGTAAATG ACTGGTATCC CACAGCAAGT
ATCAGCGATT TGCTGGTTGT GGTGATGAAG AAAGCCACCG CAGGCATCAG TATCAGCTGA
 
Protein sequence
MGKGSSKGHT PREAKDNLKS TQLLSVIDAI SEGPIDGPVD GLKSVLLNST PVLDSEGNTN 
ISGVTVVFRA GEQEQTPPEG FESSGSETVL GTEVKYDTPI TRTITSANID RLRFTFGVQA
LVETTSKGDR NPSEVRLLVQ IQRNGGWVTE KDITIKGKTT SQYLASVVVD NLPPRPFNIR
MRRMTPDSTT DQLQNKTLWS SYTEIIDVKQ CYPNTALVGV QVDSEQFGSQ QVSRNYHLRG
RILQVPSNYN PQTRQYSGIW DGTFKPAYSN NMAWCLWDML THPRYGMGKR LGAADVDKWA
LYVIGQYCDQ SVPDGFGGTE PRITCNAYLT TQRKAWDVLS DFCSAMRCMP VWNGQTLTFV
QDRPSDKVWT YNRSNVVMPD DGAPFRYSFS ALKDRHNAVE VNWIDPDNGW ETATELVEDT
QAIARYGRNV TKMDAFGCTS RGQAHRAGLW LIKTELLETQ TVDFSVGAEG LRHVPGDVIE
ICDDDYAGIS TGGRVLAVNS QTRTLTLDRE ITLPSSGITL ISLVDGSGNP VSVEVQSVTD
GVKVKVSRVP DGVAEYSVWG LKLPTLRQRL FRCVSIREND DGTYAITAVQ HVPEKEAIVD
NGAYFDGDQS GTVNGVTPPA VQHLTAEVTA DSGEYQVLAR WDTPKVVKGV SFMLRLTVAA
DDGSERLVST ARTTETTYRF RQLALGRYTL TVRAVNAWGQ QGDPASVSFR IAAPAAPSRI
ELTPGYFQIT ATPHLAVYDP TVQFEFWFSE KRIADIRQVE ASARYLGTAL YWIAASINIK
PGHDYYFYIR SVNTVGKSAF VEAVGQPSDD ASGYLDFFKG EIGKTHLAQE LWTQIDNGQL
APDLAEIRTS ITDVSNEVTQ TVNKKLEDQS AAIQQIQKVQ VDTNNNLNSM WAVKLQQMQD
GRLYIAGIGA GIENTPDGMQ SQVLLAADRI AMVNPANGNT KPMFVGQGDQ IFMNEVFLKY
LTAPTITSGG NPPAFSLTPD GRLTAKNADI SGNVNANSGT LNNVTINENC RVLGKLSANQ
IEGDLVKTVG KAFPRDSRAP ERWPSGTITV RIYDDQPFDR QIVIPAVAFS GAKHEREHTD
IYSSCRLIVR KNGAEIYNRT ALDNTLIYSG VIDMPAGHGH MTLEFSVSAW LVNDWYPTAS
ISDLLVVVMK KATAGISIS
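
The protein sequence corresponds to the gene sequence translated with translation table 11. The sketch below illustrates that relationship under stated assumptions: it uses the standard amino acid assignments (which table 11 shares with the standard code; the two differ in their permitted start codons), strips the 10-character block spacing used in this record, and stops at the first stop codon. The names `clean` and `translate` are hypothetical helpers written for illustration.

```python
from itertools import product

# Standard codon assignments in TCAG order; table 11 uses the same
# amino acid assignments as the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used to block the sequence."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate in frame 1, stopping at the first stop codon.

    Codons containing unexpected characters are reported as 'X'.
    """
    dna = clean(dna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 60 bases of the gene sequence, as blocked in this record.
first_line = "ATGGGTAAAG GCAGCAGTAA GGGGCATACC CCGCGCGAAG CGAAGGACAA CCTGAAGTCC"
print(translate(first_line))  # MGKGSSKGHTPREAKDNLKS
```

The output for the first 60 bases matches the first 20 residues of the protein sequence listed above; passing the full gene sequence to `translate` would be the way to compare the whole record.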