Gene EcSMS35_4349 details

Gene Information

Locus tag: EcSMS35_4349
Symbol:
ID: 6142615
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4437817
End bp: 4440264
Gene Length: 2448 bp
Protein Length: 815 aa
Translation table: 11
GC content: 56%
IMG OID: 641619170
Product: TP901 family phage tail tape measure protein
Protein accession: YP_001746294
Protein GI: 170683855
COG category: [S] Function unknown
COG ID: [COG5283] Phage-related tail protein
TIGRFAM ID: [TIGR01760] phage tail tape measure protein, TP901 family, core region
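
The coordinates and lengths listed above are mutually consistent, which can be checked with a little arithmetic. The following is a minimal Python sketch, assuming the start and end positions are 1-based and inclusive and that the gene length includes a single terminal stop codon. The function names are hypothetical and not part of the database.

```python
def span_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def expected_protein_length(gene_length_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming one terminal stop codon."""
    if gene_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length_bp // 3 - 1


# Values taken from the record above.
gene_length = span_length(4437817, 4440264)
print(gene_length, expected_protein_length(gene_length))  # prints "2448 815"
```

Both numbers agree with the Gene Length and Protein Length fields of this record.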


Plasmid Coverage information

Num covering plasmid clones: 22
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 38
Fosmid unclonability p-value: 0.186092
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAGTAACA ATGTAAAATT ACAGGTATTG CTCAGGGCTG TTGACCAGGC ATCCCGCCCG 
TTTAAATCCA TCCGCACAGC GAGCAAATCG CTGTCGGGGG ATATCCGGGA AACACAAAAA
TCACTGCGTG AGCTGAACGG TCAGGCATCC CGTATTGAGG GATTTCGCAA GACCAGTGCA
CAGCTCGCCG TGACTGGTCA TGCACTTGAA AAGGCACGGC AGGAGGCCGA AGCCCTTGCC
ACACAGTTTA AAAACACCGA ACGTCCGACA CGTGCTCAGG CGAAAGTGCT GGAATCCGCG
AAGCGTGCGG CGGAGGACTT ACAGGCGAAA TATAACCGCC TGACGGATTC CGTTAAACGC
CAGCAGCGGG AACTGGCCGT TGTGGGAATT AATACCCGCA ATCTTGCACA TGATGAGCAG
GGACTGAAAA ACCGTATCAG TGAAACCACC GCACAGCTTA ACCGTCAGCG TGACGCGCTG
GCGCGTGTCA GTGCGCAACA GGCAAAACTT AACGCAGTAA AACAGCGTTA TCAGGCAGGC
AAGGAACTGG CCGGAAATAT GGCCTCAGTG GGCGCTGCCG GTGTGGGGAT TGCGGCGGCG
GGAACGATGG CCGGAGTTAA GTTGCTGATG CCAGGTTATG AGTTTGCGCA GAAAAACTCA
GAATTGCTGG CCGTGCTCGG AGTGGCAAAA GACTCCGCCG AAATGACCGC ACTACGCAAA
CAGGCGCGCC AGCTCGGCGA CAATACCGCC GCCTCGGCGG ATGATGCGGC CGGTGCACAG
ATAATCATCG CGAAAGCGGG TGGGGATGTT GATGCCATTC AGGCGGCAAC GCCGGTCACG
CTGAATATGG CGCTGGCGAA CCGCCGCACG ATGGAAGAAA ACGCCGCCCT GTTGATGGGG
ATGAAATCCG CCTTTCAGCT TTCAAACGAT AAGGTCGCTC ATATCGGGGA TGTTCTCTCC
ATGACGATGA ACAAAACCGC CGCCGATTTT GACGGCATGA GCGATGCGCT GACCTATGCC
GCACCTGTGG CAAAAAATGC CGGTGTCAGC ATTGAAGAAA CCGCCGCAAT GGTCGGGGCG
CTGCATGATG CAAAAATCAC AGGCTCAATG GCGGGGACGG GAAGCCGTGC CGTGTTAAGC
CGCCTGCAGG CACCGACGGG AAAAGCATGG GATGCACTCA AAGAGCTTGG AGTGAAAACC
TCAGACAGCA AGGGAAACAC CCGGCCAATA TTTACCATTC TGAAAGAAAT GCAGGCCAGT
TTTGAGAAAA ACCGGCTCGG TACTGCCCAG CAGGCTGAAT ACATGAAAAC TATTTTCGGG
GAGGAGGCCA GCTCAGCCGC CGCCGTGCTG ATGACTGCCG CCTCAACCGG AAAGCTGGAC
AAACTGACCG CTGCGTTTAA ATCCTCAGAC GGGAAGACCG CCGAGCTGGT AAATATCATG
CAGGACAACC TAGGCGGTGA CTTTAAAGAG TTTCAGTCCG CTTATGAGGC AGTGGGGACT
GACCTGTTTG ACCAGCAGGA AGGCGCGCTG CGTAAGCTCA CGCAGACGAC CACAAAGTAT
GTGTTAAAAC TCGACGGCTG GATCCAGAAA AACAAATCAC TGGCGTCAAC CATCGGCATC
ATTGCCGGTG GCGCGCTGGC GCTGACTGGC ATCATCGGTG CCATTGGCCT CGTAGCCTGG
CCGGTTATCA CCGGCATCAA TGCCATCATC GCGGCAGCAG GCGCAATGGG GGCAGTCTTC
ACGACGGTTG GCAGTGCTGT TATGACCGCC ATCGGGGTTA TTAGCTGGCC GGTTGTGGCC
GTGGTGGCCG CAATTGTCGC CGGGGCGTTG CTTATCCGTA AATACTGGGA GCCTGTCAGC
GCATTCTTTG GCGGTGTGGT TGAAGGGCTG AAAGTGGCAT TTGCGCCGGT GGGGGAACTG
TTCACGCCAC TTAAGCCGGT GTTTGACTGG CTGGGTGAAA AGTTACAGGC CGCGTGGCAG
TGGTTTAAAA ACCTGATTGC CCCGGTCAAA GCCACTCAGG ACTCCCTGAA CAGTTGCCGT
GACACGGGGG TCATGTTCGG GCAGGCACTG GCTGACGCGC TGATGCTGCC GCTTAATGCG
TTCAACAAAC TGCGCAGCGG TATTGACTGG GTACTGGAAA AACTCGGTGT TATCAACAAA
GAGTCAGACA CACTTGACCA GACCGCCGCC AGAACTCAAG CCGCCACGTA TGGCAGCGGT
GGTTATATTC CGGCGACCAG CTCTTATGCA GGCTATCAGG CTTATCAGCC GGTCACGGCA
CCGGCTGGCC GCTCTTATGT GGACCAGAGT AAAAACGAAT ATCACATCAG CCTGACGGGT
GGTACTGCGC CGGGGACACA GCTCGACCGC CAGTTACAGG ATGCGCTCGA AAAATACGAG
CGGGATAAAC GTGTGCGCGC CCGTGCCAGC ATGATGCATG ACGGTTAA
 
Protein sequence
MSNNVKLQVL LRAVDQASRP FKSIRTASKS LSGDIRETQK SLRELNGQAS RIEGFRKTSA 
QLAVTGHALE KARQEAEALA TQFKNTERPT RAQAKVLESA KRAAEDLQAK YNRLTDSVKR
QQRELAVVGI NTRNLAHDEQ GLKNRISETT AQLNRQRDAL ARVSAQQAKL NAVKQRYQAG
KELAGNMASV GAAGVGIAAA GTMAGVKLLM PGYEFAQKNS ELLAVLGVAK DSAEMTALRK
QARQLGDNTA ASADDAAGAQ IIIAKAGGDV DAIQAATPVT LNMALANRRT MEENAALLMG
MKSAFQLSND KVAHIGDVLS MTMNKTAADF DGMSDALTYA APVAKNAGVS IEETAAMVGA
LHDAKITGSM AGTGSRAVLS RLQAPTGKAW DALKELGVKT SDSKGNTRPI FTILKEMQAS
FEKNRLGTAQ QAEYMKTIFG EEASSAAAVL MTAASTGKLD KLTAAFKSSD GKTAELVNIM
QDNLGGDFKE FQSAYEAVGT DLFDQQEGAL RKLTQTTTKY VLKLDGWIQK NKSLASTIGI
IAGGALALTG IIGAIGLVAW PVITGINAII AAAGAMGAVF TTVGSAVMTA IGVISWPVVA
VVAAIVAGAL LIRKYWEPVS AFFGGVVEGL KVAFAPVGEL FTPLKPVFDW LGEKLQAAWQ
WFKNLIAPVK ATQDSLNSCR DTGVMFGQAL ADALMLPLNA FNKLRSGIDW VLEKLGVINK
ESDTLDQTAA RTQAATYGSG GYIPATSSYA GYQAYQPVTA PAGRSYVDQS KNEYHISLTG
GTAPGTQLDR QLQDALEKYE RDKRVRARAS MMHDG
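
The sequences above are printed in space-separated blocks of ten characters, so they need to be joined before any analysis. The following is a minimal Python sketch of that cleanup plus a GC percentage calculation, using only the standard library. The helper names `clean_sequence` and `gc_percent` are hypothetical, and the example uses only the first line of the gene sequence, not the whole gene.

```python
def clean_sequence(text: str) -> str:
    """Join whitespace-separated sequence blocks into one uppercase string."""
    return "".join(text.split()).upper()


def gc_percent(dna: str) -> float:
    """Percentage of G and C bases in a cleaned DNA string."""
    if not dna:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


# First line of the gene sequence above, used as a small example.
first_line = "ATGAGTAACA ATGTAAAATT ACAGGTATTG CTCAGGGCTG TTGACCAGGC ATCCCGCCCG"
dna = clean_sequence(first_line)
print(len(dna), round(gc_percent(dna), 1))
```

Running the same two helpers on the full gene sequence gives a value that can be compared against the GC content field in the Gene Information section.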