Gene EcSMS35_4322 details


Gene Information

Locus tag: EcSMS35_4322
Symbol:
ID: 6147244
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4419087
End bp: 4420121
Gene Length: 1035 bp
Protein Length: 344 aa
Translation table: 11
GC content: 53%
IMG OID: 641619143
Product: PBSX family phage portal protein
Protein accession: YP_001746267
Protein GI: 170683575
COG category: [R] General function prediction only
COG ID: [COG5518] Bacteriophage capsid portal protein
TIGRFAM ID: [TIGR01540] phage portal protein, PBSX family
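
The Start bp, End bp, Gene Length, and Protein Length values above can be cross-checked with a little arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, and that the final codon is a stop codon that is not counted in the protein length. The function names are illustrative only and do not belong to any database API.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Residues encoded by a CDS, assuming one trailing stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length_bp(4419087, 4420121)
print(length)                     # 1035
print(protein_length_aa(length))  # 344
```

Both results agree with the record: 1035 bp and 344 aa.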


Plasmid Coverage information

Num covering plasmid clones: 23
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 30
Fosmid unclonability p-value: 0.00346629
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
ATGAGCAAGA AAAAAGGGAA AACACCGCAA CCTGCGGCAA AAAAAATGAC CGCCAGCGCC 
CCGAAAATGG AGGCATTCAC CTTTGGTGAG CCGGTGCCGG TACTCGACCG CCGTGACATT
CTGGATTACG TCGAGTGCAT CAGTAACGGC AGATGGTATG AGCCACCAGT CAGCTTTACC
GGTCTGGCAA AAAGCCTGCG TGCTGCCGTG CATCACAGCT CACCGATTTA CGTCAAACGT
AACATTCTGG CCTCAACGTT TATCCCACAC CCGTGGCTTT CTCAGCAGGA TTTCAGCCGC
TTTGTGCTGG ATTTTCTGGT ATTCGGTAAT GCGTTTCTGG AAAAGCGTTA CAGCACCACC
GGTAAGGTCA TCAGACTGGA AACCTCACCG GCAAAATATA CCCGCCGTGG CGTGGAGGAG
GATGTTTACT GGTGGGTGCC GTCCTTCAAC GAGCCGACAC CTTTCGCGCC CGGCTCCGTG
TTTCACCTGC TGGAGCCGGA TATTAATCAG GAGCTGTACG GTCTGCCGGA ATATCTCAGC
GCCCTTAACT CTGCCTGGCT GAATGAGTCG GCCACGCTGT TCCGCCGCAA GTATTACGAA
AACGGCGCTC ATGCCGGATA TATCATGTAC GTCACTGATG CCGTGCAGGA TCGCAACGAT
ATCGAAATGC TTCGCGAAAA CATGGTGAAG TCGAAAGGCC GCAACAACTT TAAAAACCTG
TTTCTCTATG CCCCGCAGGG GAAAGCTGAC GGCATTAAAA TTATCCCGCT CAGTGAAGTG
GCAACGAAGG ACGATTTTTT TAATATCAAA AAAGCCAGCG CCGCTGACCT GCTGGACGCG
CACCGCATCC CCTTTCAGTT GATGGGCGGC AAGCCGGAGA ACGTCGGGTC GCTGGGTGAT
ATTGAGAAAG TAGCAAAGGT CTTTGTCCGC AATGAGCTTA TCCCGTTACA GGACAGGATC
CGCGAGATAA ACGGCTGGCT CGGTCAGGAG GTCATCCGAT TTAAAAACTA CTCACTGGAC
ACTGACAACG GCTGA
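
The GC content reported in the gene information is presumably the percentage of G and C bases in this gene sequence. A minimal sketch of that calculation follows, assuming the space-separated 10-base blocks are joined before counting. The helper name `gc_percent` is hypothetical.

```python
def gc_percent(dna: str) -> float:
    """Percentage of G and C bases, ignoring whitespace and letter case."""
    seq = "".join(dna.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# First 10-base block of the gene sequence: 3 G + 1 C out of 10 bases.
print(gc_percent("ATGAGCAAGA"))  # 40.0
```

Passing the full gene sequence (with or without the block spacing) gives the gene-wide figure, which can be compared with the rounded value in the record.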
 
Protein sequence
MSKKKGKTPQ PAAKKMTASA PKMEAFTFGE PVPVLDRRDI LDYVECISNG RWYEPPVSFT 
GLAKSLRAAV HHSSPIYVKR NILASTFIPH PWLSQQDFSR FVLDFLVFGN AFLEKRYSTT
GKVIRLETSP AKYTRRGVEE DVYWWVPSFN EPTPFAPGSV FHLLEPDINQ ELYGLPEYLS
ALNSAWLNES ATLFRRKYYE NGAHAGYIMY VTDAVQDRND IEMLRENMVK SKGRNNFKNL
FLYAPQGKAD GIKIIPLSEV ATKDDFFNIK KASAADLLDA HRIPFQLMGG KPENVGSLGD
IEKVAKVFVR NELIPLQDRI REINGWLGQE VIRFKNYSLD TDNG
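
The protein sequence can be reproduced from the gene sequence by translating it codon by codon. Translation table 11 (bacterial, archaeal and plant plastid code) assigns the same amino acids to codons as the standard code and differs only in the permitted start codons, so the sketch below builds the standard codon table. It assumes the sequence is given in frame from the first base and stops at the first stop codon. The names are illustrative.

```python
from itertools import product

# Codons in NCBI order (bases T, C, A, G; third position varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence up to the first stop codon."""
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        residues.append(aa)
    return "".join(residues)


# First three blocks and the final line of the gene sequence.
print(translate("ATGAGCAAGA AAAAAGGGAA AACACCGCAA"))  # MSKKKGKTPQ
print(translate("ACTGACAACG GCTGA"))                  # TDNG
```

These fragments match the start (MSKKKGKTPQ) and the end (TDNG) of the protein sequence listed above.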