Gene EcSMS35_1188 details

Gene Information

Locus tag: EcSMS35_1188
Symbol:
ID: 6144955
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 1194043
End bp: 1195284
Gene Length: 1242 bp
Protein Length: 413 aa
Translation table: 11
GC content: 56%
IMG OID: 641616066
Product: HK97 family phage portal protein
Protein accession: YP_001743249
Protein GI: 170682704
COG category: [S] Function unknown
COG ID: [COG4695] Phage-related protein
TIGRFAM ID: [TIGR01537] phage portal protein, HK97 family
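
The coordinate and length fields above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the final codon of the CDS is a stop codon that contributes no residue; the function names are hypothetical and not part of any database API.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Feature length, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS whose last codon is a stop."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per residue, minus the terminal stop codon.
    return gene_len_bp // 3 - 1


length = gene_length_bp(1194043, 1195284)
print(length)                     # 1242
print(protein_length_aa(length))  # 413
```

Under these assumptions the computed values agree with the Gene Length and Protein Length fields of this record.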


Plasmid Coverage information

Num covering plasmid clones: 23
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 14
Fosmid unclonability p-value: 0.0000000136411
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
GTGTTCTTTT CGGGATTATT TCAACGAAAA AGTGATGCGC CGGTGACCAC GCCAGCAGAG 
CTGGCGGATG CCATCGGGTT GTCTTACGAC ACCTATACCG GAAAGCAGAT CAGCAGTCAG
CGGGCCATGC GACTGACGGC GGTTTTTTCC TGCGTCAGAG TGCTGGCAGA GTCGGTCGGG
ATGTTGCCCT GCAATCTGTA TCACCTGAAC GGCAGCCTGA AACAGAGGGC CACCGGCGAA
CGTCTGCATA AGCTGATCTC CACGCATCCC AATGGCTATA TGACGCCGCA GGAGTTCTGG
GAGCTGGTGG TCACCTGTCT GTGCCTGAGG GGAAACTTTT ACGCCTACAA AGTGAAAGCA
TTTGGCGAAG TGGCTGAACT GCTGCCCGTC GATCCCGGTT GTGTGGTACC GAAGCTTAAC
AGTCGCTGGG AGCCGGTCTA TCAGGTCACA TTCCCGGACG GTTCCACGGA TGTACTGAGC
CAGGAAGATA TCTGGCATGT GCGCACGCTG ACGCTGGACG GTCTGGTGGG ACTGAATCCC
ATCGCCTATG CCCGCGAGGC AATATCGCTG GCAGCAGCGA CCGAAGAGCA CGGGGCCAGA
CTGTTCAGCA ATGGTGCGGT GACGTCCGGT GTGTTGCGTA CAGAACAGAC GCTGTCGGAT
CAGGCTTATG AGCGCCTGAA GAAAGATTTT GAGGAGCGTC ACACCGGGCT TGGCAATGCT
CACCGCCCGA TGATCCTTGA GATGGGGCTG GACTGGAAGT CGATGGCGCT GAACGCCGAG
GACAGCCAGT TCCTGGAAAC CCGCAAGTTT CAGCTTGAAG AAATCTGTCG TCTGTTCCGG
GTGCCGTTGC ACATGGTGCA GAACACCGAT CGCGCCACCT TCAACAATAT CGAAGAGCTG
GGGCTGGGAT TTATCAACTA TTCACTGGTG CCGTATCTGA CCCGCATCGA ACAGCGGATC
AACACCGGAC TGGTACGAAA AAGTAAGCAG GGCGTTTATT ACGCCAAATT TAACGCCGGG
GCGTTACTGC GCGGGGATAT GAAGTCCCGT TTTGAAGCCT ACGCCACCGG GATCAACTGG
GGAATTTACT CTCCCAATGA CTGCCGCGAC CTGGAAGATA TGAATCCGCG TCCCGGTGGG
GATGTCTATC TCACACCGAT GAACATGACC ACGAAACCCT CCGATGGCAG TAAAGCCGGT
AAGCAGAAGG ATAACGCCAA TGCAGACGAA ACAACGTCTT GA
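
A GC content figure like the one reported in the gene information can be recomputed from a sequence laid out in 10-base blocks as above. This is a rough sketch, assuming GC content means the percentage of G and C bases among all bases after whitespace is removed; `gc_percent` is a hypothetical helper name, and the exact rounding used by the source database is not known.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G/C bases, ignoring spaces and line breaks."""
    bases = "".join(seq.split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


# Only the first line of the gene sequence is used here for brevity,
# so the result is not expected to equal the whole-gene value.
first_line = "GTGTTCTTTT CGGGATTATT TCAACGAAAA AGTGATGCGC CGGTGACCAC GCCAGCAGAG"
print(round(gc_percent(first_line), 1))
```

Passing the full 1242 bp sequence as a single string (with or without the spacing) would give the whole-gene percentage.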
 
Protein sequence
MFFSGLFQRK SDAPVTTPAE LADAIGLSYD TYTGKQISSQ RAMRLTAVFS CVRVLAESVG 
MLPCNLYHLN GSLKQRATGE RLHKLISTHP NGYMTPQEFW ELVVTCLCLR GNFYAYKVKA
FGEVAELLPV DPGCVVPKLN SRWEPVYQVT FPDGSTDVLS QEDIWHVRTL TLDGLVGLNP
IAYAREAISL AAATEEHGAR LFSNGAVTSG VLRTEQTLSD QAYERLKKDF EERHTGLGNA
HRPMILEMGL DWKSMALNAE DSQFLETRKF QLEEICRLFR VPLHMVQNTD RATFNNIEEL
GLGFINYSLV PYLTRIEQRI NTGLVRKSKQ GVYYAKFNAG ALLRGDMKSR FEAYATGINW
GIYSPNDCRD LEDMNPRPGG DVYLTPMNMT TKPSDGSKAG KQKDNANADE TTS