Gene EcSMS35_0951 details

Gene Information

Locus tag: EcSMS35_0951
Symbol:
ID: 6146879
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 963760
End bp: 964929
Gene Length: 1170 bp
Protein Length: 389 aa
Translation table: 11
GC content: 51%
IMG OID: 641615838
Product: HK97 family phage major capsid protein
Protein accession: YP_001743030
Protein GI: 170681140
COG category: [R] General function prediction only
COG ID: [COG4653] Predicted phage phi-C31 gp36 major capsid-like protein
TIGRFAM ID: [TIGR01554] phage major capsid protein, HK97 family
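
The Start bp, End bp, Gene Length, and Protein Length fields above can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, and that the protein length excludes a single terminal stop codon; both are assumptions, not something the record states.

```python
# Values copied from the Gene Information fields.
start_bp = 963760
end_bp = 964929

# Assumption: coordinates are 1-based and inclusive on both ends.
gene_length_bp = end_bp - start_bp + 1

# Assumption: the CDS ends with one stop codon that is not translated.
codon_count = gene_length_bp // 3
protein_length_aa = codon_count - 1

print(gene_length_bp)     # 1170
print(protein_length_aa)  # 389
```

Under these assumptions the computed values agree with the listed Gene Length (1170 bp) and Protein Length (389 aa).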


Plasmid Coverage information

Num covering plasmid clones: 24
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 60
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAAGAAAT TAATCGAACT CCGCCAGCAA AAAAACGCCC TGAAAAACCA GATGCGATCC 
CTGCTGGAAA AAGCCGACAG TGAAAACCGT AGTCTGAACG CTGAAGAAGG CAAACAGTTT
GATGAACTGC GTGCAAAAGC TGATGCCCTC GACACAGAAA TTTCCCGCCT CGAGTCTGTG
GCTGATGAAG AACGCAGCAA GCCAGGAACG GGCATCCAGA AATTATCATC TGATGAATTG
CGTAACTACA TCGTAACCGG AGATGTGCGA TCACTGTCCA CCAGCACTGA CAGCGGCAGG
GATGGCGGAT ATACCGTAAT TCCTGAGCTT GATCGCGAAG TCATGCGCCA GCTACAGGAT
GACAGTGTTA TGCGCGTGAT CGCGACCGTG AAGACCGCAA AATCAAATGA GTTTCAGAAA
CTGGTTTCCA CTGGCGGCGC AACTGTAGGA CGAGGCACAG AAGGCAGCGC ACGCAGTGAA
ACCAACACCC CGAAAATTGA ACGCGTAACC ATCAAGTTGA ATCCGATCTA CGCCTACCCG
AAAACCACGC AGGAAATCCT GGATTTTTCA GAGGTGGATA TTCTGGGCTG GTTATCCTCC
GAAATTGCCG ACACGTTCGC CAGCACCGAA GAGGATGATT TTGTTAATGG CGACGGTAAC
GGCAAGCCGA AAGGCTTCAT GGCTTACACC CGTGCGGCGA CCAGTGACAA AACCCGCGCT
TTTGGCACCA TTGAAAAAAT AGTAGCGGCA AGTGGAACCG CCATTACAGC GGACGAACTG
ATCGACATTC TCTACAAGCT GAAAGCGAAA TACCGCAAAA ATGCCGTCTG GGTGATGAAC
TCGGGCACGG CAGGGACACT ACAGAAGCTG AAAAATGAGA ACGGCGATTA TATCTGGCGC
GACAGCCTTA AAGAAGGTGC GCCGGATATG TTGCTTGGTC GTCCTGTTTA CTGCCTGGAG
TCCATGCCGG ACATCGGCGC AGGAAAAGCA CCGCTAGCGG TTGGCGATTT CAGTCGTGGT
TATTTCATCG TTGATCATGT AACAGGGATT CGCACCCGAC CGGACAACAT TACTGAACCC
GGATTCTACA AGGTCCACAC GGATAAATAT CTGGGCGGTG GTGTGGTGGA TTCAAACGCC
ATCAAAATTC TGGAAATGAA AGCTGGCTAG
 
Protein sequence
MKKLIELRQQ KNALKNQMRS LLEKADSENR SLNAEEGKQF DELRAKADAL DTEISRLESV 
ADEERSKPGT GIQKLSSDEL RNYIVTGDVR SLSTSTDSGR DGGYTVIPEL DREVMRQLQD
DSVMRVIATV KTAKSNEFQK LVSTGGATVG RGTEGSARSE TNTPKIERVT IKLNPIYAYP
KTTQEILDFS EVDILGWLSS EIADTFASTE EDDFVNGDGN GKPKGFMAYT RAATSDKTRA
FGTIEKIVAA SGTAITADEL IDILYKLKAK YRKNAVWVMN SGTAGTLQKL KNENGDYIWR
DSLKEGAPDM LLGRPVYCLE SMPDIGAGKA PLAVGDFSRG YFIVDHVTGI RTRPDNITEP
GFYKVHTDKY LGGGVVDSNA IKILEMKAG
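
The gene and protein sequences above can be related by translating the gene codon by codon. The following is a minimal sketch, not the database's own method: it assumes the gene sequence is printed in its coding orientation and uses the standard amino acid assignments, which translation table 11 shares with the standard code. The helper names `clean`, `translate`, and `gc_fraction` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Codon table in TCAG order (third base varies fastest).
# Table 11 uses the same amino acid assignments as the standard code.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq):
    """Remove the spaces and line breaks used in the record's 10-base blocks."""
    return "".join(seq.split()).upper()


def translate(seq):
    """Translate from the first base, stopping at the first stop codon."""
    seq = clean(seq)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = clean(seq)
    return sum(base in "GC" for base in seq) / len(seq)


# First 30 bases and final line of the gene sequence, copied from the record.
first_blocks = "ATGAAGAAAT TAATCGAACT CCGCCAGCAA"
last_line = "ATCAAAATTC TGGAAATGAA AGCTGGCTAG"

print(translate(first_blocks))  # MKKLIELRQQ
print(translate(last_line))     # IKILEMKAG
```

The two printed fragments match the start and end of the protein sequence in the record. Passing the full gene sequence to `gc_fraction` gives a value that can be compared with the GC content field.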