Gene EcSMS35_1190 details


Gene Information

Locus tag: EcSMS35_1190
Symbol:
ID: 6143991
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 1195927
End bp: 1197132
Gene Length: 1206 bp
Protein Length: 401 aa
Translation table: 11
GC content: 54%
IMG OID: 641616068
Product: HK97 family phage major capsid protein
Protein accession: YP_001743251
Protein GI: 170681147
COG category: [R] General function prediction only
COG ID: [COG4653] Predicted phage phi-C31 gp36 major capsid-like protein
TIGRFAM ID: [TIGR01554] phage major capsid protein, HK97 family
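
The start, end, gene length, and protein length fields in this record are mutually consistent, and the arithmetic can be checked with a minimal sketch. It assumes the coordinates are 1-based and inclusive at both ends (an assumption, though it matches the reported length) and that the coding sequence includes its terminal stop codon. The variable names are illustrative only.

```python
# Values copied from the Gene Information record.
start_bp = 1195927
end_bp = 1197132

# Assumption: coordinates are 1-based and inclusive on both ends.
gene_length = end_bp - start_bp + 1

# A CDS is read in triplets; the final stop codon adds no amino acid.
codon_count, remainder = divmod(gene_length, 3)
protein_length = codon_count - 1

print(gene_length, remainder, protein_length)  # prints: 1206 0 401
```

A remainder of 0 indicates the gene length is a whole number of codons, as expected for an unspliced CDS.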


Plasmid Coverage information

Num covering plasmid clones: 23
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 16
Fosmid unclonability p-value: 0.00000012399
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
ATGGCGGTTG ATATTAAAGA TGTCGAACAG GTCGCGCAGG AGCTGCAGCA GAAGTTTGAC 
GACTTCAAAG CAAAGAACGA CAAGCGCGTG GATGCGATTG AGCAGGAAAA AGGCAAACTT
GCCGGGCAGG TGGAAACCCT GAACGGGAAA CTCAGCGAGC TGGAAAACCT CAAAAGCGAT
CTTGAAAAAG AGCTGCTTGA GCTGAAACGT CCGGCAGGTG GTGCGCAAAA TAAACTGGCC
ACCGAGCATA AAGAAGCGTT TGTGGGCTTC CTGCGTAAAG GCCGTGAAGA TGGTCTGCGC
GATCTGGAGC GCAAGGCATT ACAGGTGGGC ACCGATGAAG ACGGCGGCTA TGCCGTGCCG
GAAGCACTGG ATCGCAACAT TCTCACCCTG CTGAAAGATG AAGTGGTGAT GCGCCAGGAA
GCCACGGTGA TCAGCGTTGG TGGTTCCGAC TACAAAAAAC TGGTGAATCT GGGCGGCACG
GCTTCCGGAT GGGTTGGCGA GACTGACGCG CGCTCCCAGA CTGCCACCTC AAAACTGGGC
CTGATTGAAC CTTTCATGGG GGAAATCTAC GGTAACCCGC AGGCCACCCA GAAAATGCTG
GATGATGCCT TTTTCAACGT GGAAGCATGG ATCAACAGCG AGCTGGCAAC CGAATTTGCC
GAACAGGAAG AAATTGCCTT TACCACCGGC GATGGTACCA AGAAGCCGAA AGGGTTCCTG
GCGTATGAGT CCACGGATGA AACAGACAAG GTCCGGGCGT TCGGCAAACT TCAGCATATT
GTATCCGGCG AAGCGACGGC GGTGACCGCA GATGCCATTA TCAAACTGAT TTACACGCTG
CGTAAGGCAC ACCGCACTGG CGCGAAGTTC ATGATGAACA ACAACAGTCT GTTTGCCATC
CGTCTGCTGA AAGACAGTGA GGGTAACTAT CTGTGGCGTC CGGGGCTGGA GCTGGGGCAG
CCGTCCTCTC TGGCGGGTTA CGCTATCGCT GAAAACGAAC AGATGCCGGA TATTGCCGCT
GATGCGAAAG CCATTGCATT TGGTAACTTC AAACGGGGTT ACACCATCGT TGACCGTATC
GGTACCCGCA TTCTGCGTGA CCCGTACACC AATAAACCGT TTGTCGGTTT TTATACCACC
AAGCGCACCG GCGGCATGCT GGTCGATTCG CAGGCCATCA AACTGCTGAA GATTGCAGCG
GCGTAA
 
Protein sequence
MAVDIKDVEQ VAQELQQKFD DFKAKNDKRV DAIEQEKGKL AGQVETLNGK LSELENLKSD 
LEKELLELKR PAGGAQNKLA TEHKEAFVGF LRKGREDGLR DLERKALQVG TDEDGGYAVP
EALDRNILTL LKDEVVMRQE ATVISVGGSD YKKLVNLGGT ASGWVGETDA RSQTATSKLG
LIEPFMGEIY GNPQATQKML DDAFFNVEAW INSELATEFA EQEEIAFTTG DGTKKPKGFL
AYESTDETDK VRAFGKLQHI VSGEATAVTA DAIIKLIYTL RKAHRTGAKF MMNNNSLFAI
RLLKDSEGNY LWRPGLELGQ PSSLAGYAIA ENEQMPDIAA DAKAIAFGNF KRGYTIVDRI
GTRILRDPYT NKPFVGFYTT KRTGGMLVDS QAIKLLKIAA A
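
The gene and protein sequences above are printed in blocks of 10 characters. The following is a minimal sketch of how such text might be cleaned, translated, and checked for GC fraction in plain Python. The helper names (`clean`, `gc_fraction`, `translate`) are hypothetical and not part of any database tool. The codon table is the standard one, whose codon-to-amino-acid assignments translation table 11 shares; the sketch does not handle the alternative start codons that table 11 permits, which does not matter here because this gene begins with ATG. Only the first line of the gene sequence is used as sample input.

```python
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G; third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq_text):
    """Remove the spaces and line breaks used to print the sequence in blocks."""
    return "".join(seq_text.split()).upper()


def gc_fraction(dna):
    """Return the fraction of bases that are G or C."""
    return (dna.count("G") + dna.count("C")) / len(dna)


def translate(dna):
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First line of the gene sequence, copied from the record (60 nt).
first_line = "ATGGCGGTTG ATATTAAAGA TGTCGAACAG GTCGCGCAGG AGCTGCAGCA GAAGTTTGAC"
dna = clean(first_line)

print(len(dna))        # prints: 60
print(translate(dna))  # prints: MAVDIKDVEQVAQELQQKFD
```

The translated fragment matches the first 20 residues of the protein sequence listed above. Passing the full cleaned gene sequence to `gc_fraction` would give a value to compare against the GC content field.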