Gene Information |
Locus tag | EcSMS35_1190 |
Symbol | |
ID | 6143991 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Escherichia coli SMS-3-5 |
Kingdom | Bacteria |
Replicon accession | NC_010498 |
Strand | + |
Start bp | 1195927 |
End bp | 1197132 |
Gene Length | 1206 bp |
Protein Length | 401 aa |
Translation table | 11 |
GC content | 54% |
IMG OID | 641616068 |
Product | HK97 family phage major capsid protein |
Protein accession | YP_001743251 |
Protein GI | 170681147 |
COG category | [R] General function prediction only |
COG ID | [COG4653] Predicted phage phi-C31 gp36 major capsid-like protein |
TIGRFAM ID | [TIGR01554] phage major capsid protein, HK97 family |
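The coordinate, length, and GC fields above are related by simple arithmetic. The sketch below shows one way to cross-check them. It assumes 1-based inclusive coordinates and a CDS that ends in a single stop codon; the function names are hypothetical and are not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Feature length in bp, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for a CDS, assuming the final codon is a stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases; spaces between sequence blocks are ignored."""
    bases = "".join(seq.split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc_count = sum(base in "GC" for base in bases)
    return 100.0 * gc_count / len(bases)


# Values taken from the record above.
print(gene_length(1195927, 1197132))  # 1206, matches "Gene Length"
print(protein_length(1206))           # 401, matches "Protein Length"
```

Passing the "Gene sequence" field to gc_percent should give a value comparable to the reported GC content, although the rounding used by the database is not stated in the record.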
|
|
Plasmid Coverage information |
Num covering plasmid clones | 23 |
Plasmid unclonability p-value | 1 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | 16 |
Fosmid unclonability p-value | 0.00000012399 |
Fosmid Hitchhiker | Yes |
Fosmid clonability | hitchhiker |
| |
Sequence |
Gene sequence | ATGGCGGTTG ATATTAAAGA TGTCGAACAG GTCGCGCAGG AGCTGCAGCA GAAGTTTGAC GACTTCAAAG CAAAGAACGA CAAGCGCGTG GATGCGATTG AGCAGGAAAA AGGCAAACTT GCCGGGCAGG TGGAAACCCT GAACGGGAAA CTCAGCGAGC TGGAAAACCT CAAAAGCGAT CTTGAAAAAG AGCTGCTTGA GCTGAAACGT CCGGCAGGTG GTGCGCAAAA TAAACTGGCC ACCGAGCATA AAGAAGCGTT TGTGGGCTTC CTGCGTAAAG GCCGTGAAGA TGGTCTGCGC GATCTGGAGC GCAAGGCATT ACAGGTGGGC ACCGATGAAG ACGGCGGCTA TGCCGTGCCG GAAGCACTGG ATCGCAACAT TCTCACCCTG CTGAAAGATG AAGTGGTGAT GCGCCAGGAA GCCACGGTGA TCAGCGTTGG TGGTTCCGAC TACAAAAAAC TGGTGAATCT GGGCGGCACG GCTTCCGGAT GGGTTGGCGA GACTGACGCG CGCTCCCAGA CTGCCACCTC AAAACTGGGC CTGATTGAAC CTTTCATGGG GGAAATCTAC GGTAACCCGC AGGCCACCCA GAAAATGCTG GATGATGCCT TTTTCAACGT GGAAGCATGG ATCAACAGCG AGCTGGCAAC CGAATTTGCC GAACAGGAAG AAATTGCCTT TACCACCGGC GATGGTACCA AGAAGCCGAA AGGGTTCCTG GCGTATGAGT CCACGGATGA AACAGACAAG GTCCGGGCGT TCGGCAAACT TCAGCATATT GTATCCGGCG AAGCGACGGC GGTGACCGCA GATGCCATTA TCAAACTGAT TTACACGCTG CGTAAGGCAC ACCGCACTGG CGCGAAGTTC ATGATGAACA ACAACAGTCT GTTTGCCATC CGTCTGCTGA AAGACAGTGA GGGTAACTAT CTGTGGCGTC CGGGGCTGGA GCTGGGGCAG CCGTCCTCTC TGGCGGGTTA CGCTATCGCT GAAAACGAAC AGATGCCGGA TATTGCCGCT GATGCGAAAG CCATTGCATT TGGTAACTTC AAACGGGGTT ACACCATCGT TGACCGTATC GGTACCCGCA TTCTGCGTGA CCCGTACACC AATAAACCGT TTGTCGGTTT TTATACCACC AAGCGCACCG GCGGCATGCT GGTCGATTCG CAGGCCATCA AACTGCTGAA GATTGCAGCG GCGTAA
|
Protein sequence | MAVDIKDVEQ VAQELQQKFD DFKAKNDKRV DAIEQEKGKL AGQVETLNGK LSELENLKSD LEKELLELKR PAGGAQNKLA TEHKEAFVGF LRKGREDGLR DLERKALQVG TDEDGGYAVP EALDRNILTL LKDEVVMRQE ATVISVGGSD YKKLVNLGGT ASGWVGETDA RSQTATSKLG LIEPFMGEIY GNPQATQKML DDAFFNVEAW INSELATEFA EQEEIAFTTG DGTKKPKGFL AYESTDETDK VRAFGKLQHI VSGEATAVTA DAIIKLIYTL RKAHRTGAKF MMNNNSLFAI RLLKDSEGNY LWRPGLELGQ PSSLAGYAIA ENEQMPDIAA DAKAIAFGNF KRGYTIVDRI GTRILRDPYT NKPFVGFYTT KRTGGMLVDS QAIKLLKIAA A