Gene ECH74115_2040 details


Gene Information

Locus tag: ECH74115_2040
Symbol:
ID: 6968741
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli O157:H7 str. EC4115
Kingdom: Bacteria
Replicon accession: NC_011353
Strand:
Start bp: 1937093
End bp: 1939096
Gene Length: 2004 bp
Protein Length: 667 aa
Translation table: 11
GC content: 51%
IMG OID: 643385952
Product: peptidase, U32 family
Protein accession: YP_002270441
Protein GI: 209395986
COG category: [O] Posttranslational modification, protein turnover, chaperones
COG ID: [COG0826] Collagenase and related proteases
TIGRFAM ID:
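
The coordinate and length fields above can be cross-checked against each other. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the gene length includes the stop codon; the function names are illustrative and not part of any database API.

```python
# Values copied from the Gene Information table.
start_bp = 1937093
end_bp = 1939096


def span_length(start: int, end: int) -> int:
    """Length of a feature given 1-based inclusive coordinates (assumed)."""
    return end - start + 1


def coding_codons(length_bp: int) -> int:
    """Number of amino-acid codons in a CDS, excluding the final stop codon."""
    if length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return length_bp // 3 - 1


gene_length = span_length(start_bp, end_bp)
protein_length = coding_codons(gene_length)
print(gene_length, protein_length)  # prints: 2004 667
```

Under those assumptions the computed values agree with the listed Gene Length (2004 bp) and Protein Length (667 aa).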


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.0020805
Plasmid hitchhiking: Yes
Plasmid clonability: hitchhiker
 

Fosmid Coverage information

Num covering fosmid clones:
Fosmid unclonability p-value: 3.90569e-18
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
GTGGCTAAAA TAGCCGCCAT TTTTCAGCTA CTGGATAAGA ATGTGACCGT ATCTTCTCAT 
CGACTTGAAC TGTTAAGCCC GGCACGCGAT GCCGCCATTG CCCGCGAAGC TATTTTGCAC
GGTGCCGATG CTGTTTATAT CGGCGGCCCT GGTTTTGGTG CCCGTCATAA TGCCAGTAAT
AGCCTGAAAG ATATTGCCGA GCTGGTGCCG TTTGCCCATC GTTATGGTGC AAAAATTTTC
GTCACGCTTA ACACCATTTT GCATGATGAT GAGCTGGAAC CCGCGCAACG GCTGATTACT
GACCTCTACC AGACCGGTGT CGATGCGCTG ATTGTTCAGG ATATGGGGAT TCTGGAACTT
GATATTCCGC CGATTGAACT GCACGCCAGT ACGCAGTGCG ACATTCGTAC AGTAGAAAAA
GCGAAGTTCC TCTCTGATGT TGGCTTCACG CAGATTGTGC TGGCGCGAGA GCTGAATCTT
GATCAGATCC GCGCGATTCA CCAGGCCACG GACGCGACCA TTGAATTCTT TATTCATGGC
GCACTGTGCG TGGCCTATTC GGGTCAGTGC TACATTTCTC ATGCGCAAAC AGGGCGTAGC
GCCAACCGTG GCGATTGCTC GCAAGCGTGC CGTTTGCCAT ACACATTGAA AGACGATCAG
GGGCGGGTGG TTTCCTATGA AAAACATCTG CTGTCGATGA AAGATAACGA TCAGACTGCC
AACCTCGGCG CGCTGATTGA TGCTGGCGTA CGTTCCTTCA AGATTGAAGG GCGTTACAAA
GATATGAGCT ACGTGAAGAA TATCACCGCC CATTATCGCC AGATGCTTGA TGCCATTATT
GAAGAACGTG GCGATCTGGC GCGCGCTTCA TCAGGTCGTA CTGAACATTT CTTTGTTCCA
TCGACGGAAA AGACTTTCCA CCGTGGTAGC ACAGATTATT TTGTGAATGC CCGTAAAGGC
GATATTGGCG CGTTCGATTC GCCGAAATTT ATCGGCCTGC CGGTAGGTGA AGTAGTGAAA
GTGGCGAAAG ATCATCTCGA TGTTGCCGTT ACCGAGCCAC TGGCAAATGG CGATGGCCTG
AACGTGTTGA TTAAACGTGA AGTCGTCGGT TTTCGTGCCA ATACGGTCGA GAAAACCGGA
GAAAATCAGT ACCGCGTCTG GCCCAATGAA ATGCCAGCAG ATTTGCACAA AATTCGCCCG
CATCACCCAC TAAACCGTAA TCTTGATCAT AACTGGCAGC AGGCACTGAC AAAAACCTCT
AGCGAACGTC GGGTGGCGGT AGACATTGAA CTGGGCGGTT GGCAGGAACA ATTGATTCTG
ACTCTCACCA GTGAAGAGGG GGTCAGCATA ACGCATACGC TGGATGGGCA GTTCGACGAA
GCCAATAACG CAGAAAAAGC AATGAACAAT CTGAAGGATG GTCTGGCAAA ACTGGGGCAA
ACCCTCTATT ACGCCCGCGA TGTGCAAATT AATTTGCCGG GGGCGCTGTT TGTACCAAAC
AGTCTGTTAA ACCAGTTCCG CCGTGAAGCT GCCGACATGC TGGATGCTGC GCGTCTTGCC
AGTTACCAGC GCGGCAGCCG TAAACCGGTT GCTGATCCTG CGCCGGTTTA TCCGCAAACA
CATCTGAGTT TCCTCGCGAA CGTTTACAAC CAGAAAGCGC GTGAATTTTA TCATCGCTAT
GGTGTGCAGC TGATTGACGC GGCGTATGAG GCACATGAAG AGAAGGGCGA AGTCCCGGTG
ATGATCACCA AGCATTGTCT GCGCTTTGCC TTTAATCTGT GCCCGAAACA GGCGAAAGGC
AATATCAAAA GCTGGAAGGC GACGCCAATG CAACTGGTTA ACGGCGATGA AGTATTAACG
CTAAAGTTTG ATTGCCGCCC ATGCGAGATG CACGTCATTG GCAAAATCAA AAATCACATA
CTGAAAATGC CGTTACCGGG AAGCGTAGTG GCATCCGTAA GTCCGGATGA GCTGCTGAAA
ACATTGCCGA AGCGAAAAGG GTAA
 
Protein sequence
MAKIAAIFQL LDKNVTVSSH RLELLSPARD AAIAREAILH GADAVYIGGP GFGARHNASN 
SLKDIAELVP FAHRYGAKIF VTLNTILHDD ELEPAQRLIT DLYQTGVDAL IVQDMGILEL
DIPPIELHAS TQCDIRTVEK AKFLSDVGFT QIVLARELNL DQIRAIHQAT DATIEFFIHG
ALCVAYSGQC YISHAQTGRS ANRGDCSQAC RLPYTLKDDQ GRVVSYEKHL LSMKDNDQTA
NLGALIDAGV RSFKIEGRYK DMSYVKNITA HYRQMLDAII EERGDLARAS SGRTEHFFVP
STEKTFHRGS TDYFVNARKG DIGAFDSPKF IGLPVGEVVK VAKDHLDVAV TEPLANGDGL
NVLIKREVVG FRANTVEKTG ENQYRVWPNE MPADLHKIRP HHPLNRNLDH NWQQALTKTS
SERRVAVDIE LGGWQEQLIL TLTSEEGVSI THTLDGQFDE ANNAEKAMNN LKDGLAKLGQ
TLYYARDVQI NLPGALFVPN SLLNQFRREA ADMLDAARLA SYQRGSRKPV ADPAPVYPQT
HLSFLANVYN QKAREFYHRY GVQLIDAAYE AHEEKGEVPV MITKHCLRFA FNLCPKQAKG
NIKSWKATPM QLVNGDEVLT LKFDCRPCEM HVIGKIKNHI LKMPLPGSVV ASVSPDELLK
TLPKRKG
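
The sequences above are printed in blocks of ten residues, so the spacing has to be stripped before they can be analysed. The sketch below is one hedged way to do that in plain Python; `join_blocks` and `gc_fraction` are hypothetical helper names, and only the first and last lines of the gene listing are used as sample input. The gene begins with GTG, which translation table 11 accepts as an initiation codon read as methionine, consistent with the protein starting with M, and it ends with the stop codon TAA.

```python
def join_blocks(text: str) -> str:
    """Remove the spaces and line breaks used to group residues in tens."""
    return "".join(text.split()).upper()


def gc_fraction(dna: str) -> float:
    """Fraction of G and C bases in a DNA string."""
    if not dna:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in dna) / len(dna)


# First and last lines of the gene sequence listing (not the full gene).
first_line = "GTGGCTAAAA TAGCCGCCAT TTTTCAGCTA CTGGATAAGA ATGTGACCGT ATCTTCTCAT"
last_line = "ACATTGCCGA AGCGAAAAGG GTAA"

head = join_blocks(first_line)
tail = join_blocks(last_line)
print(len(head), head[:3], tail[-3:])  # prints: 60 GTG TAA
```

Passing the complete gene listing through `join_blocks` and then `gc_fraction` would give a value to compare with the GC content reported in the Gene Information table.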