Gene EcSMS35_4143 details


Gene Information

Locus tag: EcSMS35_4143
Symbol: gppA
ID: 6145004
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4242392
End bp: 4243876
Gene Length: 1485 bp
Protein Length: 494 aa
Translation table: 11
GC content: 53%
IMG OID: 641618966
Product: guanosine pentaphosphate phosphohydrolase
Protein accession: YP_001746098
Protein GI: 170681890
COG category: [F] Nucleotide transport and metabolism; [P] Inorganic ion transport and metabolism
COG ID: [COG0248] Exopolyphosphatase
TIGRFAM ID:
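
The start and end coordinates, gene length, and protein length in this record can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming 1-based inclusive coordinates and a single untranslated stop codon at the end of the CDS (neither assumption is stated in the record; the variable names are illustrative):

```python
# Values copied from the record.
start_bp = 4242392
end_bp = 4243876

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length = end_bp - start_bp + 1

# Assumption: the CDS ends with one stop codon that is not translated.
protein_length = gene_length // 3 - 1

print(gene_length, protein_length)
```

Under these assumptions the computed values agree with the listed gene length (1485 bp) and protein length (494 aa).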


Plasmid Coverage information

Num covering plasmid clones: 14
Plasmid unclonability p-value: 0.263273
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 53
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGGTTCCA CCTCGTCGCT GTATGCAGCC ATTGATCTCG GTTCGAATAG TTTTCATATG 
CTGGTTGTGC GCGAGGTGGC TGGAAGCATC CAGACGCTGA CGCGAATTAA ACGCAAAGTG
CGTCTGGCTG CTGGCCTGAA TAGCGAAAAT GCCCTGTCTA ATGAAGCAAT GGAGCGCGGT
TGGCAATGTC TGCGCCTGTT TGCTGAACGT CTGCAAGATA TCCCTCCCTC GCAAATTCGC
GTTGTCGCTA CGGCGACATT ACGACTAGCC GTCAATGCGG GTGATTTTAT CGCCAAAGCA
CAGGAAATCC TCGGTTGTCC GGTACAGGTG ATCAGCGGTG AAGAGGAAGC ACGTCTGATT
TATCAGGGCG TTGCTCACAC AACTGGCGGT GCCGATCAGC GCCTGGTGGT GGATATAGGC
GGTGCCAGTA CGGAACTGGT AACCGGCACG GGTGCACAAA CCACCTCGTT GTTCAGCCTG
TCGATGGGCT GCGTCACCTG GCTGGAACGC TATTTTGCCG ATCGTAATCT GGGGCAGGAA
AATTTTGATG CTGCAGAAAA AGCGGCACGC GAAGTGTTAC GTCCGGTTGC CGATGAATTA
CGGTATCACG GCTGGAAAGT GTGCGTTGGC GCTTCCGGCA CCGTGCAGGC GTTACAGGAA
ATCATGATGG CACAGGGGAT GGATGAACGC ATTACCCTGG AAAAGTTGCA GCAACTGAAA
CAACGCGCCA TTCATTGCGG TCGGCTGGAA GAGCTAGAGA TTGACGGGCT GACGCTGGAA
CGTGCGTTAG TGTTCCCGAG TGGTCTGGCG ATCCTGATAG CCATTTTTAC CGAACTGAAT
ATTCAGTGTA TGACCCTGGC GGGCGGTGCG CTGCGTGAAG GCCTGGTCTA CGGTATGTTA
CATCTTACCG TCGAGCAGGA TATTCGCAGC CGTACGCTGC GTAATATTCA GCGCCGCTTT
ATGATCGATA TTGATCAGGC ACAGCGCGTA GCCAAAGTTG CGGCTAACTT CTTCGATCAG
GTGGAAAATG AATGGCATCT TGAAGCAATA AGCCGCGATT TGCTCATCAG CGCCTGCCAG
CTTCATGAAA TCGGCCTGAG CGTTGACTTC AAACAAGCGC CGCAACACGC TGCTTATCTG
GTACGTAATC TGGATCTTCC CGGTTTTACC CCCGCACAGA AAAAACTGCT GGCGACGCTA
CTGCTCAACC AGACTAATCC GGTCGATCTC TCATCGCTGC ATCAGCAAAA TGCCGTACCA
CCGCGCGTCG CAGAACAACT CTGCCGTTTA CTCCGCCTGG CTATCATTTT TGCCAGCCGT
CGCCGTGACG ATCTCGTGCC AGAGATGACA TTACAGGCTA ACCATGAACT GTTGACCTTG
ACGCTTCCGC AAGGTTGGCT AACCCAACAT CCGCTGGGTA AAGAGATTAT TGATCAGGAA
AGCCAGTGGC AGAGCTATGT CCACTGGCCG CTGGAAGTGC ATTAA
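
The gene sequence is printed in blocks of ten bases, so whitespace has to be removed before anything is computed from it. The sketch below shows one way to clean the text and compute a GC fraction. The function names are illustrative, and only the first line of the sequence is used as input:

```python
def clean_sequence(text: str) -> str:
    """Remove all whitespace (spaces, newlines) and upper-case the sequence."""
    return "".join(text.split()).upper()


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


# First line of the gene sequence, copied as displayed (blocks of ten bases).
first_line = "ATGGGTTCCA CCTCGTCGCT GTATGCAGCC ATTGATCTCG GTTCGAATAG TTTTCATATG"
seq = clean_sequence(first_line)
print(len(seq), round(gc_fraction(seq), 3))
```

Applying the same functions to the full sequence gives a value that can be compared with the GC content listed in the record.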
 
Protein sequence
MGSTSSLYAA IDLGSNSFHM LVVREVAGSI QTLTRIKRKV RLAAGLNSEN ALSNEAMERG 
WQCLRLFAER LQDIPPSQIR VVATATLRLA VNAGDFIAKA QEILGCPVQV ISGEEEARLI
YQGVAHTTGG ADQRLVVDIG GASTELVTGT GAQTTSLFSL SMGCVTWLER YFADRNLGQE
NFDAAEKAAR EVLRPVADEL RYHGWKVCVG ASGTVQALQE IMMAQGMDER ITLEKLQQLK
QRAIHCGRLE ELEIDGLTLE RALVFPSGLA ILIAIFTELN IQCMTLAGGA LREGLVYGML
HLTVEQDIRS RTLRNIQRRF MIDIDQAQRV AKVAANFFDQ VENEWHLEAI SRDLLISACQ
LHEIGLSVDF KQAPQHAAYL VRNLDLPGFT PAQKKLLATL LLNQTNPVDL SSLHQQNAVP
PRVAEQLCRL LRLAIIFASR RRDDLVPEMT LQANHELLTL TLPQGWLTQH PLGKEIIDQE
SQWQSYVHWP LEVH
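
The protein sequence can be checked against the gene sequence by translating codon by codon. The sketch below uses only the standard library. It builds the codon table from the NCBI base ordering (T, C, A, G), assuming the amino-acid assignments of the standard code, which translation table 11 shares (table 11 differs in its permitted start codons). The helper name `translate` is illustrative, and alternative start codons are not handled:

```python
from itertools import product

# NCBI ordering: first base varies slowest, each base cycles through T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(text: str) -> str:
    """Translate a DNA string (whitespace allowed) up to the first stop codon."""
    seq = "".join(text.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon: end of the protein
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the gene sequence, copied as displayed.
gene_start = "ATGGGTTCCA CCTCGTCGCT GTATGCAGCC"
print(translate(gene_start))  # prints MGSTSSLYAA
```

The result matches the first ten residues of the protein sequence shown above.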