Gene EcSMS35_3016 details


Gene Information

Locus tag: EcSMS35_3016
Symbol: guaD
ID: 6145996
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 3104570
End bp: 3105886
Gene Length: 1317 bp
Protein Length: 438 aa
Translation table: 11
GC content: 49%
IMG OID: 641617885
Product: guanine deaminase
Protein accession: YP_001745036
Protein GI: 170682427
COG category: [F] Nucleotide transport and metabolism; [R] General function prediction only
COG ID: [COG0402] Cytosine deaminase and related metal-dependent hydrolases
TIGRFAM ID: [TIGR02967] guanine deaminase
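
The coordinates and lengths in this record can be cross-checked with simple arithmetic. The sketch below assumes that Start bp and End bp are 1-based inclusive positions and that the gene length includes the stop codon; the variable names are illustrative and not part of the record.

```python
# Values copied from the record above.
start_bp = 3104570
end_bp = 3105886

# Assumption: coordinates are inclusive, so add 1 to the difference.
gene_length = end_bp - start_bp + 1

# Assumption: the final codon is a stop codon and is not counted as a residue.
codon_count = gene_length // 3
protein_length = codon_count - 1

print(gene_length)     # 1317
print(protein_length)  # 438
```

Both results agree with the Gene Length and Protein Length fields.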


Plasmid Coverage information

Num covering plasmid clones: 29
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 64
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGTCAGGAG AACACACGTT AAAAGCGGTA CGAGGCAGTT TTATTGATGT CACCCGTACA 
GTCGATAACC CAGAGGAGAT TGCCTCTGCG CTGCGGTTTA TTGAGGATGG TTTATTACTC
ATTAAACAGG GAAAAGTGGA ATGGTTTGGC GAATGGGAAG ACGGAAAGCA TCAAATTCCT
GACACTATTC GCGTGCGCGA CTATCGCGGC AAACTGATAG TACCGGGCTT TGTCGATACA
CATATCCATT ATCCGCAAAG TGAAATGGTG GGGGCCTATG GTGAACAATT GCTGGAGTGG
TTGAATAAAC ACACCTTCCC TACTGAACGT CGTTATGAGG ATTTAGAGTA CGCCCGCGAA
ATGTCGGCGT TCTTCATCAA GCAGCTTTTA CGTAACGGAA CCACCACGGC TCTGGTGTTT
GGCACTGTTC ATCCGCAATC CGTTGATGCG CTGTTTGAAG CCGCCAGCCA TATCAATATG
CGTATGATTG CCGGTAAGGT GATGATGGAC CGCAACGCAC CGGATTATCT GCTCGACACT
GCCGAAAGCA GCTATCACCA AAGCAAAGAG CTGATCGAAC GCTGGCACAA AAATGGTCGT
CTGTTGTATG CGATTACGCC ACGCTTCGCC CCTACCTCAT CTCCTGAACA GATGGCGATG
GCGAAACGCC TGAAAGAAGA ATATCCGGAT ACGTGGGTAC ATACCCATCT CTGTGAAAAC
AAAGATGAAA TTGCCTGGGT GAAAGAACTT TATCCTGACC ATGATGGCTA TCTGGATGTT
TACCATCAGT ACGGCCTGAC CGGTAAAAAC TGTGTCTTTG CTCACTGCGT CCATCTCGAA
GAAAAAGAGT GGGATCGTCT CAGCGAAACC AAATCCAGCA TTGCTTTCTG TCCGACCTCC
AACCTTTACC TCGGCAGCGG CTTATTCAAC TTGAAAAAAG CATGGCAGAA GAAAGTTAAA
GTGGGCATGG GAACGGATAT CGGTGCCGGA ACAACTTTCA ACATGCTGCA AACGCTGAAC
GAAGCCTACA AAGTATTGCA ATTACAAGGC TATCGCCTCT CGGCATATGA AGCGTTTTAC
CTGGCCACGC TCGGCGGAGC GAAATCTCTG GGCCTTGACG ATTTAATTGG CAACTTTTTA
CCTGGCAAAG AGGCTGATTT CGTGGTGATG GAACCCACCG CCACTCCGCT ACAGCAGCTG
CGCTATGACA ACTCTGTTTC TTTAGTCGAC AAATTGTTCG TGATGATGAC GTTGGGCGAT
GACCGTTCGA TCTACCGCAC CTACGTTGAT GGTCGTCTGG TGTACGAACG CAACTAA
 
Protein sequence
MSGEHTLKAV RGSFIDVTRT VDNPEEIASA LRFIEDGLLL IKQGKVEWFG EWEDGKHQIP 
DTIRVRDYRG KLIVPGFVDT HIHYPQSEMV GAYGEQLLEW LNKHTFPTER RYEDLEYARE
MSAFFIKQLL RNGTTTALVF GTVHPQSVDA LFEAASHINM RMIAGKVMMD RNAPDYLLDT
AESSYHQSKE LIERWHKNGR LLYAITPRFA PTSSPEQMAM AKRLKEEYPD TWVHTHLCEN
KDEIAWVKEL YPDHDGYLDV YHQYGLTGKN CVFAHCVHLE EKEWDRLSET KSSIAFCPTS
NLYLGSGLFN LKKAWQKKVK VGMGTDIGAG TTFNMLQTLN EAYKVLQLQG YRLSAYEAFY
LATLGGAKSL GLDDLIGNFL PGKEADFVVM EPTATPLQQL RYDNSVSLVD KLFVMMTLGD
DRSIYRTYVD GRLVYERN
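
The gene and protein sequences above can be spot-checked against each other by translating the first and last blocks of the gene sequence. The following is a minimal sketch using only the Python standard library. It relies on translation table 11 assigning the same amino acids to codons as the standard genetic code (the two differ in which codons may act as starts), and the helper name `translate` is illustrative.

```python
from itertools import product

# Codon table in the conventional T, C, A, G ordering; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, ignoring whitespace."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# First 30 and last 27 nucleotides, copied from the gene sequence above.
first_block = "ATGTCAGGAG AACACACGTT AAAAGCGGTA"
last_block = "GGTCGTCTGG TGTACGAACG CAACTAA"

print(translate(first_block))  # MSGEHTLKAV
print(translate(last_block))   # GRLVYERN*
```

The first result matches the opening ten residues of the protein sequence, and the second matches its final eight residues followed by the TAA stop codon.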