Gene EcSMS35_3012 details


Gene Information

Locus tag: EcSMS35_3012
Symbol: ssnA
ID: 6145572
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 3097962
End bp: 3099290
Gene Length: 1329 bp
Protein Length: 442 aa
Translation table: 11
GC content: 53%
IMG OID: 641617881
Product: putative chlorohydrolase/aminohydrolase
Protein accession: YP_001745032
Protein GI: 170680729
COG category: [F] Nucleotide transport and metabolism; [R] General function prediction only
COG ID: [COG0402] Cytosine deaminase and related metal-dependent hydrolases
TIGRFAM ID: [TIGR03314] putative selenium metabolism protein SsnA
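
The start and end coordinates, gene length, and protein length listed above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the CDS ends with a single stop codon that is not counted in the protein length. The function names are illustrative and are not part of the source database.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature given 1-based inclusive coordinates (assumed)."""
    if end_bp < start_bp:
        raise ValueError("end_bp must not be smaller than start_bp")
    return end_bp - start_bp + 1


def expected_protein_length(gene_len_bp: int) -> int:
    """Amino acid count for a CDS whose final codon is a stop codon (assumed)."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # subtract the terminal stop codon


# Values taken from the record above.
length = gene_length(3097962, 3099290)
print(length, expected_protein_length(length))
```

Under these assumptions the coordinates give 1329 bp and 442 aa, which agrees with the Gene Length and Protein Length fields.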


Plasmid Coverage information

Num covering plasmid clones: 22
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 58
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGTTGATTC TGAAGAATGT CACCGCAGTG CAGTTACACC CGGCGAAAGT GCAGGAAGGC 
GTTGATATCG CCATCGAAAA TGATGTGATT GTCGCTATCG GCGATGCCCT GACGCAACGC
TACCCCGATG CCAGCTACAA AGAGATGCAT GGTCGGATTG TGATGCCAGG AATTGTCTGC
TCGCACAACC ATTTTTACTC GGGGCTGTCC CGCGGAATTA TGGCAAACAT CGCCCCCTGC
CCGGATTTCA TCTCAACGCT GAAAAATCTC TGGTGGCGAC TCGATCGCAC CCTTGATGAA
GAGTCGCTCT ATTACAGCGG ACTGATTTGT TCCCTGGAAG CGATTAAGGG CGGATGTACA
TCGGTTATCG ATCACCATGC CTCTCCGGCG TATATCGACG GGTCGCTCTC CACATTGCGC
AACGCATTTT TAAAAGTTGG CCTGCGCGCG ATGACCTGTT TTGAAACTAC TGACCGTAAC
AACGGTATCA AAGAGTTGCA GGAAGGTGTA GAAGAAAACA TCCGCTTCGC CCGTCAGATT
GATGAGGCGA AGAAAGCAGC AACCGAACCG TATCTGGTGG AAGCACATAT CGGCGCTCAC
GCGCCGTTTA CCGTGCCGGA TGCCGGTCTG GAGATGCTGC GTGAAGCCGT GAAAGCCACA
GGTCGTGGTT TGCACATTCA CGCTGCGGAA GACCTTTATG ACGTTTCCTA CAGTCACCAC
TGGTACGGCA AAGACCTGCT GGCACGACTG GCGCAATTCG ATCTCATCGA CAGCAAAACG
CTGGTCGCTC ATGGGCTGTA CTTGTCGAAA GATGACATCG CCCTACTCAA TCAGCGCGAT
GCGTTCCTGG TGCATAACGC CCGTTCAAAC ATGAACAACC ATGTCGGCTA CAACCATCAC
CTTAGCGACA TCCGCAATCT GGCGTTGGGA ACTGACGGCA TTGGTTCGGA CATGTTTGAA
GAGATGAAAT TTGCCTTCTT TAAACATCGC GATGCGGGTG GTCCGCTGTG GCCTGACAGT
TTTGCCAAAG CACTGGCTAA CGGCAACGAA CTGATGAGCC GCAACTTTGG CGCGAAATTT
GGCCTTCTGG AAGCCGGTTA CAAAGCCGAT TTAACCATTT GCGATTACAA CTCGCCGACA
CCGCTGCTGG CAGACAATAT CGCCGGGCAT ATCGCTTTCG GTATGGGCTC AGGCAGCGTT
CATAGCGTGA TGGTCAATGG CGTGATGGTC TATGAAGACC GTCAGTTTAA CTTCGATTGC
GATTCCATTT ATGCGCAAGC CAGAAAAGCC GCTGCCAGTA TGTGGCGTCG GATGGATGCG
CTGGCATAA
 
Protein sequence
MLILKNVTAV QLHPAKVQEG VDIAIENDVI VAIGDALTQR YPDASYKEMH GRIVMPGIVC 
SHNHFYSGLS RGIMANIAPC PDFISTLKNL WWRLDRTLDE ESLYYSGLIC SLEAIKGGCT
SVIDHHASPA YIDGSLSTLR NAFLKVGLRA MTCFETTDRN NGIKELQEGV EENIRFARQI
DEAKKAATEP YLVEAHIGAH APFTVPDAGL EMLREAVKAT GRGLHIHAAE DLYDVSYSHH
WYGKDLLARL AQFDLIDSKT LVAHGLYLSK DDIALLNQRD AFLVHNARSN MNNHVGYNHH
LSDIRNLALG TDGIGSDMFE EMKFAFFKHR DAGGPLWPDS FAKALANGNE LMSRNFGAKF
GLLEAGYKAD LTICDYNSPT PLLADNIAGH IAFGMGSGSV HSVMVNGVMV YEDRQFNFDC
DSIYAQARKA AASMWRRMDA LA
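
The sequences above are printed in space-separated blocks of ten characters, so they need to be cleaned before any computation. The following is a minimal sketch of how the GC content and the translation could be reproduced from the gene sequence. It assumes the standard codon-to-amino-acid assignments, which NCBI translation table 11 shares with table 1 (the two differ in which start codons are permitted). The helper names `clean`, `gc_content`, and `translate` are hypothetical, and the sketch does not handle alternative start codons.

```python
from itertools import product

# Standard codon assignments in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def clean(seq: str) -> str:
    """Remove block spacing and line breaks, and upper-case the sequence."""
    return "".join(seq.split()).upper()


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    s = clean(seq)
    if not s:
        raise ValueError("empty sequence")
    return (s.count("G") + s.count("C")) / len(s)


def translate(seq: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    s = clean(seq)
    protein = []
    for i in range(0, len(s) - 2, 3):
        aa = CODON_TABLE[s[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First three blocks of the gene sequence above.
print(translate("ATGTTGATTC TGAAGAATGT CACCGCAGTG"))
```

Translating the first three blocks of the gene sequence yields MLILKNVTAV, matching the start of the protein sequence. Passing the full gene sequence to `gc_content` gives a fraction that can be compared with the GC content field.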