Gene EcSMS35_3370 details


Gene Information

Locus tag: EcSMS35_3370
Symbol: ebgA
ID: 6146970
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 3448624
End bp: 3451716
Gene Length: 3093 bp
Protein Length: 1030 aa
Translation table: 11
GC content: 54%
IMG OID: 641618199
Product: cryptic beta-D-galactosidase subunit alpha
Protein accession: YP_001745348
Protein GI: 170680616
COG category: [G] Carbohydrate transport and metabolism
COG ID: [COG3250] Beta-galactosidase/beta-glucuronidase
TIGRFAM ID:
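
The Start bp, End bp, Gene Length and Protein Length fields above can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive and that the gene length includes the stop codon; the record does not state either convention, and the helper names span_length and protein_length are illustrative only.

```python
# Values copied from the Gene Information fields above.
start_bp = 3448624
end_bp = 3451716
reported_gene_length = 3093     # bp
reported_protein_length = 1030  # aa


def span_length(start, end):
    """Length of a coordinate span, assuming 1-based inclusive coordinates."""
    return end - start + 1


def protein_length(gene_length_bp):
    """Residue count of an unspliced CDS, assuming the stop codon is included in the gene length."""
    if gene_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length_bp // 3 - 1


gene_length = span_length(start_bp, end_bp)
print(gene_length == reported_gene_length)
print(protein_length(gene_length) == reported_protein_length)
```

Under these assumptions both comparisons hold: the span covers 3093 bp, which is 1031 codons, or 1030 residues once the stop codon is set aside.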


Plasmid Coverage information

Num covering plasmid clones: 22
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 59
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAATCGCT GGGAAAACAT TCAGCTCACC CACGAAAACC GACTTGCGCC GCGTGCGTAC 
TTTTTTTCAT ATGATTCTGT TGCGCAGGCG CGTACTTTTG CCCGTGAAAC CAGCAGCCTG
TTTCTGTCCT TAAGCGGTCA GTGGAATTTC CACTTTTTTG ACCATCCGCT GCAAGTACCA
GAAGCCTTCA CCTCTGAGTT AATGGCTGAC TGGGGGCATA TTACCGTCCC CGCCATGTGG
CAAATGGAAG GTCACGGCAA ACTGCAATAT ACCGACGAAG GTTTTCCATT CCCCATCGAT
GTGCCGTTTG TTCCCAGCGA TAACCCAACC GGTGCCTATC AACGTATTTT CACCCTCAGC
GAAGGCTGGC AGGGTAAACA GACGCTGATC AAATTTGACG GTGTCGAAAC CTATTTTGAA
GTCTACGTTA ACGGTCAGTA TGTGGGGTTC AGCAAGGGCA GTCGCCTGAC CGCAGAGTTT
GACATCAGCG CGATGGTTAA AACCGGCGAC AACCTGTTGT GTGTGCGCGT GATGCAATGG
GCGGACTCTA CCTACGTGGA AGACCAGGAT ATGTGGTGGT CGGCGGGGAT TTTCCGCGAT
GTTTATCTGA TCGGAAAACA ACTCACGCAT ATTAACGATT TCACCGTTCG TACCGACTTT
GACGAAGCCT ATTGCGATGC CACGCTTTCC TGCGAAGTGG TGCTGGAAAA TCTCGCCGCC
TCCCCTGTCG TCACGACGCT GGAATATACC CTGTTTGATG GGGAACGCCT GGTGCACAGC
AGCGCCATTG ATCATTTGGC AATTGAAAAA CTGACCAGCG CCAGCTTTGC TTTTACTGTC
GAACAGCCCC AGCAATGGTC AGCAGAATCC CCTTATCTTT ACCATCTGGT CATGACGCTG
AAAGACGCCG ACGGCAACGT TCTGGAAGTG GTGCCACAAC GCGTTGGCTT CCGTGATATC
AAAGTGCGCG ACGGTCTGTT CTGGATCAAT AACCGTTATG TGATGCTGCA TGGCGTCAAC
CGCCACGACA ATGATCATCG CAAAGGCCGC GCCGTTGGGA TGGATCGCGT CGAGAAAGAT
CTCCAGTTGA TGAAGCAGCA CAACATCAAT TCCGTGCGTA CCGCTCACTA CCCGAACGAT
CCTCGTTTTT ACGAACTGTG TGATATCTAC GGCCTGTTTG TGATGGCGGA AACCGACGTC
GAATCGCACG GCTTTGCTAA CGTCGGCGAT ATCAGCCGCA TTACCGACGA TCCGCAGTGG
GAAAACGTCT ACGTCGAGCG CATTGTTCGC CATATTCACG CGCAGAAAAA CCATCCGTCG
ATCATCATCT GGTCGCTGGG CAATGAATCC GGCTATGGCT GTAACATCCG CGCGATGTAC
CATGCGGCGA AGGCGCTGGA TGACACGCGA CTGGTGCATT ACGAAGAAGA TCGCGATGCT
GAAGTGGTCG ATATTATTTC CACCATGTAC ACCCGCGTGC CGCTGATGAA TGAGTTTGGT
GAATACCCGC ATCCGAAGCC GCGCATCATC TGTGAATATG CTCATGCGAT GGGGAACGGA
CCAGGTGGGC TGACGGAGTA CCAGAACGTC TTCTACAAGC ACGATTGCAT TCAGGGACAT
TATGTCTGGG AATGGTGCGA CCACGGGATC CAGGCGCAGG ATGACAATGG CAACGTCTGG
TATAAATTCG GCGGCGACTA CGGCGACTAT CCCAACAACT ATAACTTCTG TCTTGATGGT
TTGATCTATT CCGATCAGAC GCCGGGACCA GGCCTGAAAG AGTACAAACA GGTTATTGCG
CCGGTAAAAA TCCACGCGCT GGATCTGACT CGCGGCGAGT TGAAAGTCGA AAATAAACTA
TGGTTTACCA CGCTTGATGA CTACACCCTG CACGCAGAGG TGCGCGCCGA AGGTGAAACA
CTCGCGACGC AGCAGATTAA ACTGCGCGAC GTTGCGCCGA ACAGCGAAGC CCCCTTGCAG
ATCACGCTGC CGCAGCTGGA CGCCCGCGAA GCGTTCCTCA ACATTACGGT GACCAAAGAT
TCCCGCACCC GCTACAGCGA AGCCGGGCAT TCTATCGCCA CTTATCAGTT CCCGCTGAAG
GAAAACACCG CGCAGCCAGT GCCTTTCGCA CCAAATAATG CGCGTCCGCT GACGCTGGAA
GACGATCGTT TGAGCTGCAC CGTTCGCGGC TACAACTTCG CGATCACCTT CTCAAAAATG
AGTGGCAAAC CGACATCCTG GCAGGTGAAT GGCGAGTCGC TGCTGACCCG CGAGCCAAAG
ATCAACTTCT TCAAGCCAAT GATCGACAAC CACAAGCAGG AGTACGAAGG ACTGTGGCAA
CCGAATCATT TGCAGATCAT GCAGGAACAT CTGCGCGACT TTGCCGTTGA GCAAAGTGAT
GATGAAGTGT TGATCATCAG CCGCACGGTT ATTGCCCCAC CGGTGTTTGA CTTCGGGATG
CGCTGTACCT ACATCTGGCG CATCGCTGCC GATGGTCAGG TTAATGTGGC GCTTTCCGGC
GAACGTTACG GCGACTATCC GCACATCATT CCGTGCATCG GTTTCACCAT GGGGATTAAC
GGCGAATACG ACCAGGTAGC ATATTACGGA CGTGGACCGG GCGAAAACTA CGCCGACAGC
CAGCAGGCTA ACATCATCGA TATCTGGCGC AGCACCGTCG ATGCCATGTT CGAGAACTAT
CCCTTCCCGC AGAACAACGG CAACCGTCAG CATGTCCGCT GGACGGCACT GACTAACCGC
CACGGCAACG GTCTGCTGGT GGTTCCGCAG CGCCCCATTA ACTTCAGCGC CTGGCGCTAT
ACCCAGGAAA ACATCCACGC TGCCCAGCAC TGTAACGAGC TTCAGCGCAG TGATGACATC
ACTCTGAATC TCGACCACCA GCTGCTTGGC CTCGGCTCCA ACTCCTGGGG CAGCGAGGTG
CTGGACTCCT GGCGCGTCTG GTTCCGTGAC TTCAGCTACG GCTTTACGTT GCTGCCGGTT
TCTGGCGGAG AAGCTACCGC GCAAAGCCTG GCGTCGTATG AGTTCGGCGC AGGGTTCTTT
TCCACGAATT TGCACAGCGA GAATAAGCAA TGA
 
Protein sequence
MNRWENIQLT HENRLAPRAY FFSYDSVAQA RTFARETSSL FLSLSGQWNF HFFDHPLQVP 
EAFTSELMAD WGHITVPAMW QMEGHGKLQY TDEGFPFPID VPFVPSDNPT GAYQRIFTLS
EGWQGKQTLI KFDGVETYFE VYVNGQYVGF SKGSRLTAEF DISAMVKTGD NLLCVRVMQW
ADSTYVEDQD MWWSAGIFRD VYLIGKQLTH INDFTVRTDF DEAYCDATLS CEVVLENLAA
SPVVTTLEYT LFDGERLVHS SAIDHLAIEK LTSASFAFTV EQPQQWSAES PYLYHLVMTL
KDADGNVLEV VPQRVGFRDI KVRDGLFWIN NRYVMLHGVN RHDNDHRKGR AVGMDRVEKD
LQLMKQHNIN SVRTAHYPND PRFYELCDIY GLFVMAETDV ESHGFANVGD ISRITDDPQW
ENVYVERIVR HIHAQKNHPS IIIWSLGNES GYGCNIRAMY HAAKALDDTR LVHYEEDRDA
EVVDIISTMY TRVPLMNEFG EYPHPKPRII CEYAHAMGNG PGGLTEYQNV FYKHDCIQGH
YVWEWCDHGI QAQDDNGNVW YKFGGDYGDY PNNYNFCLDG LIYSDQTPGP GLKEYKQVIA
PVKIHALDLT RGELKVENKL WFTTLDDYTL HAEVRAEGET LATQQIKLRD VAPNSEAPLQ
ITLPQLDARE AFLNITVTKD SRTRYSEAGH SIATYQFPLK ENTAQPVPFA PNNARPLTLE
DDRLSCTVRG YNFAITFSKM SGKPTSWQVN GESLLTREPK INFFKPMIDN HKQEYEGLWQ
PNHLQIMQEH LRDFAVEQSD DEVLIISRTV IAPPVFDFGM RCTYIWRIAA DGQVNVALSG
ERYGDYPHII PCIGFTMGIN GEYDQVAYYG RGPGENYADS QQANIIDIWR STVDAMFENY
PFPQNNGNRQ HVRWTALTNR HGNGLLVVPQ RPINFSAWRY TQENIHAAQH CNELQRSDDI
TLNLDHQLLG LGSNSWGSEV LDSWRVWFRD FSYGFTLLPV SGGEATAQSL ASYEFGAGFF
STNLHSENKQ
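
The gene and protein sequences above are printed in space-separated blocks of ten characters. As a minimal sketch of how the two listings relate, the code below joins the blocks and translates the DNA codon by codon. It uses the standard codon assignments, which translation table 11 shares with table 1; table 11 differs in which codons may act as start codons, and that does not matter here because this gene begins with ATG. The names join_blocks and translate are illustrative and not part of any database tooling.

```python
# Amino acids in NCBI order for codons built from bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def join_blocks(text):
    """Remove the spaces and line breaks used to lay out the sequence."""
    return "".join(text.split()).upper()


def translate(dna):
    """Translate a coding sequence from its first base, stopping at the first stop codon."""
    residues = []
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    for pos in range(0, usable, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# First line of the gene sequence as printed above.
first_line = "ATGAATCGCT GGGAAAACAT TCAGCTCACC CACGAAAACC GACTTGCGCC GCGTGCGTAC"
print(translate(join_blocks(first_line)))  # MNRWENIQLTHENRLAPRAY
```

The printed residues match the first two blocks of the protein sequence (MNRWENIQLT HENRLAPRAY). Passing the full gene sequence through the same two functions should give the complete protein listing for comparison.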