Gene Information |
Locus tag | EcSMS35_3370 |
Symbol | ebgA |
ID | 6146970 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Escherichia coli SMS-3-5 |
Kingdom | Bacteria |
Replicon accession | NC_010498 |
Strand | + |
Start bp | 3448624 |
End bp | 3451716 |
Gene Length | 3093 bp |
Protein Length | 1030 aa |
Translation table | 11 |
GC content | 54% |
IMG OID | 641618199 |
Product | cryptic beta-D-galactosidase subunit alpha |
Protein accession | YP_001745348 |
Protein GI | 170680616 |
COG category | [G] Carbohydrate transport and metabolism |
COG ID | [COG3250] Beta-galactosidase/beta-glucuronidase |
TIGRFAM ID | |
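The coordinate and length fields above are mutually consistent, and the arithmetic can be checked with a short sketch. It assumes that Start bp and End bp are 1-based inclusive positions and that the CDS ends in a single stop codon, which the record does not state explicitly. The function name `cds_lengths` is illustrative and not part of any database API.

```python
from typing import Tuple


def cds_lengths(start_bp: int, end_bp: int) -> Tuple[int, int]:
    """Return (gene length in bp, protein length in aa) for an unspliced CDS.

    Assumption: coordinates are 1-based and inclusive, and the last codon
    is a stop codon that is not counted in the protein length.
    """
    gene_len = end_bp - start_bp + 1
    if gene_len % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein_len = gene_len // 3 - 1  # drop the terminal stop codon
    return gene_len, protein_len


# Values copied from the Start bp and End bp fields of this record.
gene_len, protein_len = cds_lengths(3448624, 3451716)
print(gene_len, protein_len)  # 3093 1030
```

The result matches the Gene Length (3093 bp) and Protein Length (1030 aa) fields.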
Plasmid Coverage information |
Num covering plasmid clones | 22 |
Plasmid unclonability p-value | 1 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
Fosmid Coverage information |
Num covering fosmid clones | 59 |
Fosmid unclonability p-value | 1 |
Fosmid Hitchhiker | No |
Fosmid clonability | normal |
Sequence |
Gene sequence | ATGAATCGCT GGGAAAACAT TCAGCTCACC CACGAAAACC GACTTGCGCC GCGTGCGTAC TTTTTTTCAT ATGATTCTGT TGCGCAGGCG CGTACTTTTG CCCGTGAAAC CAGCAGCCTG TTTCTGTCCT TAAGCGGTCA GTGGAATTTC CACTTTTTTG ACCATCCGCT GCAAGTACCA GAAGCCTTCA CCTCTGAGTT AATGGCTGAC TGGGGGCATA TTACCGTCCC CGCCATGTGG CAAATGGAAG GTCACGGCAA ACTGCAATAT ACCGACGAAG GTTTTCCATT CCCCATCGAT GTGCCGTTTG TTCCCAGCGA TAACCCAACC GGTGCCTATC AACGTATTTT CACCCTCAGC GAAGGCTGGC AGGGTAAACA GACGCTGATC AAATTTGACG GTGTCGAAAC CTATTTTGAA GTCTACGTTA ACGGTCAGTA TGTGGGGTTC AGCAAGGGCA GTCGCCTGAC CGCAGAGTTT GACATCAGCG CGATGGTTAA AACCGGCGAC AACCTGTTGT GTGTGCGCGT GATGCAATGG GCGGACTCTA CCTACGTGGA AGACCAGGAT ATGTGGTGGT CGGCGGGGAT TTTCCGCGAT GTTTATCTGA TCGGAAAACA ACTCACGCAT ATTAACGATT TCACCGTTCG TACCGACTTT GACGAAGCCT ATTGCGATGC CACGCTTTCC TGCGAAGTGG TGCTGGAAAA TCTCGCCGCC TCCCCTGTCG TCACGACGCT GGAATATACC CTGTTTGATG GGGAACGCCT GGTGCACAGC AGCGCCATTG ATCATTTGGC AATTGAAAAA CTGACCAGCG CCAGCTTTGC TTTTACTGTC GAACAGCCCC AGCAATGGTC AGCAGAATCC CCTTATCTTT ACCATCTGGT CATGACGCTG AAAGACGCCG ACGGCAACGT TCTGGAAGTG GTGCCACAAC GCGTTGGCTT CCGTGATATC AAAGTGCGCG ACGGTCTGTT CTGGATCAAT AACCGTTATG TGATGCTGCA TGGCGTCAAC CGCCACGACA ATGATCATCG CAAAGGCCGC GCCGTTGGGA TGGATCGCGT CGAGAAAGAT CTCCAGTTGA TGAAGCAGCA CAACATCAAT TCCGTGCGTA CCGCTCACTA CCCGAACGAT CCTCGTTTTT ACGAACTGTG TGATATCTAC GGCCTGTTTG TGATGGCGGA AACCGACGTC GAATCGCACG GCTTTGCTAA CGTCGGCGAT ATCAGCCGCA TTACCGACGA TCCGCAGTGG GAAAACGTCT ACGTCGAGCG CATTGTTCGC CATATTCACG CGCAGAAAAA CCATCCGTCG ATCATCATCT GGTCGCTGGG CAATGAATCC GGCTATGGCT GTAACATCCG CGCGATGTAC CATGCGGCGA AGGCGCTGGA TGACACGCGA CTGGTGCATT ACGAAGAAGA TCGCGATGCT GAAGTGGTCG ATATTATTTC CACCATGTAC ACCCGCGTGC CGCTGATGAA TGAGTTTGGT GAATACCCGC ATCCGAAGCC GCGCATCATC TGTGAATATG CTCATGCGAT GGGGAACGGA CCAGGTGGGC TGACGGAGTA CCAGAACGTC TTCTACAAGC ACGATTGCAT TCAGGGACAT TATGTCTGGG AATGGTGCGA CCACGGGATC CAGGCGCAGG ATGACAATGG CAACGTCTGG TATAAATTCG GCGGCGACTA CGGCGACTAT CCCAACAACT ATAACTTCTG TCTTGATGGT TTGATCTATT CCGATCAGAC GCCGGGACCA GGCCTGAAAG AGTACAAACA GGTTATTGCG 
CCGGTAAAAA TCCACGCGCT GGATCTGACT CGCGGCGAGT TGAAAGTCGA AAATAAACTA TGGTTTACCA CGCTTGATGA CTACACCCTG CACGCAGAGG TGCGCGCCGA AGGTGAAACA CTCGCGACGC AGCAGATTAA ACTGCGCGAC GTTGCGCCGA ACAGCGAAGC CCCCTTGCAG ATCACGCTGC CGCAGCTGGA CGCCCGCGAA GCGTTCCTCA ACATTACGGT GACCAAAGAT TCCCGCACCC GCTACAGCGA AGCCGGGCAT TCTATCGCCA CTTATCAGTT CCCGCTGAAG GAAAACACCG CGCAGCCAGT GCCTTTCGCA CCAAATAATG CGCGTCCGCT GACGCTGGAA GACGATCGTT TGAGCTGCAC CGTTCGCGGC TACAACTTCG CGATCACCTT CTCAAAAATG AGTGGCAAAC CGACATCCTG GCAGGTGAAT GGCGAGTCGC TGCTGACCCG CGAGCCAAAG ATCAACTTCT TCAAGCCAAT GATCGACAAC CACAAGCAGG AGTACGAAGG ACTGTGGCAA CCGAATCATT TGCAGATCAT GCAGGAACAT CTGCGCGACT TTGCCGTTGA GCAAAGTGAT GATGAAGTGT TGATCATCAG CCGCACGGTT ATTGCCCCAC CGGTGTTTGA CTTCGGGATG CGCTGTACCT ACATCTGGCG CATCGCTGCC GATGGTCAGG TTAATGTGGC GCTTTCCGGC GAACGTTACG GCGACTATCC GCACATCATT CCGTGCATCG GTTTCACCAT GGGGATTAAC GGCGAATACG ACCAGGTAGC ATATTACGGA CGTGGACCGG GCGAAAACTA CGCCGACAGC CAGCAGGCTA ACATCATCGA TATCTGGCGC AGCACCGTCG ATGCCATGTT CGAGAACTAT CCCTTCCCGC AGAACAACGG CAACCGTCAG CATGTCCGCT GGACGGCACT GACTAACCGC CACGGCAACG GTCTGCTGGT GGTTCCGCAG CGCCCCATTA ACTTCAGCGC CTGGCGCTAT ACCCAGGAAA ACATCCACGC TGCCCAGCAC TGTAACGAGC TTCAGCGCAG TGATGACATC ACTCTGAATC TCGACCACCA GCTGCTTGGC CTCGGCTCCA ACTCCTGGGG CAGCGAGGTG CTGGACTCCT GGCGCGTCTG GTTCCGTGAC TTCAGCTACG GCTTTACGTT GCTGCCGGTT TCTGGCGGAG AAGCTACCGC GCAAAGCCTG GCGTCGTATG AGTTCGGCGC AGGGTTCTTT TCCACGAATT TGCACAGCGA GAATAAGCAA TGA
Protein sequence | MNRWENIQLT HENRLAPRAY FFSYDSVAQA RTFARETSSL FLSLSGQWNF HFFDHPLQVP EAFTSELMAD WGHITVPAMW QMEGHGKLQY TDEGFPFPID VPFVPSDNPT GAYQRIFTLS EGWQGKQTLI KFDGVETYFE VYVNGQYVGF SKGSRLTAEF DISAMVKTGD NLLCVRVMQW ADSTYVEDQD MWWSAGIFRD VYLIGKQLTH INDFTVRTDF DEAYCDATLS CEVVLENLAA SPVVTTLEYT LFDGERLVHS SAIDHLAIEK LTSASFAFTV EQPQQWSAES PYLYHLVMTL KDADGNVLEV VPQRVGFRDI KVRDGLFWIN NRYVMLHGVN RHDNDHRKGR AVGMDRVEKD LQLMKQHNIN SVRTAHYPND PRFYELCDIY GLFVMAETDV ESHGFANVGD ISRITDDPQW ENVYVERIVR HIHAQKNHPS IIIWSLGNES GYGCNIRAMY HAAKALDDTR LVHYEEDRDA EVVDIISTMY TRVPLMNEFG EYPHPKPRII CEYAHAMGNG PGGLTEYQNV FYKHDCIQGH YVWEWCDHGI QAQDDNGNVW YKFGGDYGDY PNNYNFCLDG LIYSDQTPGP GLKEYKQVIA PVKIHALDLT RGELKVENKL WFTTLDDYTL HAEVRAEGET LATQQIKLRD VAPNSEAPLQ ITLPQLDARE AFLNITVTKD SRTRYSEAGH SIATYQFPLK ENTAQPVPFA PNNARPLTLE DDRLSCTVRG YNFAITFSKM SGKPTSWQVN GESLLTREPK INFFKPMIDN HKQEYEGLWQ PNHLQIMQEH LRDFAVEQSD DEVLIISRTV IAPPVFDFGM RCTYIWRIAA DGQVNVALSG ERYGDYPHII PCIGFTMGIN GEYDQVAYYG RGPGENYADS QQANIIDIWR STVDAMFENY PFPQNNGNRQ HVRWTALTNR HGNGLLVVPQ RPINFSAWRY TQENIHAAQH CNELQRSDDI TLNLDHQLLG LGSNSWGSEV LDSWRVWFRD FSYGFTLLPV SGGEATAQSL ASYEFGAGFF STNLHSENKQ
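The relationship between the gene sequence, the GC content field, and the protein sequence can be reproduced with a minimal sketch. It assumes the sequence is pasted as shown in this record, in space-separated blocks of ten bases. The helper names (`clean`, `gc_fraction`, `translate`) are hypothetical. Translation table 11 assigns amino acids to codons exactly as the standard code does and differs only in which codons may initiate translation; since this gene starts with ATG, the standard codon assignments are sufficient here.

```python
from itertools import product

# Standard codon assignments in NCBI order (bases T, C, A, G).
# Table 11 uses the same assignments; only its start codons differ.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def clean(seq: str) -> str:
    """Remove the block spacing and line breaks, and uppercase the bases."""
    return "".join(seq.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq: str) -> str:
    """Translate in frame from the first base, stopping at the first stop codon."""
    seq = clean(seq)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First three blocks of the gene sequence from this record.
print(translate("ATGAATCGCT GGGAAAACAT TCAGCTCACC"))  # MNRWENIQLT
```

Applied to the full gene sequence, `translate` should yield the protein sequence listed above without the block spacing, and `gc_fraction` can be compared against the GC content field.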