Gene EcSMS35_4585 details

Gene Information

Locus tag: EcSMS35_4585
Symbol: melA
ID: 6145633
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4686874
End bp: 4688229
Gene Length: 1356 bp
Protein Length: 451 aa
Translation table: 11
GC content: 52%
IMG OID: 641619401
Product: alpha-galactosidase
Protein accession: YP_001746513
Protein GI: 170684090
COG category: [G] Carbohydrate transport and metabolism
COG ID: [COG1486] Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases
TIGRFAM ID:
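
The coordinate and length fields in this record can be cross-checked with simple arithmetic. The sketch below is only an illustration: it assumes the start and end coordinates are 1-based and inclusive, and that the protein length excludes the terminal stop codon. The variable names are made up for this example and are not part of the record.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 4686874
end_bp = 4688229
gene_length_bp = 1356
protein_length_aa = 451

# Assumption: coordinates are 1-based and inclusive, so both ends count.
span_bp = end_bp - start_bp + 1

# Three bases per codon; the last codon is assumed to be the stop codon,
# which does not contribute an amino acid.
codon_count = span_bp // 3
expected_protein_aa = codon_count - 1

print("span (bp):", span_bp, "matches record:", span_bp == gene_length_bp)
print("protein (aa):", expected_protein_aa,
      "matches record:", expected_protein_aa == protein_length_aa)
```

Under these assumptions, the computed span and protein length agree with the Gene Length and Protein Length fields.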


Plasmid Coverage information

Num covering plasmid clones: 23
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 56
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGATGTCTG CACCTAAAAT TACATTTATC GGCGCTGGTT CGACGATTTT CGTTAAAAAT 
ATTCTTGGTG ATGTGTTCCA TCGTGAGGCG CTGAAAACGG CGCATATTGC CCTGATGGAT
ATCGATCCCA CCCGCCTGGA AGAGTCGCAC ATTGTGGTGC GTAAGCTGAT GGATTCAGCA
GGGGCCAGCG GCAAAATCAC CTGCCACACC CAACAGAAAG AAGCCTTACA GGATGCCGAT
TTTGTGGTGG TGGCATTTCA GATTGGCGGT TATGAACCTT GCACGGTGAC TGATTTCGAG
GTCTGTAAGC GGCATGGTCT GGAACAAACC ATTGCCGATA CGTTGGGGCC GGGCGGTATT
ATGCGCGCGC TACGTACCAT TCCGCATCTG TGGCAAATTT GCGAGGACAT GACGGAAGTC
TGCCCCGATG CCACCATGCT CAACTACGTT AACCCAATGG CGATGAATAC CTGGGCGATG
TATGCCCGCT ATCCGCATAT CAAACAGGTA GGGCTGTGCC ATTCGGTGCA GGGAACGGCG
GAAGAGCTGG CGCGTGACCT CAATATCGAC CCGGCTACGC TGCGTTACCG TTGCGCAGGT
ATCAACCATA TGGCGTTTTA CCTGGAACTG GAGCGCAAAA CCGCCGACGG CAGTTACGTG
AATCTCTACC CGGAACTGCT GGCAGCGTAT GACGCAGGGC AGGCACCGAA GCCGAATATT
CATGGCAATA CTCGCTGCCA GAATATTGTG CGTTATGAAA TGTTCAAAAA GCTGGGCTAC
TTCGTCACGG AATCGTCAGA ACATTTTGCT GAATACACAC CGTGGTTTAT TAAGCCTGGT
CGTGAGGATT TGATTGAGCG TTATAAAGTA CCGCTGGATG AGTACCCGAA ACGCTGCGTC
GAACAATTGG CAAACTGGCA TAAAGAGCTG GAGGAGTATA AGAACGCCTC CCGGATTGAT
ATTAAACCGT CACGGGAATA TGCCAGCACA ATCATGAACG CTATCTGGAC TGGCGAGCCG
AGTGTGATTT ACGGCAACGT CCGTAACGAT GGATTGATTG ATAACCTGCC ACAAGGATGT
TGCGTGGAAG TAGCCTGTCT GGTTGATGCT AATGGCATTC AGCCGACCAA AGTCGGTACG
CTACCTTCGC ATCTGGCCGC CCTGATGCAA ACCAACATCA ACGTACAGAC GCTGCTGACC
GAAGCTATTC TTACGGAAAA TCGCGACCGT GTTTACCACG CCGCGATGAT GGACCCGCAT
ACTGCCGCCG TGCTGGGCAT TGACGAAATA TATGCTCTTG TTGACGACCT GATTGCCGCC
CACGGCGACT GGCTGCCAGG CTGGTTGCAC CGTTAA
 
Protein sequence
MMSAPKITFI GAGSTIFVKN ILGDVFHREA LKTAHIALMD IDPTRLEESH IVVRKLMDSA 
GASGKITCHT QQKEALQDAD FVVVAFQIGG YEPCTVTDFE VCKRHGLEQT IADTLGPGGI
MRALRTIPHL WQICEDMTEV CPDATMLNYV NPMAMNTWAM YARYPHIKQV GLCHSVQGTA
EELARDLNID PATLRYRCAG INHMAFYLEL ERKTADGSYV NLYPELLAAY DAGQAPKPNI
HGNTRCQNIV RYEMFKKLGY FVTESSEHFA EYTPWFIKPG REDLIERYKV PLDEYPKRCV
EQLANWHKEL EEYKNASRID IKPSREYAST IMNAIWTGEP SVIYGNVRND GLIDNLPQGC
CVEVACLVDA NGIQPTKVGT LPSHLAALMQ TNINVQTLLT EAILTENRDR VYHAAMMDPH
TAAVLGIDEI YALVDDLIAA HGDWLPGWLH R
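
The gene and protein sequences above are related through translation table 11, which assigns the same amino acids to codons as the standard genetic code and differs from it only in the permitted start codons. The following is a minimal sketch of that relationship, not the database's own tooling: the helper names `translate` and `gc_fraction` are hypothetical, and the example translates only the first 60 bases of the gene sequence.

```python
# Codon table in NCBI order (bases T, C, A, G). The amino acid assignments
# are the same for translation table 11 and the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(sequence: str) -> str:
    """Remove the spaces and line breaks used for display and uppercase."""
    return "".join(sequence.split()).upper()


def translate(dna: str) -> str:
    """Translate DNA in frame from its first base, stopping at a stop codon."""
    dna = clean(dna)
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


def gc_fraction(dna: str) -> float:
    """Return the fraction of bases that are G or C."""
    dna = clean(dna)
    return (dna.count("G") + dna.count("C")) / len(dna)


# First line of the gene sequence in this record (60 bases).
first_line = "ATGATGTCTG CACCTAAAAT TACATTTATC GGCGCTGGTT CGACGATTTT CGTTAAAAAT"
print(translate(first_line))
print(round(gc_fraction(first_line), 2))
```

The translated fragment can be compared with the first 20 residues of the protein sequence (MMSAPKITFI GAGSTIFVKN). Passing the full gene sequence to `gc_fraction` gives a value to compare against the GC content field.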