Gene EcHS_A0371 details


Gene Information

Locus tag: EcHS_A0371
Symbol: betB
ID: 5592816
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli HS
Kingdom: Bacteria
Replicon accession: NC_009800
Strand:
Start bp: 384601
End bp: 386073
Gene Length: 1473 bp
Protein Length: 490 aa
Translation table: 11
GC content: 58%
IMG OID: 640919556
Product: betaine aldehyde dehydrogenase
Protein accession: YP_001457142
Protein GI: 157159824
COG category: [C] Energy production and conversion
COG ID: [COG1012] NAD-dependent aldehyde dehydrogenases
TIGRFAM ID: [TIGR01804] glycine betaine aldehyde dehydrogenase
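
The listed gene and protein lengths follow from the start and end coordinates. A minimal sketch of that arithmetic, assuming the coordinates are 1-based and inclusive (an assumption, though it is consistent with the lengths listed in this record) and that the CDS ends in a single stop codon; the function name `feature_length` is hypothetical and used only for illustration:

```python
def feature_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


# Coordinates taken from the record above.
gene_len = feature_length(384601, 386073)

# A complete CDS is a whole number of codons; the final codon is the stop
# codon, which does not encode an amino acid.
assert gene_len % 3 == 0
protein_len = gene_len // 3 - 1

print(gene_len, protein_len)
```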


Plasmid Coverage information

Num covering plasmid clones: 48
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGTCCCGAA TGGCAGAACA GCAGCTTTAT ATACATGGTG GTTATACCTC CGCCACCAGC 
GGTCGCACCT TCGAGACCAT TAACCCGGCC AACGGTAACG TGCTGGCGAC CGTGCAGGCC
GCCGGGCGTG AGGATGTCGA TCGCGCCGTG AAGAGTGCCC AACAGGGGCA AAAAATCTGG
GCGGCGATGA CCGCTATGGA GCGCTCGCGT ATTCTGCGTC GGGCCGTTGA TATTCTGCGT
GAACGCAATG ACGAACTCGC AAAACTGGAA ACCCTCGATA CCGGAAAAGC ATATTCGGAA
ACCTCAACCG TCGATATCGT TACCGGTGCG GACGTGCTGG AGTACTACGC CGGGCTGATC
CCGGCGCTGG AAGGCAGCCA GATCCCGTTG CGTGAGACGT CATTTGTTTA TACCCGCCGC
GAACCGCTGG GTGTGGTGGC GGGGATTGGC GCATGGAACT ACCCGATTCA GATTGCCCTG
TGGAAATCCG CCCCGGCGCT GGCGGCAGGC AACGCAATGA TTTTCAAACC GAGCGAAGTC
ACCCCGCTTA CCGCGTTAAA GCTGGCTGAA ATTTACAGCG AAGCGGGCCT GCCGGACGGC
GTATTTAACG TGTTGCCGGG GGTGGGCGCG GAGACCGGGC AATATCTGAC CGAGCATCCG
GGCATTGCCA AAGTGTCATT TACCGGCGGT GTCGCCAGCG GCAAAAAAGT GATGGCTAAC
TCGGCGGCCT CTTCCCTGAA AGAAGTGACC ATGGAACTGG GCGGTAAATC ACCGCTGATC
GTTTTCGATG ATGCGGATCT CGATCTCGCC GCCGATATCG CCATGATGGC GAACTTCTTC
AGCTCCGGTC AGGTGTGTAC CAATGGCACC CGCGTCTTCG TTCCGGCGAA ATGCAAAGCC
GCATTTGAAC AAAAGATTCT GGCGCGCGTT GAGCGCATTC GCGCGGGCGA CGTTTTCGAT
CCGCAAACCA ATTTCGGCCC GCTGGTCAGC TTCCCGCATC GCGATAACGT GCTGCGCTAT
ATCGCCAAAG GCCAAGAGGA AGGCGCGCGC GTACTGTGCG GCGGCGATGT ACTGAAAGGC
GATGGCTTCG ATAACGGCGC ATGGGTGGCA CCGACCGTGT TCACCGATTG CCGCGACGAT
ATGACCATTG TGCGTGAAGA GATCTTCGGG CCGGTGATGT CCATTCTGAC CTACGAGACG
GAAGACGAAG TCATTCGCCG CGCCAATGAT ACCGACTATG GTCTGGCGGC GGGTATCGTG
ACGGCGGACC TGAACCGCGC GCATCGCGTC ATTCATCAGC TGGAAGCGGG TATTTGCTGG
ATCAATACCT GGGGTGAATC CCCGGCAGAG ATGCCCGTTG GCGGCTACAA ACACTCCGGC
ATTGGTCGCG AGAACGGCGT GATGACGCTC CAGAGTTACA CCCAGGTGAA GTCCATCCAG
GTTGAGATGG CTAAATTCCA GTCCATATTC TAA
 
Protein sequence
MSRMAEQQLY IHGGYTSATS GRTFETINPA NGNVLATVQA AGREDVDRAV KSAQQGQKIW 
AAMTAMERSR ILRRAVDILR ERNDELAKLE TLDTGKAYSE TSTVDIVTGA DVLEYYAGLI
PALEGSQIPL RETSFVYTRR EPLGVVAGIG AWNYPIQIAL WKSAPALAAG NAMIFKPSEV
TPLTALKLAE IYSEAGLPDG VFNVLPGVGA ETGQYLTEHP GIAKVSFTGG VASGKKVMAN
SAASSLKEVT MELGGKSPLI VFDDADLDLA ADIAMMANFF SSGQVCTNGT RVFVPAKCKA
AFEQKILARV ERIRAGDVFD PQTNFGPLVS FPHRDNVLRY IAKGQEEGAR VLCGGDVLKG
DGFDNGAWVA PTVFTDCRDD MTIVREEIFG PVMSILTYET EDEVIRRAND TDYGLAAGIV
TADLNRAHRV IHQLEAGICW INTWGESPAE MPVGGYKHSG IGRENGVMTL QSYTQVKSIQ
VEMAKFQSIF
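
The protein sequence is the translation of the gene sequence under translation table 11. A minimal sketch of that translation, assuming the standard codon assignments (table 11 uses the same amino acid assignments as the standard code and differs only in its permitted start codons, which does not matter here because the gene starts with ATG). The names `CODON_TABLE` and `translate` are hypothetical helpers, not part of any library:

```python
from itertools import product

# Codon assignments in NCBI order: first base varies slowest, third base
# fastest, each cycling through T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA string, ignoring whitespace, up to the first stop codon."""
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon: end of the protein
            break
        protein.append(aa)
    return "".join(protein)


# First and last blocks of the gene sequence, copied from the record above.
print(translate("ATGTCCCGAA TGGCAGAACA GCAGCTTTAT"))
print(translate("GTTGAGATGG CTAAATTCCA GTCCATATTC TAA"))
```

Passing the full gene sequence to `translate` in the same way allows the result to be compared against the listed protein sequence once its spaces and line breaks are removed.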