Gene Moth_1688 details

Gene Information

Locus tag: Moth_1688
Symbol:
ID: 3833288
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Moorella thermoacetica ATCC 39073
Kingdom: Bacteria
Replicon accession: NC_007644
Strand:
Start bp: 1726028
End bp: 1727158
Gene Length: 1131 bp
Protein Length: 376 aa
Translation table: 11
GC content: 55%
IMG OID: 637829613
Product: glycine betaine/L-proline transport ATP binding subunit
Protein accession: YP_430533
Protein GI: 83590524
COG category: [E] Amino acid transport and metabolism
COG ID: [COG1125] ABC-type proline/glycine betaine transport systems, ATPase components
TIGRFAM ID: [TIGR01186] glycine betaine/L-proline transport ATP binding subunit
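
The start, end, gene length, and protein length fields above are related by simple arithmetic. The following is a minimal sketch of that consistency check, assuming the coordinates are 1-based and inclusive (an assumption that matches the listed length, not something the record states) and that the coding sequence ends in a single stop codon. The function names are illustrative only.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed convention)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Number of amino acids encoded by a CDS of the given length.

    Assumes the CDS is a whole number of codons and that the final
    codon is a stop codon, which is not translated.
    """
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


# Values taken from the record for Moth_1688.
length = gene_length(1726028, 1727158)
print(length, protein_length(length))  # prints: 1131 376
```

Both values agree with the Gene Length and Protein Length fields of the record.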


Plasmid Coverage information

Num covering plasmid clones: 22
Plasmid unclonability p-value: 0.315313
Plasmid hitchhiking: No
Plasmid clonability: normal


Fosmid Coverage information

Num covering fosmid clones: 15
Fosmid unclonability p-value: 0.421544
Fosmid Hitchhiker: No
Fosmid clonability: normal


Sequence

Gene sequence
ATGCTTACCT TCGAACACGT CTCGAAAGTA TATGACGGCA ATCGAATAGC CGTAGCCGAT 
TTCAACCTGG AAGTCGAGGC CGGAGAATTT ATTGTGTTAA TTGGCCCCAG CGGTTGTGGC
AAGACCACCA CTTTAAAAAT GGTTAACCGT CTTATCGAAC CCACCTCCGG GGCCATCTAC
CTTAACGGCA AGGATATCCG GGAACAAAAT CCCGTGGCGT TACGGCGACA CATTGGCTAC
GTTATCCAGC AGATAGCTCT TTTTCCCAAC ATGACTATTG CTCAAAACGT GGATGTGGTA
CCCCGCCTGC TGGGATGGCC GGCAGAACGC CGCCGCCAGC GCGTTTGCGA ATTATTGGAA
CTGGTGGGTA TGGACCCTGA TGACTACGCT GACCGTTACC CTTCAGAGCT AAGCGGGGGG
CAGCAACAGC GTATCGGGGT GTTGCGTGCC CTGGCGGCAG AACCACCGCT TATCCTTATG
GATGAGCCTT TTGGTGCCCT TGACCCAATT ACGCGGGAAA ACCTGCAGGA AGAATTGAAG
GCCTTGCAGG CCAAGCTGCA TAAGACCATT CTCTTTGTTA CCCACGATAT GGACGAGGCA
CTGAAAATTG CTGATCGGAT TGTGGTAATG AAAGACGGCT ACATCGTCCA AGTCGCTGCG
CCTGAAGAAC TGTTGCGGCA CCCCGCCAAC GAGTTCGTGG CCTCGTTCAT CGGCAAAGAA
CGGTTGGCTC CTGGACTGGA ATTGCGCACC GTAGAACAGG TTATGATTGG TGAACCGGTG
ACGGTACGGC CCCATACGGG TGTTGCCGAA GGAGTTGCCA CCATGCGTCG TAAAAAGGTG
GATACGCTGC TGGTTACCGA TGAATCTGGC CGGCTGTTAG GCGCCGTTTC TATCGAGGAA
TTGAATCGCA ACTACCAGCG GGCTCACCAG GTGCAAGATT TGATGGCTCG TGACGTTCCT
GTAGTGTTCG AGGGAACCCC GGCCCGGGAG GCCTTTGACC TGATCACCCG GGAGCGGCTG
GAGTACCTGC CGGTAATCGA TAAGGAGGGC CGCTTGAAGG GACTGGTCAC CAGGACCAGC
ATGGTCAATG CCCTGGCATC CGTGGTGTGG GGAGATGAGG CTAGTGCTTA G
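
The GC content field can be recomputed from a sequence printed in blocks of ten bases, as above. The snippet below is a rough sketch of one way to do that in plain Python. The function name `gc_fraction` is hypothetical, and the database's exact rounding rule for the reported percentage is not known, so no rounded value is asserted here.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence.

    Spaces, line breaks, and any character other than A, C, G, T are
    ignored, so the block-formatted text can be pasted in directly.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("no nucleotide characters found")
    gc_count = sum(1 for b in bases if b in "GC")
    return gc_count / len(bases)


# First line of the gene sequence from the record, used as a short example.
first_line = (
    "ATGCTTACCT TCGAACACGT CTCGAAAGTA TATGACGGCA ATCGAATAGC CGTAGCCGAT"
)
print(f"{gc_fraction(first_line):.1%}")
```

Passing the complete gene sequence instead of the first line gives the value to compare against the GC content field.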
 
Protein sequence
MLTFEHVSKV YDGNRIAVAD FNLEVEAGEF IVLIGPSGCG KTTTLKMVNR LIEPTSGAIY 
LNGKDIREQN PVALRRHIGY VIQQIALFPN MTIAQNVDVV PRLLGWPAER RRQRVCELLE
LVGMDPDDYA DRYPSELSGG QQQRIGVLRA LAAEPPLILM DEPFGALDPI TRENLQEELK
ALQAKLHKTI LFVTHDMDEA LKIADRIVVM KDGYIVQVAA PEELLRHPAN EFVASFIGKE
RLAPGLELRT VEQVMIGEPV TVRPHTGVAE GVATMRRKKV DTLLVTDESG RLLGAVSIEE
LNRNYQRAHQ VQDLMARDVP VVFEGTPARE AFDLITRERL EYLPVIDKEG RLKGLVTRTS
MVNALASVVW GDEASA
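
The protein sequence follows from the gene sequence by codon translation. Below is a minimal sketch in plain Python, not the database's own pipeline. It relies on the fact that NCBI translation table 11 assigns the same amino acid to each codon as the standard code (the tables differ in which start codons are permitted), and it assumes the sequence is given in reading frame on the coding strand. The names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Codon table in the conventional TCAG ordering. The amino-acid
# assignments are shared by the standard code and translation table 11.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate an in-frame DNA sequence, stopping at the first stop codon.

    Whitespace and any non-ACGT characters are dropped first, so the
    block-formatted sequence text can be used as-is.
    """
    dna = "".join(ch for ch in seq.upper() if ch in "ACGT")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon: end of the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


# First ten codons of the gene sequence from the record.
print(translate("ATGCTTACCT TCGAACACGT CTCGAAAGTA"))  # prints: MLTFEHVSKV
```

The output matches the first ten residues of the protein sequence listed above.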