Gene Msed_1046 details


Gene Information

Locus tag: Msed_1046
Symbol:
ID: 5104428
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Metallosphaera sedula DSM 5348
Kingdom: Archaea
Replicon accession: NC_009440
Strand:
Start bp: 973665
End bp: 975692
Gene Length: 2028 bp
Protein Length: 675 aa
Translation table: 11
GC content: 47%
IMG OID: 640506942
Product: extracellular solute-binding protein
Protein accession: YP_001191135
Protein GI: 146303819
COG category: [E] Amino acid transport and metabolism
COG ID: [COG0747] ABC-type dipeptide transport system, periplasmic component
TIGRFAM ID:
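
The start and end coordinates, gene length, and protein length listed above are mutually consistent. A minimal Python sketch of that arithmetic follows. It assumes the coordinates are 1-based and inclusive and that the last codon of the CDS is a stop codon. The function names are illustrative and not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Feature length in bp, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, excluding the stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length_bp = gene_length(973665, 975692)
print(length_bp, protein_length(length_bp))  # prints: 2028 675
```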


Plasmid Coverage information

Num covering plasmid clones: 21
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 27
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAAACTTC GAAAAGGTAT AAGTCGCACG TTGGTAGCAG GTATAATTGT GGTAATTGTT 
ATAATAGCTG CAGTGGCGTT TATTTCCCTA TCAAGACATC CCACAACAAC TACCCCAGTC
AACGTGACCC ATACCACTAA CACGACCACC ACTAAGACTA ACGTGACTGT GCCCACAACG
AATGTCACAT CGATTCCCTC ATCAATTACA GTAGACGAGG CAACTCCGCC AGTTAGCGTG
GATCCGGCGT CCAGTTTTGA CGTTGCTGGG GGAGAATTGC TACAGAACGT GTATCAAACC
CTCGTGTTCT ACAACGGAAC TAATACGTCC AGTTTTGTGG GAGTCTTGGC AGAAAATTAC
ACAGTATTGA ACAACGGAAC CACTTACGTG TTCCACCTCT GGCCATTTAT AACCTTCAGC
AATGGTGATC CATTGAACGC AACCGACGTC TGGTTCTCGG TGTACAGAAC AATGTTAATG
AATCTTGGAA TCTCGGTCTA CGTTAGTCAA GGACTATCAG TCAATAACGG ACTTGGATTT
GTAGGGAAGC TACCCAGTGG CGCTACCGGG ACTATCGAAT TGCCCAACGG TACTCTTCAG
GCCCTCGAAT ATGCTGGTTA CACTTTCCCC TCAAACAAAA CTCAGGCATA CGAACAGGCA
GCCTATGATT TGGCCTATAT TCTTTCCCAC TTCAATGTAA GCAATCAGAC CATTCAAAAG
GTCATGAGTT ATCCCTATCA GGCAGTGGTC GTGGTGAATC CCTATACTGT TCAATTCAAC
CTTCAGTATC CATACTCTGC ATTCCTAGCT GCAATTTCCA CATCTGCTGG AGCCGTAGTT
GATCCAGTCT TCGTAGACGA ACATGGAGGA GTCCAGATAG ATACTGCCAA CACTTACCTT
TCAACTCACG CCCTAGGCTC AGGTCCCTAC ATGTTGGAAA CTCCACTGGG TCAGTCCTAC
GTAATCCTGC AGGCAAATCC CAATTACTGG GCGTCCAAGG TTCCTCAGTC TGAGAGAAAC
TTCATGTTAG CTATTCCCAA GATCGAGACC ATCGTAATAG ATTATCAGAC CAACGAGGCA
CTTAGGATAA GCGACCTTCA ATCCGGGAAG GCCCAAATAG CCCAGATAGA TATTATAGAC
TTACCCCAGA TAATTGGCTC GCAGGGGATT TCCTACATAA GGACCCATAC ACATTATCCT
ATCATGTATA ATGGAACCTA TGGGACAGTC TATGTTTGGG GACCATCCCC GCAGATTGAC
TTCTTGGCAA TAGATGCTTA CCAGTATCCG TTCAACATCA CCAATGTGAG GCTTGCCATC
GCGCATGCCA TAAATGCTAC GCAGATTCAA CAACAGGTCT ACGACGGATT AGCCCTAAGC
TATGTTGGTC CCAATGATCC GTCCCTTCCC TTCTACAACT CCTCGATTCA AGGCTATACC
TATGACCCCG CCCTTTCCAT AAATCTGCTG ACCCAAGCAG GATTTAGCTT AACCCTTCCC
AACGGCACAA CGGTTAACCC AGGTGGAACG CCCTTCCCAA CCATAGTTTT GACCTACCAG
ACAGGTAGCA CGGCACTTCA AGATGAGGCC CTCATTATAC AGCAGCAGTT AGCTCAGATA
GGAATAAAGG TCCAGTTAAA CCCCGAGTCC ACAGTTACCA TTGTAGAGTC CTACCTCAAC
CCACCCAATT CCAGCAGCTA TCCAGCCTTC CAGCTTGCAG CTAACTTCCC ACCTGTCCTC
AGCCCCATTG ATCCTGCCAT CTACCTAATG TCTCAGGCGA GGCTTCACCA CGGCAATCCT
GCTTTCGTCG ATAATCCAGA GATAAACTCT CTCATAATTC AAGCAGTGAG GACTGATAAT
CCAGTTCAGC TCCAGAAAAT CTTCAATGAG ATAACAGAAC TCTCGCTACA GCAGGCCCAA
TACGTATGGC TCGATGACTT CATCGCATAT ACAGTGACTG TCCATGGAAT TCAAGGAATA
TATTACAGTC CAGGGTTTGA CGGATTGTTC TACGCAACAA TATACTAG
 
Protein sequence
MKLRKGISRT LVAGIIVVIV IIAAVAFISL SRHPTTTTPV NVTHTTNTTT TKTNVTVPTT 
NVTSIPSSIT VDEATPPVSV DPASSFDVAG GELLQNVYQT LVFYNGTNTS SFVGVLAENY
TVLNNGTTYV FHLWPFITFS NGDPLNATDV WFSVYRTMLM NLGISVYVSQ GLSVNNGLGF
VGKLPSGATG TIELPNGTLQ ALEYAGYTFP SNKTQAYEQA AYDLAYILSH FNVSNQTIQK
VMSYPYQAVV VVNPYTVQFN LQYPYSAFLA AISTSAGAVV DPVFVDEHGG VQIDTANTYL
STHALGSGPY MLETPLGQSY VILQANPNYW ASKVPQSERN FMLAIPKIET IVIDYQTNEA
LRISDLQSGK AQIAQIDIID LPQIIGSQGI SYIRTHTHYP IMYNGTYGTV YVWGPSPQID
FLAIDAYQYP FNITNVRLAI AHAINATQIQ QQVYDGLALS YVGPNDPSLP FYNSSIQGYT
YDPALSINLL TQAGFSLTLP NGTTVNPGGT PFPTIVLTYQ TGSTALQDEA LIIQQQLAQI
GIKVQLNPES TVTIVESYLN PPNSSSYPAF QLAANFPPVL SPIDPAIYLM SQARLHHGNP
AFVDNPEINS LIIQAVRTDN PVQLQKIFNE ITELSLQQAQ YVWLDDFIAY TVTVHGIQGI
YYSPGFDGLF YATIY
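
The gene and protein sequences above are printed in blocks of 10 characters, so they need whitespace removed before use. The sketch below, in Python, shows one way to clean such a block, translate it, and compute GC content. It is an illustration under stated assumptions, not the method used by the source database: the helper names are hypothetical, the codon-to-residue mapping is the one shared by NCBI translation tables 1 and 11, and the alternative start codons that table 11 allows are not handled (this gene starts with ATG, so that does not matter here).

```python
from itertools import product

# Codon table in NCBI order (bases T, C, A, G); the residue assignments
# are the same for translation tables 1 and 11.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove spaces and line breaks from a blocked sequence."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate a CDS codon by codon, stopping at the first stop codon."""
    dna = clean(dna)
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


def gc_percent(dna: str) -> float:
    """Percentage of G and C bases in the sequence."""
    dna = clean(dna)
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


# First three blocks of the gene sequence listed above.
print(translate("ATGAAACTTC GAAAAGGTAT AAGTCGCACG"))  # prints: MKLRKGISRT
```

Passing the full gene sequence to `translate` and `gc_percent` allows a comparison against the protein sequence and the GC content reported in the record.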