Gene RPB_4049 details


Gene Information

Locus tag: RPB_4049
Symbol:
ID: 3911856
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Rhodopseudomonas palustris HaA2
Kingdom: Bacteria
Replicon accession: NC_007778
Strand:
Start bp: 4618620
End bp: 4620239
Gene Length: 1620 bp
Protein Length: 539 aa
Translation table: 11
GC content: 63%
IMG OID: 637885953
Product: twin-arginine translocation pathway signal
Protein accession: YP_487653
Protein GI: 86751157
COG category: [E] Amino acid transport and metabolism
COG ID: [COG0747] ABC-type dipeptide transport system, periplasmic component
TIGRFAM ID: [TIGR01409] Tat (twin-arginine translocation) pathway signal sequence


Plasmid Coverage information

Num covering plasmid clones: 13
Plasmid unclonability p-value: 0.684503
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 20
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAAACGTC GCGACTTCCT GAAATCTGCC ACCGCCCTCG CTGCCGGAGC GATGGTGCCG 
GCGCCCGCGA TCTGGTCCGC TGCGAAGGCG GACGCACGGT CGGAATCCCT GCTGGTGGTG
TCCGAAAGCG GCCCCAACAA CCTCGACATC CACGGCATCG GCACCAATGT GCCGGGCTAC
GAGGTGTCGT GGAATTGCTA CGACCGGCTG ATCACCCACG AGATGAAGGA AGGCCCCGGC
GGGGTTCCCT ATTACGACAA GGACAAGTTC AGGGGCGAAC TCGCCGAGGA CATGGTGGTC
GGCGACATGT CGGCGACCTT CAAGCTGAAG AAGAACGCCA CCTTCCAGGA CGGCACCCCG
GTCACCGCCA AGGACGTCAA ATGGTCGCTC GATCGCTCGG TCAGCGTCGG CGGCTTCCCG
ACCTTCCAGA TGAGCGCCGG CTCGCTGACC AAGCCCGAGC AGTTCGTCGT CGTCGACGAC
CACACCGTGC GGGTCGACTT TCTCCGCAAG GACAAGCTGA CGATCCCGGA TCTCGCGGTG
ATCGTGCCCT GCGTCGTCAA TTCCGAACTG GTGAAGAAGA ACGCCACCGA GAAGGATCCG
TGGGGCCTCG AATACACCAA GCAGAACACC GCCGGCTCCG GCGCCTACCG CGTGGTCAAG
TGGACCGCCG GCACCGAAGT GATCATGGAG CGCAACGACA AATGGGTCGG CGGCCCGCTG
CCGAAGATCA AGCGCGTGAT CTGGCGCATG GTGCCGCAGG CCGGCAACCG CCGCGCCCTG
CTGGAGCGCG GCGACGCCGA CATCTCCTAT GAGTTGCCGA ACCAGGACTT CGCCGAGATG
AAGCGCGACG GCAAGATCAA CGTGGTGTCG CTGCCGATCT CGAACGGCAT CCAGTATCTC
GGCATGAACG TCACCCAGCC GCCGTTCAAC AATCCCAAGG TGCGCGAGGC GGTGGCCTAT
GCGGTGCCGT ATCAGAAGAT CATCGACGCG GTGATGTTCG GCCTCGCCAA TCCGATGTTC
GGGGCGGCCG CCGACAAGGC GACCGAAGTG AAGTGGCCGC AGCCGACCAA GTACAATACC
GACATCGCCA AGGCCAAGGC GCTGATGGCG GAAGCCGGCT ACGCCGGCGG CTTCGACACC
ACGCTGTCGT TCGACCTCGG CTTCGCCGGC GTCAACGAGC CGATGTGCAT CCTGATCCAG
GAGAGTCTCG CGCAGATCGG CATCCGCTGC ACGATCAACA AGATCCCCGG CGCCAACTGG
CGCACCGAGT TGAACAAGAA GGTGCTGCCG CTCTACGTCA ACATCTTTTC GGGATGGCTC
GATTACCCGG AATACTTCTT CTACTGGTGC TACCGCTCCG GCAATTCGAT CTTCAACACC
ATGAACTACA ACTCGCCCGA TATGGACAAG CTGATCGAAG GCGCCCGCGT CGCCGCGGCG
GCCGGCGACA TGCCGACCTA CGACACCGAC GTCAAGGGCT TCGTCGATCT GGCCTTCAAG
GACATCCCGC GCGTGCCTTT GTATCAGCCC TATCTCAACG TCGCGATGCA GAAGAACATC
TCCGGCTTCG CCTACTGGTT CCACCGCCGG CTGGACTACC GGACGATGGT GAAGGGCTGA
 
Protein sequence
MKRRDFLKSA TALAAGAMVP APAIWSAAKA DARSESLLVV SESGPNNLDI HGIGTNVPGY 
EVSWNCYDRL ITHEMKEGPG GVPYYDKDKF RGELAEDMVV GDMSATFKLK KNATFQDGTP
VTAKDVKWSL DRSVSVGGFP TFQMSAGSLT KPEQFVVVDD HTVRVDFLRK DKLTIPDLAV
IVPCVVNSEL VKKNATEKDP WGLEYTKQNT AGSGAYRVVK WTAGTEVIME RNDKWVGGPL
PKIKRVIWRM VPQAGNRRAL LERGDADISY ELPNQDFAEM KRDGKINVVS LPISNGIQYL
GMNVTQPPFN NPKVREAVAY AVPYQKIIDA VMFGLANPMF GAAADKATEV KWPQPTKYNT
DIAKAKALMA EAGYAGGFDT TLSFDLGFAG VNEPMCILIQ ESLAQIGIRC TINKIPGANW
RTELNKKVLP LYVNIFSGWL DYPEYFFYWC YRSGNSIFNT MNYNSPDMDK LIEGARVAAA
AGDMPTYDTD VKGFVDLAFK DIPRVPLYQP YLNVAMQKNI SGFAYWFHRR LDYRTMVKG