Gene RPB_1954 details


Gene Information

Locus tag: RPB_1954
Symbol:
ID: 3908033
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Rhodopseudomonas palustris HaA2
Kingdom: Bacteria
Replicon accession: NC_007778
Strand:
Start bp: 2220611
End bp: 2222551
Gene Length: 1941 bp
Protein Length: 646 aa
Translation table: 11
GC content: 65%
IMG OID: 637883848
Product: thiamine biosynthesis protein ThiC
Protein accession: YP_485573
Protein GI: 86749077
COG category: [H] Coenzyme transport and metabolism
COG ID: [COG0422] Thiamine biosynthesis protein ThiC
TIGRFAM ID: [TIGR00190] thiamine biosynthesis protein ThiC
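
The start, end, gene length and protein length fields above are related by simple arithmetic. The following is a minimal sketch of that check, assuming the coordinates are 1-based and inclusive and that the coding sequence ends in a single stop codon. The function names `gene_length` and `protein_length` are illustrative and not part of the database.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature given 1-based, inclusive coordinates (assumed)."""
    if end_bp < start_bp:
        raise ValueError("end_bp must not be smaller than start_bp")
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Protein length in aa, assuming an unspliced CDS with one trailing stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # the stop codon does not encode a residue


if __name__ == "__main__":
    length = gene_length(2220611, 2222551)
    print(length)                  # 1941
    print(protein_length(length))  # 646
```

Both values agree with the Gene Length and Protein Length fields of this record.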


Plasmid Coverage information

Num covering plasmid clones: 15
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 19
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAATATCC GCTCCAATCC CGACACCACG CGTCCCGCCG TCACCACCGG CGGCCTGCCC 
TCCTCGCGCA AGATCTATGC TGTGCCCGCC ACCGCGCCGG ATCTGCGCGT GCCGCTGCGC
GAGATCATGC TGAGCGAAGG CGCCGGCGAG CCGAACCTGC CGGTGTACGA CACCTCGGGC
CCCTACACCG ATCCCGGCGT CACCATCGAC GTCAACAAGG GCCTGTCGCG CGCCCGCACG
GAGTGGGTCA AGCAGCGCGG CGGCGTCGAG CAATATGAAG GCCGCGACAT CAAGCCGGAA
GACAACGGCA ATGTCGGCGC AGCCCATGCG GCGAAGTCGT TCACCGCGCA TCACCAGCCG
CTGCGCGGCG TCGGCGACGC GCCGATCACG CAATACGAAT TCGCCCGCAA GGGGATCATC
ACCAAGGAGA TGATCTACGT CGCCGAGCGC GAGAATCTCG GCCGCAAGCA GCAACTGGAG
CGCGCCGAAG CCGCGCTGGC CGACGGTGAA AGCTTCGGCG CCGCGGTGCC GGCGTTCATC
ACCCCGGAAT TCGTCCGCGA CGAGATCGCG CGCGGCCGCG CCATCATCCC GGCCAACATC
AATCACGGCG AACTCGAGCC GATGATCATC GGCCGCAACT TCCTCACCAA GATCAACGCC
AACATCGGCA ACTCGGCGGT GACCTCGTCG GTCGAGGAGG AAGTCGACAA GATGGTGTGG
GCGATCCGCT GGGGCGCCGA CACCGTGATG GACCTCTCCA CCGGCCGCAA CATCCACACC
ACCCGCGAAT GGATTTTGCG CAACTCGCCG GTCCCGATCG GCACCGTACC GATCTATCAG
GCGCTGGAGA AGTGCGACGG CGATCCGGTC AAGCTCACCT GGGAGCTGTA CAAGGACACG
CTGATCGAGC AGGCCGAACA GGGCGTCGAT TACTTCACGA TCCACGCCGG CGTGCGGCTG
CAATACATCC ACCTCACCGC CGATCGCGTC ACGGGCATCG TTTCCCGTGG CGGATCGATC
ATGGCGAAGT GGTGCCTGGC GCATCACCAG GAGAGCTTCC TCTACACGCA TTTCGACGAG
ATCTGCGACC TGATGCGGAA ATACGACGTG TCGTTCTCAT TGGGCGACGG GCTGCGCCCG
GGCTCGATCG CCGACGCCAA CGACCGCGCG CAATTCGCCG AACTGGAGAC GCTCGGCGAA
CTCACCAAGA TCGCCTGGGC CAAGGGCTGC CAGGTGATGA TCGAAGGCCC CGGCCACGTG
CCGCTGCACA AGATCAAGAT CAACATGGAC AAGCAGCTCA AGGAATGCGG CGAGGCGCCG
TTCTATACGC TCGGCCCGCT GACCACCGAC ATCGCGCCGG GCTATGATCA CATCACCTCC
GGCATTGGCG CCGCGATGAT CGGCTGGTTC GGCTGCGCGA TGCTGTGCTA CGTCACGCCG
AAGGAACATC TCGGCCTGCC CGACCGCAAC GACGTCAAGA CCGGCGTGAT CACCTACAAG
ATCGCCGCCC ACGCCGCCGA CCTCGGCAAG GGCCACCCCG CAGCCCAGTT GCGCGACGAC
GCGCTGTCCC GCGCCCGCTT CGACTTCCGC TGGCAGGACC AATTCAACCT CGGCCTCGAC
CCGGATACGG CGAAAGCCTT CCACGACGAA ACCCTGCCCA AGGAAGCCCA CAAGGTCGCG
CATTTCTGCT CGATGTGCGG CCCGAAATTC TGCTCGATGA AGATCACGCA AGATGTGCGC
GACTACGCGG CGGGATTGGG CGACAACGAG AAGGCGGCGC TGAACCTCGC CGGCGCCGGC
TCGTTCGGCA GCGTCGGCAT GACGATGTCC GGCGTCATCG AAGACGGCAT GGCGCAGATG
AGCGAGAAGT TTCGGGATAT GGGGGAGAAG CTGTATCTCG ATGCGGAGAA GGTGAAGGAG
AGCAACAAGG CGCTGTCGTA G
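
The GC content field of this record can be recomputed from the gene sequence above. The following is a minimal sketch, assuming the sequence is pasted as printed here (blocks of ten bases separated by spaces and line breaks). The helper name `gc_content` is illustrative.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence.

    Whitespace (spaces, newlines) is ignored, so the blocked layout
    used on this page can be passed in unchanged.
    """
    bases = "".join(seq.split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc = sum(1 for base in bases if base in "GC")
    return gc / len(bases)


if __name__ == "__main__":
    # First two blocks of the gene sequence only; pass the full
    # sequence to compare against the GC content field of the record.
    print(gc_content("ATGAATATCC GCTCCAATCC"))  # 0.45
```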
 
Protein sequence
MNIRSNPDTT RPAVTTGGLP SSRKIYAVPA TAPDLRVPLR EIMLSEGAGE PNLPVYDTSG 
PYTDPGVTID VNKGLSRART EWVKQRGGVE QYEGRDIKPE DNGNVGAAHA AKSFTAHHQP
LRGVGDAPIT QYEFARKGII TKEMIYVAER ENLGRKQQLE RAEAALADGE SFGAAVPAFI
TPEFVRDEIA RGRAIIPANI NHGELEPMII GRNFLTKINA NIGNSAVTSS VEEEVDKMVW
AIRWGADTVM DLSTGRNIHT TREWILRNSP VPIGTVPIYQ ALEKCDGDPV KLTWELYKDT
LIEQAEQGVD YFTIHAGVRL QYIHLTADRV TGIVSRGGSI MAKWCLAHHQ ESFLYTHFDE
ICDLMRKYDV SFSLGDGLRP GSIADANDRA QFAELETLGE LTKIAWAKGC QVMIEGPGHV
PLHKIKINMD KQLKECGEAP FYTLGPLTTD IAPGYDHITS GIGAAMIGWF GCAMLCYVTP
KEHLGLPDRN DVKTGVITYK IAAHAADLGK GHPAAQLRDD ALSRARFDFR WQDQFNLGLD
PDTAKAFHDE TLPKEAHKVA HFCSMCGPKF CSMKITQDVR DYAAGLGDNE KAALNLAGAG
SFGSVGMTMS GVIEDGMAQM SEKFRDMGEK LYLDAEKVKE SNKALS
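
The protein sequence can be checked against the gene sequence by translating it codon by codon. The sketch below assumes the amino acid assignments of the standard genetic code, which translation table 11 shares (table 11 differs in which codons may act as start codons; this gene starts with ATG, so that does not matter here). The names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Standard codon assignments, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon.

    Whitespace is ignored so the blocked layout used on this page can be
    passed in unchanged. Raises KeyError on non-ACGT characters.
    """
    bases = "".join(seq.split()).upper()
    if len(bases) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(bases), 3):
        aa = CODON_TABLE[bases[i:i + 3]]
        if aa == "*":  # stop codon
            break
        residues.append(aa)
    return "".join(residues)


if __name__ == "__main__":
    # First three blocks of the gene sequence above.
    print(translate("ATGAATATCC GCTCCAATCC CGACACCACG"))  # MNIRSNPDTT
```

The output matches the first ten residues of the protein sequence; the final line of the gene sequence translates to SNKALS followed by the stop codon TAG.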