Gene Pars_1420 details


Gene Information

Locus tag: Pars_1420
Symbol:
ID: 5054871
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Pyrobaculum arsenaticum DSM 13514
Kingdom: Archaea
Replicon accession: NC_009376
Strand:
Start bp: 1281092
End bp: 1282360
Gene Length: 1269 bp
Protein Length: 422 aa
Translation table: 11
GC content: 62%
IMG OID: 640468961
Product: anthranilate synthase component I
Protein accession: YP_001153630
Protein GI: 145591628
COG category: [E] Amino acid transport and metabolism; [H] Coenzyme transport and metabolism
COG ID: [COG0147] Anthranilate/para-aminobenzoate synthases component I
TIGRFAM ID: [TIGR01820] anthranilate synthase component I, archaeal clade
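
The length fields can be derived from the coordinates. The following is a minimal sketch in Python, assuming that Start bp and End bp are 1-based and inclusive (the record does not state the convention) and that Gene Length includes one terminal stop codon. The helper names are hypothetical and not part of any database API.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Feature length for 1-based, inclusive coordinates (assumed convention)."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS whose length includes one stop codon (assumed)."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


# Coordinates copied from the Start bp and End bp fields of this record.
length = gene_length_bp(1281092, 1282360)
print(length, protein_length_aa(length))  # prints: 1269 422
```

Under these assumptions the result agrees with the Gene Length and Protein Length fields listed in the record.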


Plasmid Coverage information

Num covering plasmid clones: 17
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 24
Fosmid unclonability p-value: 0.150343
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAGGAAGA TCCCCCTCTC CAAGCTACCG GAGCCCAGGG CCTTGGCGAA GTCCTTATAT 
GCCCGAGGCG AGGACTTCGT GGCGTTGCTG GAGTCGGGTG TGGGATACGC CGAGCGTAGC
CGCTTCACCC TAGTGGCTTG GGGGGTGGAG GAGGAGTACG TGTCGTGGGG GGCCGACGTC
TACCAAATTG TGGACAGCGC CTACAAGAGA TTGAGGAGGG GCCCCACCCC ATTCGGCGGC
GAGGTGGCCA TAGGCGCTGT TGCCTACGAC GCGGTTGCCT ACCTTGAGCC TGTGTTGCTT
AGGTATGGCA AGTTGGATAA GTCGTCTCCT GTCGCCTTCT TCGTAAAGCC CAGGGGCTGG
GCGCTGTACG ACAAGTTGCT GGGCCGGGCT TACGTCTACG GGGAGTTGCC AAACGGCGGG
GCCGCGTCGC TGGAGTCGCC GATGGTGAGG GGGCCAATCG CCGAGACCGA CGCCTCTTCG
TTTAAGAGGT GGGTGGCTGA GGCGAAGAGG AGGATAGAGG AGGGGGAGAT CTTCCAGGTG
GTGCTGTCGC GCCACGTGGA CTTCGCCGTG TCTGGAGACG TGTTTGCCCT ATACGCCTCG
CTGGCGTCTG TCAACCCGTC GCCGTATATG TTCTTCGTCA AGTGGAGGGA CTTCCAACTG
CTGGGAACCT CGCCTGAGTT GCTGGTAAAG ATCCAGGGGG ACAGGGCGGA GACGCACCCA
ATTGCCGGAA CTAGGCCGAG GGGGGCCGCC GAGGATGAGG ACTTGGCGCT GGAGGAGGAG
ATGCTCGCAG ACGAGAAGGA GCGGGCGGAG CACTACATGC TTGTTGACCT GGCTCGCAAC
GACTTGGGCA GAGTCTGCCG GCCGGGGACT GTGAAGGTGG ATGAGCTGTT CGCTGTGGAG
AAGTACAGCA GGGTGCAACA CATCGTGTCG AGGGTCTCGT GCGTCTTGGA GAAGAAGTAC
ACGCCAGCAG ACGCCCTCTT CGCCACGCAC CCCGCCGGCA CTGTGTCGGG GGCGCCGAAG
GTGAGGGCCA TGGAGATAAT CGCCGAGCTG GAGGACGAGC CGAGGGGTTA CTACGCCGGC
TCGCTGGGGT TCCTCTCCCC GGCACTGTCC GAGTTCGCCA TAGTCATTAG GACAGCCATC
GTGAAGGGGG GAGTGCTGAG GATACAGGCG GGGGCGGGGG TGGTATATGA CTCCACGCCG
GAGAGGGAGT TTAGGGAAAC CGAGGCTAAG CTTAAAGCCC TTAGAGAGGC GCTGGGGCTA
TGGACCTGA
 
Protein sequence
MRKIPLSKLP EPRALAKSLY ARGEDFVALL ESGVGYAERS RFTLVAWGVE EEYVSWGADV 
YQIVDSAYKR LRRGPTPFGG EVAIGAVAYD AVAYLEPVLL RYGKLDKSSP VAFFVKPRGW
ALYDKLLGRA YVYGELPNGG AASLESPMVR GPIAETDASS FKRWVAEAKR RIEEGEIFQV
VLSRHVDFAV SGDVFALYAS LASVNPSPYM FFVKWRDFQL LGTSPELLVK IQGDRAETHP
IAGTRPRGAA EDEDLALEEE MLADEKERAE HYMLVDLARN DLGRVCRPGT VKVDELFAVE
KYSRVQHIVS RVSCVLEKKY TPADALFATH PAGTVSGAPK VRAMEIIAEL EDEPRGYYAG
SLGFLSPALS EFAIVIRTAI VKGGVLRIQA GAGVVYDSTP EREFRETEAK LKALREALGL
WT
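
As a rough cross-check between the gene and protein sequences, the gene can be translated and its GC fraction computed. The sketch below assumes the spaces in the listing are display grouping only. It also relies on translation table 11 using the same codon-to-amino-acid assignments as the standard code, differing only in the permitted start codons. The function names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for all 64 codons in NCBI order (TCAG, third base varying fastest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def clean(seq: str) -> str:
    """Remove display whitespace and upper-case the sequence."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def gc_fraction(dna: str) -> float:
    """Fraction of G and C bases in the cleaned sequence."""
    dna = clean(dna)
    return (dna.count("G") + dna.count("C")) / len(dna)


# Start of the first row and the whole last row of the gene sequence above.
print(translate("ATGAGGAAGA TCCCCCTCTC CAAGCTACCG"))  # MRKIPLSKLP
print(translate("TGGACCTGA"))  # WT
```

Passing the full gene sequence to `translate` and `gc_fraction` allows comparison with the protein sequence and the GC content field reported for this gene.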