Gene Caul_4201 details

Gene Information

Locus tag: Caul_4201
Symbol:
ID: 5901663
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Caulobacter sp. K31
Kingdom: Bacteria
Replicon accession: NC_010338
Strand:
Start bp: 4564313
End bp: 4565701
Gene length: 1389 bp
Protein length: 462 aa
Translation table: 11
GC content: 73%
IMG OID: 641564723
Product: para-aminobenzoate synthase component I
Protein accession: YP_001685823
Protein GI: 167648160
COG category: [E] Amino acid transport and metabolism; [H] Coenzyme transport and metabolism
COG ID: [COG0147] Anthranilate/para-aminobenzoate synthases component I
TIGRFAM ID: [TIGR01824] aminodeoxychorismate synthase, component I, clade 2
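
The coordinate and length fields in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the start and end positions are 1-based and inclusive, and that the unspliced CDS ends with a single stop codon that is not counted in the protein length. The function names are hypothetical and not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residues encoded by an unspliced CDS, excluding the stop codon (assumed present)."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length(4564313, 4565701)
print(length)                  # 1389, matching the gene length field
print(protein_length(length))  # 462, matching the protein length field
```

Under these assumptions the reported gene length (1389 bp) and protein length (462 aa) agree with the start and end coordinates.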


Plasmid Coverage information

Num covering plasmid clones: 19
Plasmid unclonability p-value: 0.0880526
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 26
Fosmid unclonability p-value:
Fosmid hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGCGTCGA GCCCCGTGGG GCGGTATCAC CCGCCGATGC GCCACGTCGT CCTCATCAGC 
GCCCCCTGGC GCGACCCGGT CTGGGCCGCG GCCCCGTGGC GGGACGAGCC GTTCGCCTGC
GCCCTGATCT CCGGCGGCCA GGCGGGGCGG TGGTCTTACC TGCTGCGCGA TCCGGACGCC
GCGCTGGTCC TGGCCGCCGA CGACCCGCGC GATCCGTTCG CGGCCCTGGC CGAACTGATC
GGTCCGCGCC GGGGAATCGA GCCGGACGGA CCGCCGTTCC AGGGCGGGGT GGTCGGTCTG
GCCGCCTACG AGCTGGGCGA CCGGGTCGAG CCCCTGGGTC TGGCCCGCAC CGGCTGGCCC
GACCTGGCCT GCGCCCGCTA TCCCGCCCTG CTGGCCTTCG ACCACCTGCA GCGCCGGGTG
CTGGCGATCG GCCGGGGCGG GTCGAAAGGC TTCGCCCAGG CCCGGGCCGA GGCGGCCCTG
GCCTGGCTGG ACGCCCCGTC GCCCCCGATC AATGACGGCC CGCTCTGCGA GGCCTTGAGC
GTCAGCGACG GCGAGGCCTA CGAGGCGGCG GTGGCCCAGG TGGTCGAACG GATCGTCGAT
GGCGAGATCT TCCAGGCCAA CATCGCCCGG GCCTGGACCG GCCGGCTGAA CGACGGCGCC
CACCCGTTCG ACCTGTTCGC CCGCCTGCGG GCCGAGAGCC CCGCCCCGTT TTCGGCCTAT
CTGCGCCTGC CGGGCCGAGC CCTGGTCTCC AACTCGCCCG AACGGTTCCT CAAGGTCGAC
GCCAGGGAGG GCGGCGGCGA TCTGGCCATC GAGACCCGGC CGATCAAGGG CACCCGCCCG
CGCGGCGCCG ACCAGGCCGA GGACGCCCGG CTGATCGCCG AGCTGTCGGC CAGCGCCAAG
GACCGGGCCG AGAACCTGAT GATCGTCGAT CTGATGCGCA ACGACCTGGC CCGGGTCAGC
CCGCCCGGCA GCGTGGCGGT CCCCGAACTG TTCAAGGTCG AGACCTTCGC CAACGTCCAT
CACCTGGTCT CGACCGTCAC GGGCAAGTTG GCTCCGGGGC TGGCCGCCGC CGACCTGCTG
CGCGCCGCCT TTCCGCCCGG CTCGATCACC GGCGCGCCCA AGGTCCAGGC GATGAAGGTG
ATCGCCGAGC TGGAGACCCC GCGCGGACCC TATTGCGGTT CGCTGTTCTG GGCCGGGGTC
GATGGAGCCT TCGAATCGAG CGTTCTGATC CGCACCGTCG GCCTGGAACG GGACGAAACG
GGCTGGCGTC TGGAGGCGCG AGCCGGGGCC GGCATCGTCG CCGACAGCGA CCCCCAGGCC
GAGCGCCTGG AGACCGAGGC CAAGTTCGCC GCCCTGCGCC GCGGCCTGAT GGAGGAATCC
CGCGGATGA
 
Protein sequence
MASSPVGRYH PPMRHVVLIS APWRDPVWAA APWRDEPFAC ALISGGQAGR WSYLLRDPDA 
ALVLAADDPR DPFAALAELI GPRRGIEPDG PPFQGGVVGL AAYELGDRVE PLGLARTGWP
DLACARYPAL LAFDHLQRRV LAIGRGGSKG FAQARAEAAL AWLDAPSPPI NDGPLCEALS
VSDGEAYEAA VAQVVERIVD GEIFQANIAR AWTGRLNDGA HPFDLFARLR AESPAPFSAY
LRLPGRALVS NSPERFLKVD AREGGGDLAI ETRPIKGTRP RGADQAEDAR LIAELSASAK
DRAENLMIVD LMRNDLARVS PPGSVAVPEL FKVETFANVH HLVSTVTGKL APGLAAADLL
RAAFPPGSIT GAPKVQAMKV IAELETPRGP YCGSLFWAGV DGAFESSVLI RTVGLERDET
GWRLEARAGA GIVADSDPQA ERLETEAKFA ALRRGLMEES RG