Gene Caul_3942 details

Gene Information

Locus tag: Caul_3942
Symbol:
ID: 5901404
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Caulobacter sp. K31
Kingdom: Bacteria
Replicon accession: NC_010338
Strand:
Start bp: 4266675
End bp: 4268696
Gene Length: 2022 bp
Protein Length: 673 aa
Translation table: 11
GC content: 71%
IMG OID: 641564463
Product: sulfotransferase
Protein accession: YP_001685565
Protein GI: 167647902
COG category: [R] General function prediction only
COG ID: [COG4783] Putative Zn-dependent protease, contains TPR repeats
TIGRFAM ID:
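
The coordinates and lengths listed above can be cross-checked with simple arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive, and that the CDS ends in a single stop codon that is not counted in the protein length; the function names are illustrative and not part of the source database.

```python
# Values copied from the Gene Information record.
START_BP = 4266675
END_BP = 4268696
GENE_LENGTH_BP = 2022
PROTEIN_LENGTH_AA = 673


def span_length(start: int, end: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(cds_length_bp: int) -> int:
    """Number of codons in the CDS minus one for the stop codon (assumption)."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


if __name__ == "__main__":
    print(span_length(START_BP, END_BP) == GENE_LENGTH_BP)
    print(expected_protein_length(GENE_LENGTH_BP) == PROTEIN_LENGTH_AA)
```

Under these assumptions, 4268696 - 4266675 + 1 gives the listed 2022 bp, and 2022 / 3 - 1 gives the listed 673 aa.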


Plasmid Coverage information

Num covering plasmid clones: 28
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 14
Fosmid unclonability p-value: 0.749375
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGTCGACCA CCACCACCGA GCCCTCCGGC AGCCTCGCCA CGGCGCTCGC CCACACCCAG 
CGCCTGCTGG CCGCCGATCC GGCCATGGCC GCCGAGCAGG CCCGCGCGAT CCTGGAGGCC
GTGCCGCGTC ACGCCGGGGC CACCCTGATG CTGGCCGCGG CCCTGCGGCT GTCGGGCGAC
CTCGACCAGG CCCTGGAGGT CGTCGATCCC CTGGCCCGCG CCCTGGTCCA GTCGCCGGAG
GTTCAGCTGG AGCACGGCTT GGTCCTGGCC CGCCTGGGCC AGACCCAGGC CGCGATCGCC
GCCTTCAAGC GCGCCACGGC CCTGGATCCC GACCTGGCCG AGGGTTGGCG GGGCCTCGCC
GAAGCGCTGG ACCTGGCCGG CGATGCGGCG GGCGCCCAGG CCGCCCAGGC CCGCCAGATC
AAGGCCGGCG TCCGCGATCC GGCCCTGATG AGCGCCGCCG CCGCCCTGGT CGATGGCAAG
CTGGGCGTCG CCGAGCAGAT CCTGCGCGAC GTGCTGCGGG TCCGGCCCGA CGAGCCGGCG
GCGATCCGGA TGCTGGCCGA GGTCGCCGCC CGGCTGGGTC GCCACGACGA CGCCGAGACC
CTGCTGGTCC GCTGCCTGGA GCTGGCCCCC GGCTTCACCG CCGCCCGTCA CAACCTGGCC
ACCGTGCTTT ACCGGCAGGG CCGCTCCGAG GACGCCCTGG TCGAGTTGGC GCAACTGCTG
GCCGGCGCGC CGCGCAACCC GGCCTATCTG AACCTCAAGG CCGCCGCCCT GGCCCGGATC
GGCGAATACG CCCAGGCCAT CGAGCTCCTG GAGGACGTGC TGGCCCGCTT CCCCCAGCAG
CCCAAGGGCT GGATGAGCTA TGGCCACGCG CTCAAGACCG TCGGCCGATC CGCCGACAGC
GTGGCCGCCT ACCGCAAGGC CGTCGACCAG GCCCCGTCGC TGGGCGAGGC CTGGTGGAGC
CTGGCCAACA TGAAGACCTA CCGCTTCGGC GACGCGGACC TGGCGGCGAT GGAAGCGGCG
CTGGCCCAGC AGGATCTCGG CGAGGACGAC CGCCTCCACC TGCACTACGC CCTGGGCAAG
GCCCACGAGG ACGCCGCCCG CTACGCCGAG TCCTTCGCCC ACTACGCCAG GGGCGCCGAC
CTGCGGCGGG CGCAGATCGC CTACGATCCC GGCGTGATCC GCGAGCATGT GGCGCGCGGC
AAGGCGGTCC TGACGGCGGA CCTGTTCGCG GCGCGGGCCG GCCAGGGCTG CCCCGCGCCC
GACCCGATCT TCATCCTGGG CCTGCCGCGG TCCGGCTCGA CCCTGATCGA ACAGATCCTG
GCCAGCCACT CGGCGGTCGA GGGCACGATG GAACTGCCCG ACATCACCTC GATGGCCCGC
CGGCTGAGCG GGGCCAAGAC CAGCAAGGAG GCCTCGGCCT ATCCCGAGAT CCTGGCGACC
CTGGGTCCGG AGGATCTCAA GGCGCTGGGC GAGGAGTTTC TCGAGCGCAC CCGGGTGCAG
CGCAAGACCG CCCGGCCGCT GTTCATCGAC AAGATGCCCA ACAACTGGGC CCATGTCGGG
CTGATCGCGC TGATGCTGCC CAACGCCAAG ATCATCGACG CGCGTCGCCA CCCGATGGGC
TGCTGCTTCT CGGGCTTCAA GCAGCACTTC GCCCGGGGCC AGAACTTCAG CTACGGCCTG
GACGACATCG GCCGCTACTA CGCCGACTAT GTCGAGCTGA TGGCCCATTT TGACGCCGTG
CTGCCGAGCC GCGTGCACCG GGTGATCTAC GAAGAGATGG TCGAGGATCC AGAAACCCAG
ATCCGCGCCC TGCTGGACTA TTGCGGCCTG CCGTTCGAAG CCGCCTGCCT GAACTTCCAC
GAAAACGACC GCGCCGTGCG GACCGCAAGT TCAGAACAGG TCCGCCGGCC GATCTTCAAG
GACGCGGTCG AGCATTGGCA GAACTACGAA TCGTGGCTGG GGCCGCTGAA GACCGCCCTG
GGTCCGGTCT TGGCCAGCTA CCCGGCCGCG CCTGAATTTT GA
 
Protein sequence
MSTTTTEPSG SLATALAHTQ RLLAADPAMA AEQARAILEA VPRHAGATLM LAAALRLSGD 
LDQALEVVDP LARALVQSPE VQLEHGLVLA RLGQTQAAIA AFKRATALDP DLAEGWRGLA
EALDLAGDAA GAQAAQARQI KAGVRDPALM SAAAALVDGK LGVAEQILRD VLRVRPDEPA
AIRMLAEVAA RLGRHDDAET LLVRCLELAP GFTAARHNLA TVLYRQGRSE DALVELAQLL
AGAPRNPAYL NLKAAALARI GEYAQAIELL EDVLARFPQQ PKGWMSYGHA LKTVGRSADS
VAAYRKAVDQ APSLGEAWWS LANMKTYRFG DADLAAMEAA LAQQDLGEDD RLHLHYALGK
AHEDAARYAE SFAHYARGAD LRRAQIAYDP GVIREHVARG KAVLTADLFA ARAGQGCPAP
DPIFILGLPR SGSTLIEQIL ASHSAVEGTM ELPDITSMAR RLSGAKTSKE ASAYPEILAT
LGPEDLKALG EEFLERTRVQ RKTARPLFID KMPNNWAHVG LIALMLPNAK IIDARRHPMG
CCFSGFKQHF ARGQNFSYGL DDIGRYYADY VELMAHFDAV LPSRVHRVIY EEMVEDPETQ
IRALLDYCGL PFEAACLNFH ENDRAVRTAS SEQVRRPIFK DAVEHWQNYE SWLGPLKTAL
GPVLASYPAA PEF
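
As a rough illustration of how the protein sequence relates to the gene sequence, the following sketch translates DNA written in the 10-base blocks used above and computes its GC fraction. It assumes the standard codon assignments, which translation table 11 shares with table 1 for internal codons, and it does not handle alternative start codons; `translate` and `gc_fraction` are hypothetical helper names, not functions of any particular library.

```python
from itertools import product

# Codon table in NCBI order (bases T, C, A, G; third position varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used for display and upper-case the sequence."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate in frame from the first base, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def gc_fraction(dna: str) -> float:
    """Fraction of bases that are G or C."""
    dna = clean(dna)
    return (dna.count("G") + dna.count("C")) / len(dna)


if __name__ == "__main__":
    first_row = "ATGTCGACCA CCACCACCGA GCCCTCCGGC AGCCTCGCCA CGGCGCTCGC CCACACCCAG"
    print(translate(first_row))
```

Applied to the first row of the gene sequence, this yields the first 20 residues of the protein sequence (MSTTTTEPSG SLATALAHTQ); applied to the final row, it stops at the TGA codon after PEF.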