Gene Acid345_2021 details

Gene Information

Locus tag: Acid345_2021
Symbol:
ID: 4070351
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Candidatus Koribacter versatilis Ellin345
Kingdom: Bacteria
Replicon accession: NC_008009
Strand:
Start bp: 2419085
End bp: 2420365
Gene Length: 1281 bp
Protein Length: 426 aa
Translation table: 11
GC content: 59%
IMG OID: 637984035
Product: TPR repeat-containing protein
Protein accession: YP_591096
Protein GI: 94969048
COG category: [R] General function prediction only
COG ID: [COG4783] Putative Zn-dependent protease, contains TPR repeats
TIGRFAM ID:
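
The start and end coordinates, gene length, and protein length in the table above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the final codon of the CDS is a stop codon; the function names are illustrative and not part of any database API.

```python
# Values copied from the Gene Information table.
START_BP = 2419085
END_BP = 2420365


def gene_length(start: int, end: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


def protein_length(cds_length: int) -> int:
    """Residues encoded by a CDS, assuming its last codon is a stop codon."""
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length // 3 - 1


length_bp = gene_length(START_BP, END_BP)
print(length_bp)                  # 1281
print(protein_length(length_bp))  # 426
```

Both results agree with the reported gene length (1281 bp) and protein length (426 aa).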


Plasmid Coverage information

Num covering plasmid clones: 18
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 19
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGCGTCGTG CGTGGCTGGT GGCAATTCTG GCGCTTGGCT GTGCATCGGC GCAAACGCGT 
CAGCCTGCGA AGGTGGCTGC AGCGTCTCCG GTGGACCAGG CCGAAACGGC GATCGGCAAG
CAGGATTGGG CGGCGGCCGA GACACTTCTG AAGGACGCCA CCTCGCAGGA GCCCAAGGAC
TACCGGGCGT GGTTCGACCT CGGCTACGTT TACACCTCCC AAGACAAGAC CCGAGAGGCA
GCCGAGGCAT ACCGGTATTC AGTCGACGCC AAGCCGGACA TCTTTGAGAG CAATCTGAAT
CTCGGTATCT CACTTGCGAA ACTTGGCAAT CCAGACGCGG CGAAATACCT GGCCGTGGCT
ACGACGCTGA AACCCACAAG CCATCCGGAA GAGGGATATT TCCGGGCATG GCTGTCGCTG
GGGCACGTTC TGAGCAAGGA ATCCCCGCAG CGGGCGGCCG AGGCCTATCA GCAGGCCGCG
AAGTTCAAGT CGAAAGATCC GGAGCCGCAT CTGAGCGCGG CGCAGATGTA TGAAATAGCG
AAGGACACGG CGGGTGCGGA GCGCGAGTAT CAGGTAGTCC TGGCGCTGGA TCCTGGCTCA
AAAGAGGCAA TCACCGGGCT GGCGAACATC TACCTGAACG CGAAACGGCT ACCAGAATCC
GAGACCATGC TGCGGAAGAT TCTGGCGGGC GATCCAACGA ACAGCAACGC ACAGCTGCAG
TTGGCGCGGG TTCTGGCAGC TGAGAACAAG GACGATGACG CGACGGCCGC GTACGACGCC
GCTCTCAAGC TGCTTCCGAA TGACGGGGAA GCCCAGAAGT CGGCAGCGGA TTTCTATCTT
GCGGCCAAGA AGTATAAAGA GGCGGCGGCG GCCTACGCAC AATTGGTGCA GGCGAAGCCG
AATGACGCCG CCCTGCGTGA GTTGTACGGG AATGCCCTAC TGCGGCTTCA TAAGAACGCG
GAGGCGCAGG AGCAGGCGCT GATTGCTATC AAGCTGAATC CGAACATGGG AGAGGCGTAC
AACGACCTGG CGTTTGCTGC CGCCGAGAAC AAAGATTATG CGCTGTCGCT CAAGGCGTTG
GACGCGCGGG CAAAGTTCTA TCCGGAAAAT CAGGGTACCT ACTTCCTTCG TGCGACGAAT
TACGATAATC TCCGCTTAGT AAAAGACGCA ATCGCGGCCT ATAAGAAGTT CCTGGCGGTA
TCGGATGGCA AATTTCCCGA CCAGGAATGG CAGGCGCGGC ACCGTCTTAT CGCAATAGAC
CCTGAATCGA GAAAGAAATG A
 
Protein sequence
MRRAWLVAIL ALGCASAQTR QPAKVAAASP VDQAETAIGK QDWAAAETLL KDATSQEPKD 
YRAWFDLGYV YTSQDKTREA AEAYRYSVDA KPDIFESNLN LGISLAKLGN PDAAKYLAVA
TTLKPTSHPE EGYFRAWLSL GHVLSKESPQ RAAEAYQQAA KFKSKDPEPH LSAAQMYEIA
KDTAGAEREY QVVLALDPGS KEAITGLANI YLNAKRLPES ETMLRKILAG DPTNSNAQLQ
LARVLAAENK DDDATAAYDA ALKLLPNDGE AQKSAADFYL AAKKYKEAAA AYAQLVQAKP
NDAALRELYG NALLRLHKNA EAQEQALIAI KLNPNMGEAY NDLAFAAAEN KDYALSLKAL
DARAKFYPEN QGTYFLRATN YDNLRLVKDA IAAYKKFLAV SDGKFPDQEW QARHRLIAID
PESRKK
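
The sequences above are printed in blocks of ten characters, so the spaces and line breaks must be removed before any analysis. The following is a minimal sketch of that cleanup together with a GC-fraction calculation and a basic check that a DNA string looks like a complete CDS. The helper names are hypothetical, and the CDS check only accepts ATG as the start codon (the one this gene uses), although translation table 11 also allows alternative start codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons in translation table 11


def clean_sequence(text: str) -> str:
    """Remove spaces and line breaks from a block-formatted sequence."""
    return "".join(text.split()).upper()


def gc_fraction(dna: str) -> float:
    """Fraction of G and C bases in a DNA string."""
    if not dna:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in dna) / len(dna)


def looks_like_complete_cds(dna: str) -> bool:
    """True if the length is a multiple of 3, the first codon is ATG,
    and the last codon is a stop codon. Simplified: alternative start
    codons are not accepted."""
    return len(dna) % 3 == 0 and dna[:3] == "ATG" and dna[-3:] in STOP_CODONS


# First two blocks of the gene sequence listing, used as a short example.
fragment = clean_sequence("ATGCGTCGTG CGTGGCTGGT")
print(fragment)               # ATGCGTCGTGCGTGGCTGGT
print(gc_fraction(fragment))  # 0.65
```

Passing the full gene sequence listing through `clean_sequence` should yield a 1281-character string, which can then be compared against the reported GC content and checked with `looks_like_complete_cds`.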