Gene CPF_0221 details


Gene Information

Locus tag: CPF_0221
Symbol:
ID: 4203507
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium perfringens ATCC 13124
Kingdom: Bacteria
Replicon accession: NC_008261
Strand:
Start bp: 269876
End bp: 271321
Gene Length: 1446 bp
Protein Length: 481 aa
Translation table: 11
GC content: 32%
IMG OID: 638081105
Product: arylsulfatase
Protein accession: YP_694684
Protein GI: 110799572
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
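
The coordinate and length fields in this record can be cross-checked with simple arithmetic. The sketch below is only an illustration, not part of the database record: it assumes the start and end positions are 1-based and inclusive, and that the final codon is a stop codon that is not counted in the protein length. The variable names are made up for this example.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 269876
end_bp = 271321
stated_gene_length = 1446      # bp
stated_protein_length = 481    # aa

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and is excluded from the protein.
protein_length = gene_length // 3 - 1

print(gene_length, protein_length)  # 1446 481
print(gene_length == stated_gene_length,
      protein_length == stated_protein_length)  # True True
```

Under these assumptions the computed values agree with the stated Gene Length and Protein Length.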


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.0847512
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAAGCCAA ATATTGTGTT AATCATGGTT GACCAGATGA GAGGAGATTG TCTAGGGGTT 
AATGGAAATG AATTTATAGA AACTCCAAAC TTAGATATGA TGGCCACTGA AGGATATAAC
TTTGAAAATG CTTATACAGC AGTTCCAAGT TGCATTGCAT CTAGGGCATC TATTTTAACA
GGCATGAGCC AAAAATCTCA TGGAAGAGTT GGGTATGAAG ATGGAGTTTC ATGGAATTAT
GAAAATACCA TAGCTTCAGA ATTTTCAAAA GCTGGTTATC ATACTCAATG TATTGGGAAA
ATGCATGTTT ATCCAGAGAG AAATCTTTGT GGATTTCATA ATATTATGTT ACATGATGGG
TATTTACACT TTGCCAGAAA TAAAGAAGGA AAAGCATCTA CTCAAATTGA ACAATGTGAT
GATTATTTAA AGTGGTTTAG AGAAAAGAAA GGGCATAATG TTGATTTAAT AGATATAGGA
CTTGATTGCA ATTCGTGGGT ATCTAGACCT TGGGGATATG AGGAAAACTT ACACCCTACT
AATTGGGTGG TTAATGAATC AATAGATTTT TTAAGAAGAA AAGATCCAAG TAAGCCATTC
TTTTTAAAGA TGTCCTTTGT TAGACCACAT TCCCCTTTAG ATCCACCAAA GTTTTATTTT
GATATGTATA AGGATGAGGA TTTACCAGAA CCTTTAATGG GAGATTGGGC TAATAAAGAG
GATGAAGAAA ATAGAGGAAA AGATATAAAT TGTGTAAAGG GAATAATTAA TAAGAAGGCA
TTAAAAAGAG CTAAAGCTGC TTATTATGGG TCAATAACTC ATATTGATCA TCAAATAGGG
AGATTTTTAA TAGCCTTATC AGAGTATGGG GAATTAAATA ATACAATATT CTTATTTGTT
TCTGACCATG GAGATATGAT GGGAGATCAT AATTGGTTTA GAAAGGGAAT TCCTTATGAA
GGAAGTTCTA GAGTTCCATT TTTTATTTAT GACCCAGGAA ACTTATTAAA AGGGAAAAAA
GGAAAAGTAT TTGATGAAGT TTTAGAGTTA AGAGATATTA TGCCAACTTT ATTAGACTTT
GCTCATATTT CTATACCTGA TTCAGTAGAA GGATTAAGCC TTAAGAATTT AATAGAGGAA
AGAAATTCTA CTTGGAGAGA TTATATTCAT GGAGAACATT CATTTGGAGA GGATTCTAAC
CACTACATTG TAACAAAGAG AGATAAGTTT TTGTGGTTTT CTCAAAGAGG GGAAGAACAA
TATTTTGATT TAGAGAATGA TCCAAAGGAA CTTACTAATC TTATAGATTC AGAAGAGTAT
AAAGAGAGAA TAGATTACTT AAGAAAAATA TTAATTAAAG AGCTTGAAGG AAGAGAAGAA
GGATATACTG ATGGAAATAG ACTTTTAAAA GGACATCCAG TAAGCACTTT AAAACATATA
AGATAA
 
Protein sequence
MKPNIVLIMV DQMRGDCLGV NGNEFIETPN LDMMATEGYN FENAYTAVPS CIASRASILT 
GMSQKSHGRV GYEDGVSWNY ENTIASEFSK AGYHTQCIGK MHVYPERNLC GFHNIMLHDG
YLHFARNKEG KASTQIEQCD DYLKWFREKK GHNVDLIDIG LDCNSWVSRP WGYEENLHPT
NWVVNESIDF LRRKDPSKPF FLKMSFVRPH SPLDPPKFYF DMYKDEDLPE PLMGDWANKE
DEENRGKDIN CVKGIINKKA LKRAKAAYYG SITHIDHQIG RFLIALSEYG ELNNTIFLFV
SDHGDMMGDH NWFRKGIPYE GSSRVPFFIY DPGNLLKGKK GKVFDEVLEL RDIMPTLLDF
AHISIPDSVE GLSLKNLIEE RNSTWRDYIH GEHSFGEDSN HYIVTKRDKF LWFSQRGEEQ
YFDLENDPKE LTNLIDSEEY KERIDYLRKI LIKELEGREE GYTDGNRLLK GHPVSTLKHI
R
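
The relationship between the gene sequence and the protein sequence can be illustrated with a small translation sketch. It is a minimal example under stated assumptions, not the tool used to produce this record: the codon assignments are those of the standard genetic code, which translation table 11 shares for internal codons (table 11 differs mainly in the start codons it permits, which does not matter here because the gene begins with ATG). The helper names `clean`, `translate`, and `gc_fraction` are hypothetical.

```python
# Standard codon assignments, enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq):
    """Remove the spaces and line breaks used to group bases in blocks of ten."""
    return "".join(seq.split()).upper()


def translate(seq):
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    seq = clean(seq)
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


# First row of the gene sequence listed above (60 bases).
first_row = "ATGAAGCCAA ATATTGTGTT AATCATGGTT GACCAGATGA GAGGAGATTG TCTAGGGGTT"
print(translate(first_row))  # MKPNIVLIMVDQMRGDCLGV
```

The printed peptide matches the first 20 residues of the protein sequence. Passing the full gene sequence to `translate` and `gc_fraction` would allow the listed protein and the stated GC content to be checked in the same way.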