Gene Information |
Locus tag | Haur_4078 |
Symbol | |
ID | 5735936 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Herpetosiphon aurantiacus ATCC 23779 |
Kingdom | Bacteria |
Replicon accession | NC_009972 |
Strand | - |
Start bp | 5208304 |
End bp | 5211090 |
Gene Length | 2787 bp |
Protein Length | 928 aa |
Translation table | 11 |
GC content | 52% |
IMG OID | 641281229 |
Product | DNA mismatch repair protein MutS |
Protein accession | YP_001546838 |
Protein GI | 159900591 |
COG category | [L] Replication, recombination and repair |
COG ID | [COG0249] Mismatch repair ATPase (MutS family) |
TIGRFAM ID | [TIGR01070] DNA mismatch repair protein MutS |
|
|
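The coordinate and length fields above are mutually consistent, which can be checked with a few lines of arithmetic. This is a minimal sketch, assuming that the start and end positions are 1-based and inclusive and that the CDS ends in a single stop codon that is not counted in the protein length. The function names `gene_length` and `protein_length` are illustrative and are not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, excluding one terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


# Values taken from the record for Haur_4078
length_bp = gene_length(5208304, 5211090)
print(length_bp)                  # 2787
print(protein_length(length_bp))  # 928
```

Both results match the Gene Length (2787 bp) and Protein Length (928 aa) rows of the record.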
Plasmid Coverage information |
Num covering plasmid clones | 7 |
Plasmid unclonability p-value | 0.980246 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGTCAAAAA TGACGATGTG GCAGCAATAT CTGTCGATTA AGCAAAAATA TGCCGATGTG ATTCTGTTTT TCCGGCTAGG CGATTTTTAC GAAACTTTCG GCGATGATGC CAAGTTGATT GCCGAGGTGC TTGATATTAC CCTGACGGTG CGTGGCCTCA GCAGCGATGA AAATACGCCG ATGGCGGGTG TGCCCTATCA CGCCGCCGAT AATTATATTG AGCAACTGGT CAGTCGGGGC TATCGCGTGG CAATCTGCGA ACAAATGGAT GAGATGGTGC ACAAAACCTT GCAAAAACGT GAGGTTGTGC GGATTGTCAC GCCAGGCACT CTGACCGAGC CAACCATGCT CCAAGCTGAA CGCAATAGCT ACCTTGCGGC AATTTTGGTT GATCGTGGCA ATGTTGGTTT GGCGTATGCC GACCTAACCA CCGGCGAGTT TTGTGCCACC GAATTGCGCG GCAACGAGGC TTTGAAGCAA CTTGAAGGCG AATTAGCGCG TTTGGGTGCG GCTGAACTCT TGGTTTCCGA TGCTCCTGAG TTGCGTCCAG CCGGCATGGA AATTGCCAAA AAGCAGTTAG CCCAAGATCT TGCACCAATG CGCAAGGCTG AACGCGAACG CTTGCTACCC CACGAACGCA CCGCCAAAAA AGTTGAAGGT AATAACGAAA GCACGTGGGT TCAAGGCAAT GTCACCCAAT GGCCTAATTG GCATTGGGAT GCCCGCACCG CTCGCGATGC CTTGCTCAAT CAATTTAAAA GCCAATCGCT TGATGGCTTT GGCTTGGGCA ATAAAGCCCT CGCAACCCGC GCCGCAGGAG CATTAATTCA ATATTTGCAT GAAACGCAGC GCGATAGTGT GGCCCAAGTT CGCAGCTTGC GGGTCTATGA TACAACTCGT TTTATGTTTC TCGACCCCCA AACGCGGCGT AATTTGGAGC TAACCGAAGG TGCTGGTGGC CAACGCAAAG GCTCGTTGAT TGCGGTGCTC GACCAAACCC GCACGCCGAT GGGTGCACGC CTGTTACGCC AATGGATCTC ACAACCGCTG ATTGAGCTTG GCCCACTGAC CGAGCGCCAA CAAGCCGTCA GTTGTTTTGT TGAAGAAACC TTGGTCCGCG GCGAGTTACG GGCCTTATTC AAAGGGGTTG GCGATATCGA ACGCACAATC AATCGGGTGG TGCAAGGCAT TGCCACCCCG CGTGATTTAG TGCGATTGCG CGAAGCGCTG CGCCTAACTC CCGATATTTT GAGCCAAATC GAGCGTACAG GTTTGCGCTC AACCAGCCCA ACCGAGGCTG CGCCAAGTGA TGATGATCTG TTTGATGACG AGCCAACAAG CAATCAAATC GATGCTTGTG CCGATATTTG CGAATTACTC GAACAGGCGA TTGCCGATGA TCCGCCCGCC TTGCTTGGCA CATGGGATAA CGCCCGCAGC GACGAAAATG TGATTCGCAA GGGTCATGCT GCCGAAATTG ATGCAATTGT TGAGGCTACT CGCGATGCCG CCCGTTGGAT CAACGAACTT GAAGCCAAGG AACAGCAACG CACTGGCATC AAAACGCTCA AAGTCAGCTA TAACAAAGTC TTTGGCTATT ACATCGAGGT AACCAAGGCC AGCGGCGAAA CCCGCATTCC CGATGATTAC ATTCGCAAAC AAACCTTGGT CAATGCCGAG CGCTATATCA CGCCAGAACT CAAAGAATAT GAATCGCTGA TTTTGAATGC CTCAGAAGCC TTGAACGAAA AAGAGCGCCA AGCATTTCGC CTGATTTTGC GCCATTTGGC CAACGCTGGC 
AATCGTTTGC TCGATTTAGC GCGAGCAATC GCCGAGTTTG ATGTCTATAG CACCTTGGCC GAGGTGGCGG TGCGTCAGCG TTTTGTGCGG CCAACCTTGC GCCTCGACGA TGTATTTGTG ATTCAAGGCG GGCGACATCC TGTGGTTGAG CACAATTTGA ACGAGCCATT TACCCCGAAT GATGCTCATT TTGATGCTGA CCATCAGATT ATTGTGCTGA CTGGACCAAA CATGTCGGGC AAAAGCACCT TTTTGCGCCA AGTGGCTTTG ATTGGCTTAA TGGCCCAAAT CGGCTCGTTT GTGCCCGCTG ATTATGCCGA AATTGGCCTA CTCGACCGAA TTTTCACGCG GATTGGGGCA CAAGACGATA TTGCTACCGG CCAATCGACC TTTATGGTTG AAATGATCGA GACCGCCAAT ATTTTACATA ATGGGTCACC ACGATCGCTG ATTATTCTCG ATGAAATTGG CCGTGGCACC AGCACCTACG ACGGGCTTTC GATTGCCCGC GCTGTGGTCG AATATATTCA TAATCAGCCG CGCTTACGAG CCAAAACCCT GTTTGCAACC CACTACCACG AACTGACCGA GCTGGCTAAC ATCTTGCCAC GGGTGCATAA TTGGACGTTG GCGGTGGCCG AAGAAGGCGA TCATGTAGTG TTTTTGCGCA AAGTGATCGA GGGTGCGGCT GATCGCTCAT ATGGAATTCA TGTGGCTCAA ATGGCGGGCT TGCCCCCAGC CGTGATTAAA CGTGCTACCG AAGTGCTGAG CGAGCTTGAA GGTAAGGGTG ATCGGGAGCA GCGCCGCGAG GCCATGCGTC GCATGAACGC AGCAGGCAGT TCGGCTGTGC CCCAAATGTC GCTATTTGCG AGCAACGAGC CAAATCCAGC GGTTGAGCTA TTGCGCGAAA TGGATGTAAC CCAACTAACC CCAATCGAAG CCCTAACCAA ACTTTACGAA TTACAACGTT TGGCTAAAGT GGAGTGA
|
Protein sequence | MSKMTMWQQY LSIKQKYADV ILFFRLGDFY ETFGDDAKLI AEVLDITLTV RGLSSDENTP MAGVPYHAAD NYIEQLVSRG YRVAICEQMD EMVHKTLQKR EVVRIVTPGT LTEPTMLQAE RNSYLAAILV DRGNVGLAYA DLTTGEFCAT ELRGNEALKQ LEGELARLGA AELLVSDAPE LRPAGMEIAK KQLAQDLAPM RKAERERLLP HERTAKKVEG NNESTWVQGN VTQWPNWHWD ARTARDALLN QFKSQSLDGF GLGNKALATR AAGALIQYLH ETQRDSVAQV RSLRVYDTTR FMFLDPQTRR NLELTEGAGG QRKGSLIAVL DQTRTPMGAR LLRQWISQPL IELGPLTERQ QAVSCFVEET LVRGELRALF KGVGDIERTI NRVVQGIATP RDLVRLREAL RLTPDILSQI ERTGLRSTSP TEAAPSDDDL FDDEPTSNQI DACADICELL EQAIADDPPA LLGTWDNARS DENVIRKGHA AEIDAIVEAT RDAARWINEL EAKEQQRTGI KTLKVSYNKV FGYYIEVTKA SGETRIPDDY IRKQTLVNAE RYITPELKEY ESLILNASEA LNEKERQAFR LILRHLANAG NRLLDLARAI AEFDVYSTLA EVAVRQRFVR PTLRLDDVFV IQGGRHPVVE HNLNEPFTPN DAHFDADHQI IVLTGPNMSG KSTFLRQVAL IGLMAQIGSF VPADYAEIGL LDRIFTRIGA QDDIATGQST FMVEMIETAN ILHNGSPRSL IILDEIGRGT STYDGLSIAR AVVEYIHNQP RLRAKTLFAT HYHELTELAN ILPRVHNWTL AVAEEGDHVV FLRKVIEGAA DRSYGIHVAQ MAGLPPAVIK RATEVLSELE GKGDREQRRE AMRRMNAAGS SAVPQMSLFA SNEPNPAVEL LREMDVTQLT PIEALTKLYE LQRLAKVE
|
| |
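The relationship between the gene sequence and the protein sequence can be illustrated by translating the first codons of the gene. This is a minimal sketch in plain Python, not the method used to generate the record. It relies on the fact that translation table 11 assigns the same amino acid to each codon as the standard code and differs only in which codons may act as initiators; since this gene starts with ATG, the alternative start codons are not handled here. The names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Codon table in NCBI order (bases T, C, A, G; third position varies fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence 5'->3', stopping at the first stop codon.

    Whitespace (such as the 10-base grouping used in the record) is ignored.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the gene sequence, copied from the record
print(translate("ATGTCAAAAA TGACGATGTG GCAGCAATAT"))  # MSKMTMWQQY
```

The output matches the first ten residues of the protein sequence listed above. The gene sequence is shown in coding orientation (it begins with ATG and ends with TGA), so no reverse complement is needed even though the gene lies on the minus strand.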