Gene Haur_4078 details

Gene Information

Locus tag: Haur_4078
Symbol:
ID: 5735936
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Herpetosiphon aurantiacus ATCC 23779
Kingdom: Bacteria
Replicon accession: NC_009972
Strand:
Start bp: 5208304
End bp: 5211090
Gene Length: 2787 bp
Protein Length: 928 aa
Translation table: 11
GC content: 52%
IMG OID: 641281229
Product: DNA mismatch repair protein MutS
Protein accession: YP_001546838
Protein GI: 159900591
COG category: [L] Replication, recombination and repair
COG ID: [COG0249] Mismatch repair ATPase (MutS family)
TIGRFAM ID: [TIGR01070] DNA mismatch repair protein MutS
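
The start, end, and length fields above are mutually consistent, which a short check can confirm. The following is a minimal Python sketch and not part of the original record; the variable names are illustrative, and it assumes inclusive 1-based coordinates and a terminal stop codon that is not counted in the protein length.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 5208304
end_bp = 5211090

# Assumption: coordinates are inclusive, so both end points count.
gene_length_bp = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and encodes no residue.
codon_count = gene_length_bp // 3
protein_length_aa = codon_count - 1

print(gene_length_bp, protein_length_aa)
```

Under these assumptions the computed values agree with the reported Gene Length (2787 bp) and Protein Length (928 aa).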


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.980246
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGTCAAAAA TGACGATGTG GCAGCAATAT CTGTCGATTA AGCAAAAATA TGCCGATGTG 
ATTCTGTTTT TCCGGCTAGG CGATTTTTAC GAAACTTTCG GCGATGATGC CAAGTTGATT
GCCGAGGTGC TTGATATTAC CCTGACGGTG CGTGGCCTCA GCAGCGATGA AAATACGCCG
ATGGCGGGTG TGCCCTATCA CGCCGCCGAT AATTATATTG AGCAACTGGT CAGTCGGGGC
TATCGCGTGG CAATCTGCGA ACAAATGGAT GAGATGGTGC ACAAAACCTT GCAAAAACGT
GAGGTTGTGC GGATTGTCAC GCCAGGCACT CTGACCGAGC CAACCATGCT CCAAGCTGAA
CGCAATAGCT ACCTTGCGGC AATTTTGGTT GATCGTGGCA ATGTTGGTTT GGCGTATGCC
GACCTAACCA CCGGCGAGTT TTGTGCCACC GAATTGCGCG GCAACGAGGC TTTGAAGCAA
CTTGAAGGCG AATTAGCGCG TTTGGGTGCG GCTGAACTCT TGGTTTCCGA TGCTCCTGAG
TTGCGTCCAG CCGGCATGGA AATTGCCAAA AAGCAGTTAG CCCAAGATCT TGCACCAATG
CGCAAGGCTG AACGCGAACG CTTGCTACCC CACGAACGCA CCGCCAAAAA AGTTGAAGGT
AATAACGAAA GCACGTGGGT TCAAGGCAAT GTCACCCAAT GGCCTAATTG GCATTGGGAT
GCCCGCACCG CTCGCGATGC CTTGCTCAAT CAATTTAAAA GCCAATCGCT TGATGGCTTT
GGCTTGGGCA ATAAAGCCCT CGCAACCCGC GCCGCAGGAG CATTAATTCA ATATTTGCAT
GAAACGCAGC GCGATAGTGT GGCCCAAGTT CGCAGCTTGC GGGTCTATGA TACAACTCGT
TTTATGTTTC TCGACCCCCA AACGCGGCGT AATTTGGAGC TAACCGAAGG TGCTGGTGGC
CAACGCAAAG GCTCGTTGAT TGCGGTGCTC GACCAAACCC GCACGCCGAT GGGTGCACGC
CTGTTACGCC AATGGATCTC ACAACCGCTG ATTGAGCTTG GCCCACTGAC CGAGCGCCAA
CAAGCCGTCA GTTGTTTTGT TGAAGAAACC TTGGTCCGCG GCGAGTTACG GGCCTTATTC
AAAGGGGTTG GCGATATCGA ACGCACAATC AATCGGGTGG TGCAAGGCAT TGCCACCCCG
CGTGATTTAG TGCGATTGCG CGAAGCGCTG CGCCTAACTC CCGATATTTT GAGCCAAATC
GAGCGTACAG GTTTGCGCTC AACCAGCCCA ACCGAGGCTG CGCCAAGTGA TGATGATCTG
TTTGATGACG AGCCAACAAG CAATCAAATC GATGCTTGTG CCGATATTTG CGAATTACTC
GAACAGGCGA TTGCCGATGA TCCGCCCGCC TTGCTTGGCA CATGGGATAA CGCCCGCAGC
GACGAAAATG TGATTCGCAA GGGTCATGCT GCCGAAATTG ATGCAATTGT TGAGGCTACT
CGCGATGCCG CCCGTTGGAT CAACGAACTT GAAGCCAAGG AACAGCAACG CACTGGCATC
AAAACGCTCA AAGTCAGCTA TAACAAAGTC TTTGGCTATT ACATCGAGGT AACCAAGGCC
AGCGGCGAAA CCCGCATTCC CGATGATTAC ATTCGCAAAC AAACCTTGGT CAATGCCGAG
CGCTATATCA CGCCAGAACT CAAAGAATAT GAATCGCTGA TTTTGAATGC CTCAGAAGCC
TTGAACGAAA AAGAGCGCCA AGCATTTCGC CTGATTTTGC GCCATTTGGC CAACGCTGGC
AATCGTTTGC TCGATTTAGC GCGAGCAATC GCCGAGTTTG ATGTCTATAG CACCTTGGCC
GAGGTGGCGG TGCGTCAGCG TTTTGTGCGG CCAACCTTGC GCCTCGACGA TGTATTTGTG
ATTCAAGGCG GGCGACATCC TGTGGTTGAG CACAATTTGA ACGAGCCATT TACCCCGAAT
GATGCTCATT TTGATGCTGA CCATCAGATT ATTGTGCTGA CTGGACCAAA CATGTCGGGC
AAAAGCACCT TTTTGCGCCA AGTGGCTTTG ATTGGCTTAA TGGCCCAAAT CGGCTCGTTT
GTGCCCGCTG ATTATGCCGA AATTGGCCTA CTCGACCGAA TTTTCACGCG GATTGGGGCA
CAAGACGATA TTGCTACCGG CCAATCGACC TTTATGGTTG AAATGATCGA GACCGCCAAT
ATTTTACATA ATGGGTCACC ACGATCGCTG ATTATTCTCG ATGAAATTGG CCGTGGCACC
AGCACCTACG ACGGGCTTTC GATTGCCCGC GCTGTGGTCG AATATATTCA TAATCAGCCG
CGCTTACGAG CCAAAACCCT GTTTGCAACC CACTACCACG AACTGACCGA GCTGGCTAAC
ATCTTGCCAC GGGTGCATAA TTGGACGTTG GCGGTGGCCG AAGAAGGCGA TCATGTAGTG
TTTTTGCGCA AAGTGATCGA GGGTGCGGCT GATCGCTCAT ATGGAATTCA TGTGGCTCAA
ATGGCGGGCT TGCCCCCAGC CGTGATTAAA CGTGCTACCG AAGTGCTGAG CGAGCTTGAA
GGTAAGGGTG ATCGGGAGCA GCGCCGCGAG GCCATGCGTC GCATGAACGC AGCAGGCAGT
TCGGCTGTGC CCCAAATGTC GCTATTTGCG AGCAACGAGC CAAATCCAGC GGTTGAGCTA
TTGCGCGAAA TGGATGTAAC CCAACTAACC CCAATCGAAG CCCTAACCAA ACTTTACGAA
TTACAACGTT TGGCTAAAGT GGAGTGA
 
Protein sequence
MSKMTMWQQY LSIKQKYADV ILFFRLGDFY ETFGDDAKLI AEVLDITLTV RGLSSDENTP 
MAGVPYHAAD NYIEQLVSRG YRVAICEQMD EMVHKTLQKR EVVRIVTPGT LTEPTMLQAE
RNSYLAAILV DRGNVGLAYA DLTTGEFCAT ELRGNEALKQ LEGELARLGA AELLVSDAPE
LRPAGMEIAK KQLAQDLAPM RKAERERLLP HERTAKKVEG NNESTWVQGN VTQWPNWHWD
ARTARDALLN QFKSQSLDGF GLGNKALATR AAGALIQYLH ETQRDSVAQV RSLRVYDTTR
FMFLDPQTRR NLELTEGAGG QRKGSLIAVL DQTRTPMGAR LLRQWISQPL IELGPLTERQ
QAVSCFVEET LVRGELRALF KGVGDIERTI NRVVQGIATP RDLVRLREAL RLTPDILSQI
ERTGLRSTSP TEAAPSDDDL FDDEPTSNQI DACADICELL EQAIADDPPA LLGTWDNARS
DENVIRKGHA AEIDAIVEAT RDAARWINEL EAKEQQRTGI KTLKVSYNKV FGYYIEVTKA
SGETRIPDDY IRKQTLVNAE RYITPELKEY ESLILNASEA LNEKERQAFR LILRHLANAG
NRLLDLARAI AEFDVYSTLA EVAVRQRFVR PTLRLDDVFV IQGGRHPVVE HNLNEPFTPN
DAHFDADHQI IVLTGPNMSG KSTFLRQVAL IGLMAQIGSF VPADYAEIGL LDRIFTRIGA
QDDIATGQST FMVEMIETAN ILHNGSPRSL IILDEIGRGT STYDGLSIAR AVVEYIHNQP
RLRAKTLFAT HYHELTELAN ILPRVHNWTL AVAEEGDHVV FLRKVIEGAA DRSYGIHVAQ
MAGLPPAVIK RATEVLSELE GKGDREQRRE AMRRMNAAGS SAVPQMSLFA SNEPNPAVEL
LREMDVTQLT PIEALTKLYE LQRLAKVE
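
The relationship between the gene sequence and the protein sequence can be spot-checked by translating the first and last lines of the gene sequence. This is a hedged Python sketch, not the method used to produce this record: the `translate` helper and its names are hypothetical, and it relies on the fact that translation table 11 uses the same codon-to-amino-acid assignments as the standard code (the tables differ in which codons may act as start codons, which does not matter here because this gene begins with ATG).

```python
from itertools import product

# Codon table in the conventional T, C, A, G ordering; amino acid
# assignments are those shared by the standard code and table 11.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon.

    Whitespace (such as the 10-base grouping used in this record) is ignored.
    """
    dna = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        residues.append(aa)
    return "".join(residues)


# First 30 bases and the final line of the gene sequence, copied from above.
# The final line starts at base 2761, so it is in frame.
gene_start = "ATGTCAAAAA TGACGATGTG GCAGCAATAT"
gene_end = "TTACAACGTT TGGCTAAAGT GGAGTGA"

print(translate(gene_start))
print(translate(gene_end))
```

The two translations correspond to the first ten residues (MSKMTMWQQY) and the last eight residues (LQRLAKVE) of the protein sequence listed above.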