Gene NATL1_21131 details


Gene Information

Locus tag: NATL1_21131
Symbol:
ID: 4781075
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Prochlorococcus marinus str. NATL1A
Kingdom: Bacteria
Replicon accession: NC_008819
Strand:
Start bp: 1766140
End bp: 1768920
Gene Length: 2781 bp
Protein Length: 926 aa
Translation table: 11
GC content: 34%
IMG OID: 640085410
Product: DNA mismatch repair protein MutS
Protein accession: YP_001015933
Protein GI: 124026818
COG category: [L] Replication, recombination and repair
COG ID: [COG0249] Mismatch repair ATPase (MutS family)
TIGRFAM ID: [TIGR01070] DNA mismatch repair protein MutS
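
The coordinate and length fields above are mutually consistent, which can be checked with a small sketch. It assumes the start and end positions are 1-based and inclusive, and that the coding sequence ends in a single stop codon that is not counted in the protein length. The function names gene_length and protein_length are hypothetical and chosen only for illustration.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for a CDS whose final codon is a stop codon (assumed)."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # drop the terminal stop codon


# Values taken from the record above.
length_bp = gene_length(1766140, 1768920)
print(length_bp)                  # 2781
print(protein_length(length_bp))  # 926
```

Both results match the Gene Length and Protein Length fields of the record.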


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.0652427
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 22
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGCTGCCA GCCCAAACCC ACTACAAGGC AGTCTATTTG AGGAAAGCAA GCAAAGCACT 
ACTAATGGAG GGAAAGAAAC CAATAATTCA ATAGGATCTT CAGAAAATCT TTCAAATCAA
CAACTAAAAA GCGATGCATC ACTAAGACCT CGTATCAGAA AAACGTCTAA AAATCCAAAT
CAAATAAATG ATCTAGATCA GTTATCTAAT GCGGAAATTG AAGAGCCAAA GTGGTCACAT
CACAATTTGC CAAAAATTGA TGATCTCACT CCTGCTCTAA AACATTATGT GCAATTAAAA
ATAGAGAATC CTGATCGAGT TCTGCTGTAT AGGCTTGGAG ACTTCTTTGA ATGTTTTTTT
GAAGATGCAA TAACACTTTC ACAGTTGCTG GAAATCACAC TCACAAGTAA AGAAGGTGGT
AAAAAAATCG GGAAAGTCCC TATGGCTGGA ATCCCTCACC ATGCATCTGA TCGTTATTGC
ACAGAACTTA TTAAAAAAGG GCTATCAATT GCTATTTGTG ATCAACTTGA AGCTGCTCCA
TCAAAAGGTA ATAAATTAAT AAAAAGAGGG ATAACTAGAT TAATAACCCC TGGAACAATT
ATAGAAGAGG GAATGCTGAG TGCTAAACAA AATAATTGGC TGGCTTCTGT TTTACTAGAA
TCCGACTCGA ACTCAGATTT TGTTAACTGG TCTTTAGCAA AGCTAGATGT AAGCACCGGT
GAATTTCTTG TGCAAGAAGG TAAAGAGACA AATAATTTGC GTCAAGAATT AATCAAACTT
AAAGCAGCTG AAGTAATATC AGAAAGCAAA TCAATCTCCA ATCAAAACTG GTATAAAGGA
TTAATAGAAA TAACCGAATT TAATCAAACA TCATTTTCTA GATTAGAAGC AAAGACAACT
ATTGAAAATC ATTATTTTCT CAATAATATT GATGGATTAG GTATTCATCC AGAATCTCTA
TCAATTAGAA CAATTGGAGG ATTAATAGCA TATTTAAATA AAACTCATCC AAATATTGGT
AATAACTTGA AAAATGAAGT AAAGACAAAT ATATGTATTG ATTTTCCTCA AATAAAACAT
AATCAGGCGG GGCTAATTAT AGATAATCAA ACAAGAAGAA ATTTAGAAAT CACATCAACT
CAAAAAAATG GTCAATTTCA GGGTTCATTA CTATGGGCAA TTGATAAAAC ATTAACTGCA
ATGGGTGGGA GATGTATTAG GAGGTGGTTA GAAGAGCCTT TAACAGAAAT TTATTCGATA
CAAAGCAGAC AAAAAATAAT TGGTTTACTC GTTGAATCAT CAAGCTTAAG AAAAAATATC
CGAAAAATAT TGAGAGCTAT GGGCGATTTA GAACGTCTAT CAGGCAGGGC AGGAGCCCAA
CAAGCAGGAG CACGTGATTT AATTGCTATC GCAGAAGGTA TTAACCGTTT ACCCTTAATT
AAAAAATATC TTAATGACCC AATATTTGAA GAAACAAAGT ATTTCGAATC TATTATAAAT
TTAGATAGGG ACTTAATAGA ACTTGCTTCA AAAATAAATA ATGAGATTAT AGATAATCCT
CCTCTTAGCC TTACAGAAGG TGGTCTTATT TTTGATGGTG TAAATCCTAT ACTTGATGGA
CTTAGGAATC AACTTGATGA TCATAACTCA TGGTTAAAAT CTCAAGAAAT AGAAGAAAGG
AAGAACAGCA ATATTAATAA TTTAAAGCTT CAGTATCATC GTTCGTTTGG ATACTTTTTA
GCAGTAAGTA AAGCAAAATC TATCAATGTT CCAGATCACT GGATAAGAAG ACAAACATTA
ACTAATGAAG AACGCTTTGT TACACCAGAA CTAAAAGAAC GAGAAGGAAA GATTTTCCAA
CTTAGAGCAC GTATATCACA ACTTGAATAT GATCTTTTTT GTAAGCTTAG AATTCTTGTT
GGGAATAAGT CAGATATTAT TAGAAAAGCT GCAAAAGCGA TTTCATGCTT AGATGTTTTA
TCTGGCTTAG CCGAATTAGC TGCTACAAAT AACTACATTC AACCTAAAAT TATTGACAAT
AAAGATTCTA CCAAAACAAG AAGGTTATCT ATTGTTGATG GTCGTCATCC TGTAGTTGAA
CAAATTCTTG TTGATAAATT TTTCGTCCCT AATGATATTG AACTTGGCTC TAAGACCGAT
CTAATTATTC TTTCAGGGCC AAACGCAAGT GGCAAAAGTT GCTATTTAAG ACAAGTAGGT
CTCTTACAAA TCATGGCTCA GATCGGTAGT TGGATCCCAG CTAAATCAGC AAATATTGGA
ATCGCTGATC AATTATTTAC ACGTGTTGGA GCAGTAGATG ATTTAGCTTC AGGCCAATCA
ACTTTCATGG TGGAAATGAT TGAAACTGCC TTCATTCTTA ATAATGCTAC TGAAAACTCA
TTAGTTTTAT TAGATGAAAT TGGAAGAGGA ACTTCAACTT TTGATGGGTT ATCTATTGCC
TGGTCAGTAA GCGAGTTTCT AGCAAAAAAA ATTAAAAGTC GTTCAATCTT TGCAACTCAT
TACCATGAAT TGAATCAAAT TTCTGAATAT ATTGAAAATG TCGAGAATTA CAAAGTTGTA
GTTGAATATA AAAATCATTC CCTTTCATTC CTTCACAAGG TCGAAAAAGG AGGAGCAAAT
AAAAGTTATG GAATTGAAGC TGCGAGGCTT GCGGGAGTCC CCCCAGACGT AGTCAATAAT
GCAAGATTGA TATTAAAAAA TCTAGAAAAA AATAACTCCA ACACCATTCA AATCACTAAG
CCAATTGAAA GTTGCAAATA A
 
Protein sequence
MAASPNPLQG SLFEESKQST TNGGKETNNS IGSSENLSNQ QLKSDASLRP RIRKTSKNPN 
QINDLDQLSN AEIEEPKWSH HNLPKIDDLT PALKHYVQLK IENPDRVLLY RLGDFFECFF
EDAITLSQLL EITLTSKEGG KKIGKVPMAG IPHHASDRYC TELIKKGLSI AICDQLEAAP
SKGNKLIKRG ITRLITPGTI IEEGMLSAKQ NNWLASVLLE SDSNSDFVNW SLAKLDVSTG
EFLVQEGKET NNLRQELIKL KAAEVISESK SISNQNWYKG LIEITEFNQT SFSRLEAKTT
IENHYFLNNI DGLGIHPESL SIRTIGGLIA YLNKTHPNIG NNLKNEVKTN ICIDFPQIKH
NQAGLIIDNQ TRRNLEITST QKNGQFQGSL LWAIDKTLTA MGGRCIRRWL EEPLTEIYSI
QSRQKIIGLL VESSSLRKNI RKILRAMGDL ERLSGRAGAQ QAGARDLIAI AEGINRLPLI
KKYLNDPIFE ETKYFESIIN LDRDLIELAS KINNEIIDNP PLSLTEGGLI FDGVNPILDG
LRNQLDDHNS WLKSQEIEER KNSNINNLKL QYHRSFGYFL AVSKAKSINV PDHWIRRQTL
TNEERFVTPE LKEREGKIFQ LRARISQLEY DLFCKLRILV GNKSDIIRKA AKAISCLDVL
SGLAELAATN NYIQPKIIDN KDSTKTRRLS IVDGRHPVVE QILVDKFFVP NDIELGSKTD
LIILSGPNAS GKSCYLRQVG LLQIMAQIGS WIPAKSANIG IADQLFTRVG AVDDLASGQS
TFMVEMIETA FILNNATENS LVLLDEIGRG TSTFDGLSIA WSVSEFLAKK IKSRSIFATH
YHELNQISEY IENVENYKVV VEYKNHSLSF LHKVEKGGAN KSYGIEAARL AGVPPDVVNN
ARLILKNLEK NNSNTIQITK PIESCK
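
The gene and protein sequences above are printed in space-separated blocks of 10 characters. The following is a minimal sketch of how such blocks might be joined, how GC content could be computed, and how the nucleotide sequence could be translated. It assumes the amino acid assignments of translation table 11, which are the same as those of the standard genetic code. The helper names clean_sequence, gc_content and translate are hypothetical.

```python
from itertools import product

# Codon table in TCAG order; table 11 assigns the same amino acids
# as the standard code ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean_sequence(text: str) -> str:
    """Remove the spaces and line breaks used to print 10-character blocks."""
    return "".join(text.split()).upper()


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First three blocks of the gene sequence above.
fragment = clean_sequence("ATGGCTGCCA GCCCAAACCC ACTACAAGGC")
print(translate(fragment))  # MAASPNPLQG
```

The translated fragment matches the first block of the protein sequence. Passing the full gene sequence through clean_sequence and gc_content would allow the reported GC content field to be checked in the same way.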