Gene EcSMS35_4024 details

Gene Information

Locus tag: EcSMS35_4024
Symbol:
ID: 6144934
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4104190
End bp: 4112991
Gene Length: 8802 bp
Protein Length: 2933 aa
Translation table: 11
GC content: 54%
IMG OID: 641618849
Product: putative invasin
Protein accession: YP_001745987
Protein GI: 170681141
COG category:
COG ID:
TIGRFAM ID:
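
The coordinate and length fields above can be cross-checked against each other. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the final codon is a stop codon that is not counted in the protein length; neither assumption is stated in the record, and the variable names are illustrative only.

```python
# Values copied from the Gene Information fields.
start_bp = 4104190
end_bp = 4112991
gene_length_bp = 8802
protein_length_aa = 2933

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span_bp = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and is excluded from the protein length.
total_codons = span_bp // 3
coding_codons = total_codons - 1

print("span matches gene length:", span_bp == gene_length_bp)
print("span is a whole number of codons:", span_bp % 3 == 0)
print("codons minus stop match protein length:", coding_codons == protein_length_aa)
```

Under those assumptions, all three checks hold for this record.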


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.43506
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 35
Fosmid unclonability p-value: 0.407201
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGCAGGGA AAGCGCATGG GAATGGCGAT CGTCGGGGCG ATAACACCAT ATGTGGATTG 
GGGGATCGTT TACGTCGTCT GACCGCCGGT ATTTGCCTGA TAACACAAAC TATTTTCCCT
GTTATGGCTG CTGCCCCTAC GCATATTAAT TCTGCGCACT CGGATACTGC CGCGTCACTG
ATCCTGCCGA ACGTTAAAAC TATTCCATAT ACCCTGGGTG CGTTGGAATC GCCACCAACG
GTCGCGGCAC GCTTTGGTAT CACCGTTGAT GAATTACGTC GCCTGAATCA GTTCCGTACC
TTTGCGCGGG GCTTTGACAA TGTGCGTCAG GGCGATGAAA TAGATGTGCC CCTCATTAAC
AGTAATAGCC CTGAAGCGCG CAACCTGAAA GCAATGCAGA TGGAGCGCGA CGGCAAAGAT
CCGCAAATGC AGGTCGCGGA AATGGCGCAG CAAAGCGGAA CACTTTTAGC ACGCGATATG
GACAGTGAAC AGGCGGCGAG TATGGCGCGT GGCTGGGTGG CTTCTTCTGC CTCTGCGCAG
GCAACGGACT GGCTCAGTCG TTGGGGAACT GCCCGGGTAT CTCTTGGTGT GGATGAAGAT
TTCAGCCTTA AAAGCTCGTC ATTTGAATTT CTCCATCCCT GGTACGAAAC GCCGGATAAC
CTGGTCTTCA GCCAGCACAC ACTTCACCGT ACGGATGATC GCACTCAAAC CAACCACGGT
ATCGGCTGGC GTTATTTCAC ATCGAGCTGG ATGTCTGGCG TCAATATGTT TATCGATCAT
GACCTGACCC GCTATCACAC CCGAACCGGG ATGGGCGTTG AATACTGGCG CGATTACCTG
AAACTGAGCG GCAACGGTTA TCTGCGCCTC AGTAACTGGC GTAGTGCACC GGAACTCGAC
AATGACTATG AAGCACGTCC GGCGAACGGG TGGGATCTTC GCGCCGAAGG CTGGTTACCG
GCCTGGCCGC AGCTGGGCGG CAAGCTGGTC TATGAACAAT ATTATGGCGA TGAAGTGGCG
CTGTTTGGTA AAGACGAGCG ACAAAATGAT CCTCACGCGA TTACCGCAGG TTTGAGTTAC
ACCCCCGTTC CCCTGATCTC CTTTAGTGCC GAGCAACGGC AGGGCAAACA GGGAGAAAAC
GACACCCGAA TTGGTATGGA GTTGACGTTG CAACCCGGCC ATTCACTGCA AAAACAGCTC
GATCCGGCTG AAGTTGCAGC ACGACGCAGC CTTGTGGGCA GTCGCTATGA CCTGGTCGAT
CGTAATAACA ATATTGTCCT CGAATATCGC AAGAAAGAGC TGGTGCGGCT GACGCTGACC
GATCCGTTGA AAGGCAAGCC GGGTGAGGTT AAATCGCTGG TTTCTTCCTT GCAAACCAAA
TATGCGCTGA AAGGTTATGA CATTGAAGCC GCCTCCCTGC AATCAGCGGG CGGAAAAGTA
GCTGTTTCCG GCAAGGATAT TCAGGTCACT ATTCCTCCCT ACCGATTCAC GGCAATGCCT
GAAACCGACA ACACTTACCC GATAGCGGTT ACAGCGGAAG ATTCGAAAGG AAATTTCTCC
CGGCGTGAAG AAAGTATGGT GGTCGTGGAA AAACCCACAC TAAGCCTCAC CGATTCAACG
CTTTCTGTCG ATCAGCAAAT CCTGCTTGCT GATGGAAAAT CAACCTCAAC GCTGACCTAT
ACCGCGCGAG ATTCCAGCGG CAAGCCAATC CCAGGTATGA CGTTAAAAAC ACAGGTGAAA
GGTTTGCAGG ATTTCGCCTT AAGCGAATGG AAAGACAACG GTAACGGAAC CTATACCCAA
ATAGTTACCG CCGGGAAGAC GTCTGGCGCG TTATCGCTGA TGCCGCAATT TAACGGCGAC
GATATCGCTA AGACGCCAGC GCTGATTGCT ATCGTGGCGA ATACGGCATC CCGCGCTGAC
TCCACGATTG AGACAGACCA GGATAATTAC GTCGCGGGTA AGCCGATAGT CGTTAAAGTG
ACCCTGAGAG ATGATAACGG CAACGGGGTT ACCGGACGAA AAGAACTTCT CAAACAAACC
GTTAAAGTTG ACAACACGAA GGCTGATGAC GTTAGTGCCT GGACGGAAGA AAGTGAGGGA
ATTTACAAAG CAAGCTATAC GGCCCATCTC ATTGGCGACA AATTAACGGC CCAGCTGACG
ATGCCCGGCT GGCAAACCAA ACATTCGGAT GCGTTCAGTA TTGCGGGTGA TAAAGATACC
GCCAAAATCG CTGCGATGCA GATCACCGCG AATAACGCAG TTGCCAGACG CGATCACAAT
ACCGTCGCTG TGACGGTCAG GGATGTGCAT CAAAATTTAC TGCAGGGACA AAATGTTACC
TTCACGGTCG TGAATGGTGC GGCGGTTTTT GCCGATCCAA ATGGCGGTAT TGTGACGACA
GATAAAGACG GGATTGCCAG CGTTAACCTT GCCAGTGACC AGGCGGTAAA CAGTCTTATC
AAGGCCGAGA TCAATGGCAG TAGTCAGTCT GTGGAAGTGA GCTTTATCAC TGGCGATATC
TCGCAGCTGA CCTCCACCAT CAAGACGGAC GATGTGTCAT ACACCGCGGG TGGCAAGATA
AAAGTGTCAG TTACCCTTAT GGATGAGCAG AAAAACCTGG TGAAAGGGAT GGCTTCGCTG
CTGGCGGGTA GCAGTGTGGT CGAGGTGAGC GGTACAGATA AAAATGAAAC GGGTAACTGG
AGCGAAGAGT CTGATGGCGT TTACACCACC ACCCGAACCG CGAAGATTGC TGGCGATCGC
CATTACGCCA CGCTGAAACT ATCGACATGG AGTAGCGCGC AGCAATCCGA TGCGTATGCC
ATCCGCGAAA GCGGAGCTGT GTTGGCTTAT TCCTCCATCG TTACCGACAA AACGGCCTAC
ACCGCGGGCG GCGCGATAAA AGTTACGGTA ACACTCAAGG ATAGCTATGA AAACCTGGTC
GGGGGGCAAC GAGACGCTAT CAACCTGGCT ATCCAGCTGC CAAATACAAA AGCGGAAAGC
ATTGCCTGGA ATGAAGATCA AAAAGGAATT TATACCGCAA CGTATACCGC CTTACTCCCT
GGCACCGGTT TAAAAGCACA GCTACAGATG TCCGGCTGGG CGAACGCTCT GACTTCAAAT
GATTACAGTA TTAGTGGCGA TGCCGCCAGC GCACAAATTG TTGCGATGCA AGTGACAACG
GGTAACCCGG ATGTGCTGGC GAATGGCAGC GATCGACATA CGGTGAATGT CCGGGTGGAA
GATCAGTTTG GCAATGTGCT GTCGGAGCAG ACCGTCACCT TTACGGTGAC GAAGGGGGCG
GCTGTATTTG CCAATGCAGG GCAGAGCGCA GATATCAGGA CGGATGCGCA TGGCATGGCA
GAGGTTGATC TGAGCAGTAC CGTAGCGGAT GCCAGCACTG TTGAGGCTAA GATCAATCAA
AGCAGCGATA GTAAAACGGT GAACTTTGTT GCTGATGTTA GCACGGCGCA AGTAGCGGAG
CTGGTGGTGA CGCAGGACGG CTCAGTGGCT GATGGTTCGA CGGCGAACAC GCTGCGGGCG
AGGGTCACCG ATGTGTTCGG GAATGCACTT GCCGGGCAGA CGGTTAGCGT GTTGGCAGGC
AACGGCGCAA CGACTGCTCC GACGGTCACC ACGCAGCCGG ATGGCACGGT GGAGATCAGT
GTCACCAGCC AGACCGCCGG AACCAGTGTG ATCACCGCCA GCGTCAACAA CAGCAGCCAG
AGCCGGGATG TGACGTTTAT CGCCGATGTC AGGACGGCGC AGATCGCTGA TCTGGTGGTC
ATTAAGGACG GTTCAGAGGC AGACGGTGCG ACGGCGAACA CGCTGCGGGC GAGGGTAACC
GATGCATTCG GTAATGCGCT TGCCGGGCAG ACGGTCAGCG TGTTGGCGGA CAACGGCGCA
ACGGTCGCTC CGGTGGTCAC CACGCAGCCG GACGGCACGG TGGAAATCAG TGTCACCAGC
CAGACGGCGG GCAGCAGTGC GGTCACCGTC AGTATCAACA GCAGCAGTCA GAGCCGGGAC
GTGACATTTA TCGCGGATGT CAGAACGGCG CAGATCGCGG ACCTGGTGGT CATTAAGGAC
GACTCAGTGG CAGACGGTGC AATGGCGAAT ATGCTGCGGG CGAGGGTCTC CGATGTGTTC
GGTAATGCGC TTGCCGGGCA GACGGTCAGC GTGATGGCGG ACAACGGTGC AGCGGTCGCT
TCGACGATGA CCACAAAGCC GGACGGCACG GTGGAGATCA GTGTCACCAG CCAGACGGCA
GGAATCAGTG TGGTCACCGC CAGCATCAAC AACAGCAGCC AGAGTCAGAA CGTGACGTTT
GTCGCGGATG TCAGGACGGC GAAGATAGCG GATCTGGTCG TTAGCCAGGA TAACGCGGTG
GCAGACGGTT CGACGGCGAA CACGCTGCGG GCGAGGGTGA CCGATGCGTT CGGTAATACG
CTTGCCGGGC AGACGGTCAG CGTGATGGCA GGCAACGGCG CAACGGTTGC CCCAACGGTG
ATTACAGAGC CGGACGGCAC GGCGGAGATC AGTGTCACCA GCCAGACGGC GGGAGTCAGT
GCAGTCACCG CCAGCATCAA CAACAGCAGC CAGAGCCGGG ATGTGACGTT TATCGCCGAT
ATCAGGACGG CGCAGATCGC GAGTCTGGAG GTGACGCAGG ATAACGCCGT GGCTGACGGC
GCAATGGCGA ATACGCTGCA GGTGAGGGTC ACCGATGCTA ACGGTAATAC GCTTGCCGGG
CAGGCGGTCA GCGTGATGGC AGGTAACGGC GCAACGGTCG CTCCGGCGGT CACCACTCAG
CCGGATGGCA CGGTGGAGAT CCCTGTCACC AGTCAGACGG CGGGAGCCAG TGCGGTCACC
GCCAGTATCA ACAACAGCAG CCTGAGCCGG GATGTGACGT TTATTGCGGA TGTCAGGACG
GCGCAGATCG CGGAGCTGGT GGTCATTAAG GACGGCTCAG CGGCAGACGG TGCGACGGCG
AATACGCTGC AGGCGAGGGT CACCGACGCG TTCGGGAATG CGCTTGCCGG ACAGACGGTC
AGCGTGTTGG CAGACAACAG CGCAACGGTT GCTCCGGCGG TCATCACTGA GCCGGACGGT
ACGGTGGATA TCTCAGTCAC CAGCCAGACG GCAGGGATCA GTACAGTCAC AGCAACCATC
AACAACCATA GCCTGAGCCA GAGCGTGATG TTTATCGCCG ATGTCAGGAC GGCGCAGATC
GCTGATCTGG TGGTCATTAA GGATGGTTCA GAGGCGGACG GTGCGACGGC GAACACGCTG
CGGGCGAGGG TGACCGATGC GTTCGGCAAT GCACTTGCCG GGCAGACGGT CAGCGTGTTG
GCGGATAACG GCGCAACGGT CGCTCCGACG GTCATCACAG GGCAGGACGG CACGGTGGAA
ATCAGTGTCA CCAGCCAGAC GGCGGGGATC AGTACAGTCA CTGCAACCAT CAACAGCAGT
AGTCAGAGTC AGAACGTGAC ATTTATCGCC GATGTCAGGA CGGCGCAGAT CGCGGAGCTG
GTGGTCATTA AGGACGGCTC AGCGGCAGAC GGTGTAATGG CGAATATGTT GCGGGCGAGG
GTCACCGATG CGTTCGGTAA TGCGCTTGCC GGACAGACGG TCAGCGTGTT GGCAGGCAAC
GGTGCCACGA CTGCTCCGAC GGTCACCACG CAGCCGGACG GGACGGTGGA GATCTCTGTC
ACCAGCCAGA CCGCCGGAAT CAGTGCGGTC ACCGCCAGCA TCAACAACAG CAGCCAGAGC
CGAAACGTGA CGTTTATCGC CGATGTCAGG ACGGCGCAGA TCGCGGATCT GGTGGTCATT
AAGGACGGTT CAGAGGCGGA CGGTGCGACG GCGAACACGC TACGGGCGAG GGTGACCGAT
GCATTCGGTA ATGCGCTTGC CGGGCAGACG GTCAGCGTGA CGGCAGGCAA TGGCGCAACG
GTCGCTCCGA CGGTCACCAC GCAACCGGAC GGCACGGCGG AGATCAGTGT CACCAGCCAG
ACGGCGGGAG TCAGTGCAGT CACCGCCAGC ATCAACAACA GCAGCCAGAG CCGGGATGTG
ACGTTTATCG CCGATATCAG CACGGCGCAG ATCGCGGATC TGGTGGTCAT TAAGGACGGT
TCAGAGGCAG ACGGTGCGAC GGCGAACACG CTGCGGGCGA GGGTAACCGA TGCATTCGGT
AATGCGCTTG CCGGGCAGAC GGTCAGCGTG ATGGCAGGTA ACGGCGCAAC GGTCGCTCCG
GCGGTCACCA CTCAGCCGGA TGGCACGGTG GAGATCAGTG TCACCAGCCA GACGGCGGGA
ATCAGTGCAG TCACTGCCAG CATCAACAGC AGCAGTCAGA GCCGGGACGT GACATTTATC
GCCGATGTCA GGACAGCGAA GATCGCTGAA CTGGAGGTCA TCCGGGATAA CGCGGTGGCA
GACGGTTCGA CGGCGAATAC GCTGCAGGTG AAGGTCACCG ATGCTAACGG TAATACGCTT
GCCGGACAGA CGGTCAGCGT GTTGGCGGGC AACAGCGCAA CGGTTGCTTC GACGGTGACC
ACAAAGCCGG ACGGCACGGT GGAGATCAGT GTCACCAGCC AGACGGCGGG CACCAGTACA
GTCTCCGCCA GCATCAACAA CAGCAGCCAG AGCCAGAACG TCACCTTTGT ACCTGGTGAT
GCATCGCAGC TCACTTCAAC CGTTGAAACA AATAAGTCGA ACTATACGGT TGGAGAGACA
ATCACCATCA CGGTAACGCT CCGGGATGCG TTCGATAATC TGGTAACCGG TGCAGCTTCA
CAGTTAGCCG CTGATGGTGT GCTGACGGTG GCTGGCACTG ACCCGTCAGA AACGGGAAGC
TGGGTTGAAT CAGGTGGCGT GTATACCACA ACCCGAATGG CGACGATTGC CAGCACCAAT
CAGCACGCCA ACCTGCAATT GCAGACGTGG AGTGATGGCG TAACGTCAGA CCGCTATGAC
ATACAGTCCG GTTCTCCGGC GCAAGCCACA TCAACTATCG CTACGGATAA AAACGCCTAC
ACCGCTGGTG ACACTATAAC CGTCGCTGTG ACGCTGAAAG ACGCGCACGG TAACCTGGTT
GAAGGGGGCG AGTCCTTATT GTCTGGTGAT AACGTAACAG TAGAAGGAGC CGTACGTTCA
GGAGGATGGT CTGAAACCGC CGGTGTTTAT ACTGCTACAT GGTCGGCGCA AATGGCGGGA
GACTCTCACC ACGCGACGCT GAAGTTGTCT GAATGGGGCA GCAGTAAACA ATCAGAGAGT
TATTCCATCC ACAGCGGTGC TCCGGTGCAG GCAAATTCTG CTATCAGGAC AGATAAATCG
GCATATATCG CCGGGGAGCC ATTAACCGTC ACAATAACGT TGCGGGATGA GTTTGGTAAC
CCTGCTTTAG GGCTGACATC AGAAGTCATC GAGTCTTATA TTGATAGTTT CGCGGTAGGC
GGCGCAACTC CCGATTCTAT GCGGTGGGTC GAGCAGAATA ATGGTGAATA TACCATTGTC
TGGACAGCAT GGATTGCCGA GGAAAATCTG GTTGCCAGTC TGAAATTAAA AACATGGGCA
GAGGAGATTA AGTCCTCGCT ATACGGGATA CAACCAGGTG CAGCGGCAAA AACTCAATCA
ACGATTGTCG CGGACAAAAC GATATATATC GCTGGGGATA GCATAACGGT GACCGTTGTT
CTTAAAGACG CCCAGGGTAA CTTTATTACT GATGGTGTCG TTCAGCTTAA TGAGGAAAAC
GTACAAGTCA GGAATGCGGA TCCTATCCAG GGCAATAATT GGGTTTACAA CGGCAATGGG
CAATATCAAA GGCAGTATAT GGCGCATTTT GCGGAAGCGA ATTTGAATGC GCAACTGAAG
ATGGCTGGTT GGAGCGATGC CAATTATTCA AACAATTACA CTATCAAACC AGGTGAAGTG
AGTCCGCTTG GTTCACAGCT TCGTATACGA GAAGTTTTAG TAGTGGAAGG GGCGGATTTA
CCTGTCAGTG TTTTATTAGT TGATGATTTT GGAAATCCGG TGGATAACGG TCTGGATTTG
CTGGATGATA CAGTGTACTT ACAAAATGTA GAAAAAAAAG AAGGGGAAAA ATGGAGATAT
GTGGGTGATG GCATATATGA ACGTACATAT ATGGCCTACC AAGAAGGAGA AAATTTAACT
TCATTTATGG AGATTAAAGG TTGGCGTATA TACGGACAAC CATCCTATAC CATCCTTCCT
TTTGTGGAAG TCGAGTTGCT GAGTGTTAAT GGTGTAAAAT TCAGGGCGAC AGATGGTTTT
CCAGAAACAG GGTTTGATGG TGCAAAATTC ACGCTGTTAC TAACCCATAA TATGAAAAAT
ACGGATTATA ATTGGACGGC TGGGATTTAT GGAATTAATG TTGACAGTAA TGGAGAAGTA
ACATTATCTG TGCTGATAAG AAGTGAAGTT ACTATTACCG GGAAACCCAA GAATGGTAAA
GGGAATGATG TTGTATTTAA ATTCAAGATT AAAAAATGGT TTACTAGTTT AGGTGCTACC
AGTAGTAATA CCTGGGATAT CATAAATACT TCATGTAGTT ATGGTCAAAT GCCGTCATCT
TTAGAATTAG CGCAGCGGCC AAGCGGTGGG GTAGTACCGC GAAAAGTAGG TACGTTGTGG
GGCGAATATG GTAATTTAAA AACTTATGGG AACGCTTTTA GCGGTACGGA TTATTGGACA
AGCACTCAGC TCATGGGAGT ACACGAGAAA TTTAATCCTG AAACTGGTAT ATCAGAGCTT
GGAACAGGAA AGTCCTCTGG CTTGTGCGTA GAGTATTATT AA
 
Protein sequence
MAGKAHGNGD RRGDNTICGL GDRLRRLTAG ICLITQTIFP VMAAAPTHIN SAHSDTAASL 
ILPNVKTIPY TLGALESPPT VAARFGITVD ELRRLNQFRT FARGFDNVRQ GDEIDVPLIN
SNSPEARNLK AMQMERDGKD PQMQVAEMAQ QSGTLLARDM DSEQAASMAR GWVASSASAQ
ATDWLSRWGT ARVSLGVDED FSLKSSSFEF LHPWYETPDN LVFSQHTLHR TDDRTQTNHG
IGWRYFTSSW MSGVNMFIDH DLTRYHTRTG MGVEYWRDYL KLSGNGYLRL SNWRSAPELD
NDYEARPANG WDLRAEGWLP AWPQLGGKLV YEQYYGDEVA LFGKDERQND PHAITAGLSY
TPVPLISFSA EQRQGKQGEN DTRIGMELTL QPGHSLQKQL DPAEVAARRS LVGSRYDLVD
RNNNIVLEYR KKELVRLTLT DPLKGKPGEV KSLVSSLQTK YALKGYDIEA ASLQSAGGKV
AVSGKDIQVT IPPYRFTAMP ETDNTYPIAV TAEDSKGNFS RREESMVVVE KPTLSLTDST
LSVDQQILLA DGKSTSTLTY TARDSSGKPI PGMTLKTQVK GLQDFALSEW KDNGNGTYTQ
IVTAGKTSGA LSLMPQFNGD DIAKTPALIA IVANTASRAD STIETDQDNY VAGKPIVVKV
TLRDDNGNGV TGRKELLKQT VKVDNTKADD VSAWTEESEG IYKASYTAHL IGDKLTAQLT
MPGWQTKHSD AFSIAGDKDT AKIAAMQITA NNAVARRDHN TVAVTVRDVH QNLLQGQNVT
FTVVNGAAVF ADPNGGIVTT DKDGIASVNL ASDQAVNSLI KAEINGSSQS VEVSFITGDI
SQLTSTIKTD DVSYTAGGKI KVSVTLMDEQ KNLVKGMASL LAGSSVVEVS GTDKNETGNW
SEESDGVYTT TRTAKIAGDR HYATLKLSTW SSAQQSDAYA IRESGAVLAY SSIVTDKTAY
TAGGAIKVTV TLKDSYENLV GGQRDAINLA IQLPNTKAES IAWNEDQKGI YTATYTALLP
GTGLKAQLQM SGWANALTSN DYSISGDAAS AQIVAMQVTT GNPDVLANGS DRHTVNVRVE
DQFGNVLSEQ TVTFTVTKGA AVFANAGQSA DIRTDAHGMA EVDLSSTVAD ASTVEAKINQ
SSDSKTVNFV ADVSTAQVAE LVVTQDGSVA DGSTANTLRA RVTDVFGNAL AGQTVSVLAG
NGATTAPTVT TQPDGTVEIS VTSQTAGTSV ITASVNNSSQ SRDVTFIADV RTAQIADLVV
IKDGSEADGA TANTLRARVT DAFGNALAGQ TVSVLADNGA TVAPVVTTQP DGTVEISVTS
QTAGSSAVTV SINSSSQSRD VTFIADVRTA QIADLVVIKD DSVADGAMAN MLRARVSDVF
GNALAGQTVS VMADNGAAVA STMTTKPDGT VEISVTSQTA GISVVTASIN NSSQSQNVTF
VADVRTAKIA DLVVSQDNAV ADGSTANTLR ARVTDAFGNT LAGQTVSVMA GNGATVAPTV
ITEPDGTAEI SVTSQTAGVS AVTASINNSS QSRDVTFIAD IRTAQIASLE VTQDNAVADG
AMANTLQVRV TDANGNTLAG QAVSVMAGNG ATVAPAVTTQ PDGTVEIPVT SQTAGASAVT
ASINNSSLSR DVTFIADVRT AQIAELVVIK DGSAADGATA NTLQARVTDA FGNALAGQTV
SVLADNSATV APAVITEPDG TVDISVTSQT AGISTVTATI NNHSLSQSVM FIADVRTAQI
ADLVVIKDGS EADGATANTL RARVTDAFGN ALAGQTVSVL ADNGATVAPT VITGQDGTVE
ISVTSQTAGI STVTATINSS SQSQNVTFIA DVRTAQIAEL VVIKDGSAAD GVMANMLRAR
VTDAFGNALA GQTVSVLAGN GATTAPTVTT QPDGTVEISV TSQTAGISAV TASINNSSQS
RNVTFIADVR TAQIADLVVI KDGSEADGAT ANTLRARVTD AFGNALAGQT VSVTAGNGAT
VAPTVTTQPD GTAEISVTSQ TAGVSAVTAS INNSSQSRDV TFIADISTAQ IADLVVIKDG
SEADGATANT LRARVTDAFG NALAGQTVSV MAGNGATVAP AVTTQPDGTV EISVTSQTAG
ISAVTASINS SSQSRDVTFI ADVRTAKIAE LEVIRDNAVA DGSTANTLQV KVTDANGNTL
AGQTVSVLAG NSATVASTVT TKPDGTVEIS VTSQTAGTST VSASINNSSQ SQNVTFVPGD
ASQLTSTVET NKSNYTVGET ITITVTLRDA FDNLVTGAAS QLAADGVLTV AGTDPSETGS
WVESGGVYTT TRMATIASTN QHANLQLQTW SDGVTSDRYD IQSGSPAQAT STIATDKNAY
TAGDTITVAV TLKDAHGNLV EGGESLLSGD NVTVEGAVRS GGWSETAGVY TATWSAQMAG
DSHHATLKLS EWGSSKQSES YSIHSGAPVQ ANSAIRTDKS AYIAGEPLTV TITLRDEFGN
PALGLTSEVI ESYIDSFAVG GATPDSMRWV EQNNGEYTIV WTAWIAEENL VASLKLKTWA
EEIKSSLYGI QPGAAAKTQS TIVADKTIYI AGDSITVTVV LKDAQGNFIT DGVVQLNEEN
VQVRNADPIQ GNNWVYNGNG QYQRQYMAHF AEANLNAQLK MAGWSDANYS NNYTIKPGEV
SPLGSQLRIR EVLVVEGADL PVSVLLVDDF GNPVDNGLDL LDDTVYLQNV EKKEGEKWRY
VGDGIYERTY MAYQEGENLT SFMEIKGWRI YGQPSYTILP FVEVELLSVN GVKFRATDGF
PETGFDGAKF TLLLTHNMKN TDYNWTAGIY GINVDSNGEV TLSVLIRSEV TITGKPKNGK
GNDVVFKFKI KKWFTSLGAT SSNTWDIINT SCSYGQMPSS LELAQRPSGG VVPRKVGTLW
GEYGNLKTYG NAFSGTDYWT STQLMGVHEK FNPETGISEL GTGKSSGLCV EYY
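
Both sequences are printed in space-separated blocks of ten characters, so they need to be cleaned before use. The sketch below is one possible way to strip the formatting, translate the gene sequence, and compute a GC percentage. It assumes that translation table 11 assigns the same amino acids to codons as the standard code (the two differ in permitted start codons), and the helper names `clean`, `translate`, and `gc_percent` are hypothetical, not part of any tool used to produce this record.

```python
from itertools import product

# Codon order follows the usual T, C, A, G layout, third base varying fastest.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(text):
    """Remove spaces and line breaks from a block-formatted sequence."""
    return "".join(text.split()).upper()


def translate(dna):
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def gc_percent(dna):
    """Percentage of G and C bases in a cleaned sequence."""
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


# First row of the gene sequence as printed in this record.
first_row = (
    "ATGGCAGGGA AAGCGCATGG GAATGGCGAT "
    "CGTCGGGGCG ATAACACCAT ATGTGGATTG"
)
print(translate(clean(first_row)))  # MAGKAHGNGDRRGDNTICGL
```

The output for the first row agrees with the first twenty residues of the protein sequence above. Passing the full cleaned gene sequence to the same functions would allow the whole translation and the reported GC content to be compared in the same way.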