Gene Mlg_2394 details


Gene Information

Locus tag: Mlg_2394
Symbol:
ID: 4269391
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Alkalilimnicola ehrlichii MLHE-1
Kingdom: Bacteria
Replicon accession: NC_008340
Strand:
Start bp: 2716823
End bp: 2719741
Gene Length: 2919 bp
Protein Length: 972 aa
Translation table: 11
GC content: 67%
IMG OID: 638127152
Product: von Willebrand factor, type A
Protein accession: YP_743224
Protein GI: 114321541
COG category: [R] General function prediction only
COG ID: [COG4245] Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain
TIGRFAM ID:
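
The gene and protein lengths listed above follow from the start and end coordinates. The following is a minimal sketch of that arithmetic, assuming the coordinates are 1-based and inclusive and that the gene length includes the terminal stop codon (the record does not state either convention explicitly). The function names are hypothetical and chosen only for illustration.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Number of residues, assuming the CDS ends with one stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # drop the stop codon


# Values taken from the record above.
start_bp, end_bp = 2716823, 2719741
length = gene_length_bp(start_bp, end_bp)
print(length)                     # 2919
print(protein_length_aa(length))  # 972
```

Under these assumptions the computed values agree with the reported 2919 bp and 972 aa.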


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.400523
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 14
Fosmid unclonability p-value: 0.00000432302
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
ATGAACATCA GACACGTCGA CCGCCATGTC GAGAAACGGC GGTCCTGGTA CACCGGGCTG 
TTCTGCGCCG GCCTCAGCGC CGCCATGATC AGCGCACCGC TCTCCGCCGG GGATTCCTGG
TTCTCCAAGG CGCGGATCGG TGCCGATACG GCCCCGGCGA GCGCCCAGGG CCTCACCGCC
GCGGGCAGCG ACCGGGTGGT GGGGGCCTCC ACGGACGCCG GCACGACCGT CTTCGATATC
ACCATTTCCC TGCACAACAA CCCGGAGACC GAAAACGAGC GGGATCCCTA CGAGGCCGTG
ATCGAACACT TCGCCGACTC GGTCTGCGAG CAGAGCAACG GGGCCAGCCA GCTCGGTACC
GTGCGGGTCT TCACCAACGG CGGCCACGGC TCCCGGGCCG ACATTATCTG GAACGAGGAG
GAGTGGCCGC GGGCCAGTAT CGCGGGGTTC GGCAGTGCCG GGCGGCACAT CTGGATGGGG
GATATCTTCC CCGACGGCTG CGGTAATGGC TGCGACTACG ACATGCTGGC CGATGCCGAG
GGCGCCGGCT ACACCCTGGG GCACGAGTGG GGCCACTACG TGCTTGCCCT GTATGATGAG
TACGAGGGGC GCGACCCGGC GGAGAATCGC GATACCTTCC CGCAGGTTGG CGATGTGCCC
ACCAGCCCGG CCATCATGAA CAGCCAGTGG CAGGCCCGCG GCGGTAACTA CGAGTGGTTG
AACCACTCCA CCAGTGACAA CATCGGCGAT CCGGAGGATA CCGCCCAGGG CCGGGTCTAC
GGCAAGAGCG GCTGGGAGGT GTTGGTCCAA CCCACCACCG ACGATCCACA GGAGGGTAAC
GAGACCGTTC AGCCTGACCG GACCCGGTAT ACGGCCCTGG AGGCCGTGGC GCCGACGGCG
GCGGATAACT GGGTGGTCAC CCAGCTCGAT CAGATGGATC ACGGTTGCCG CGATGAGCTG
GAGATCGTCT GGATGGACGA TGACCTGGAG ATCTCGCTCA TTGTCGACAC CTCCGGGAGC
ATGAGCGGCG CTCCCATCAT CAACGCCCGC ACAGCCGGTC GGACCCTGGT GGATGTGGTC
GAGCCTGGCC GTACCGCCAT GGGCGTCGTG CGCTTCTCGG CGAGTGCCTC GGTGGTCCAC
CCCATGATCG CCATCCCGGA CCCGGGTACG GCGGAAAAGG ACCAGCTCAA GGACGCCATC
GACAGCCTCC CGGCCTCCGG GCTGACCGCC ATGTTCGATG GCCTGATACT GGGCTTGGAC
GAACTGCAGG ATTACAGCGC CGCCAACGAT ACCGATGCCG GGCAGGTGGC CTTCCTGCTC
TCCGATGGTG GCGACAACAG CTCCGCTGCG ACAGAGCCGC AGACCGTCCA GGCCTACCAG
GATGCCAACG TCCCCATCAT CGCCTTCGGC TATGGCAGCT TCGCACCCAC CGGGGTGTTG
CGGCGGCTCG CCGATAACAC CGGCGGTGAG TTCTTCGCCT CACCCACGAC CCTGGCCGAG
ATCCAGGAGG CCTTCCTGGC GGCCAACGCC GCCGTGTCCG ATGCGGTCAA CCTGAGTCAG
GAGTCGCAGC CGGTCGCTGC GGGCGCCAAC GAGCGGCTCA CCTTCACCGT GGACCCCACC
CTGGGCTCGA TGACCGTGCT CCTTAACTTC ACCGGCAGTG CCGACCAGTT GTCCCCCACG
CTGTTGGACA GTGACGGCAA CGACACCGGA ATCCCGTTCA GCTGCGACGA GTCAGCCGAT
GAGGTCTCCT GCCTCGCCAC CGTGGACCGC GACGCGGTAT CGGCGGGTGG CGTCGGCGAC
TGGACCGTCG ACACCGGGGA GAATACCTCC GGCGGTGAAG TGGAGGTGCT CCTGAATGTG
GTGGCCAACC CGGCGGACGG TCGGACCTTT GACGTGCGGG TCAGTACCCT GGGTGGCAGC
ACGGTGGAAT ACCCGTCACC CGCGCTGATC TCGGCCGCGA TCTCCGCCGG GCGGATGGTC
TCGGGGGTCA ATGTGGTCGC CGAGCTGACC GATCCCGACG GGAACGTCAC CACCGTGCCG
CTCAACGACG AGGGCCAGAA CGGTGACGCG GAGGCCGGTG ATGGCATCTA CTCGGCCGTG
GTGAACTACC GGCAGGGTGG TACCCACGAG CTGCGCGTCC GGGTGGACAA CCAGGCCGGG
ACCGGCCAGT TCGTGTACGG GGGTGTGGCC CCGGCGCCGG ACATCAATGG CATGGAGGTG
CAGGCCCCGG ATCCGGAGCC GATCCCGGAG AACTTCCAGC GCGTGGCGAC CACGCAGTTC
ACCGTGGATG GTTTTCAGGA CGACGATCAC GCGGATGATC CCGCCCTGCC GGGCGCCTGC
ACGCCGCTCG AGGCCGATAA CACGACTATC CCGGGGCGGA TCGATGCCGC CAGCGATAGG
GATTGCTTCC GCCTGGTCGG TACCAGCGTC CCGGACGAGG GCGATGTCTC TCTGCGGGTG
GCATCCTTTG GCCTGGGCAT GCAGCCCATC GTGACGATCT ACACCGGCGA TGGCAGCGAG
GTGCTGCTGA CCTTCAGCCT GGACGATGAG GACTTCGTCG CCCGCAACGG CTACCTCTAC
ACCGTCCTGG ATCGGGACTG GCTGGAGACC GCCAATGACG AAGGTATGGT TGGCGGTGCG
GAGCTGCAGG ACCTGGTGGT CACGGTCGAG CATGAGGATG AGACCGCCGA CGAGGGCACC
TACAAGGTCA GCGCCGGCTC GGTGATCAGC TCGGACCAGC CCGCGGAACC GGACCGGATC
ACGGACGAGG ATGAGGAGGA GTTCGAGATG ACCCGCCGGG GTTCCGCCTG CAGCGTGGCG
GGCAACAGTG CCGGTGGCCC CGCCGACCCG ACCCTGCCGC TGCTGGCCAT CCTGGCCCTG
CTCGGCGTCA TCCTGGGCCG TCGTCGCCAC CGCGCCTGA
 
Protein sequence
MNIRHVDRHV EKRRSWYTGL FCAGLSAAMI SAPLSAGDSW FSKARIGADT APASAQGLTA 
AGSDRVVGAS TDAGTTVFDI TISLHNNPET ENERDPYEAV IEHFADSVCE QSNGASQLGT
VRVFTNGGHG SRADIIWNEE EWPRASIAGF GSAGRHIWMG DIFPDGCGNG CDYDMLADAE
GAGYTLGHEW GHYVLALYDE YEGRDPAENR DTFPQVGDVP TSPAIMNSQW QARGGNYEWL
NHSTSDNIGD PEDTAQGRVY GKSGWEVLVQ PTTDDPQEGN ETVQPDRTRY TALEAVAPTA
ADNWVVTQLD QMDHGCRDEL EIVWMDDDLE ISLIVDTSGS MSGAPIINAR TAGRTLVDVV
EPGRTAMGVV RFSASASVVH PMIAIPDPGT AEKDQLKDAI DSLPASGLTA MFDGLILGLD
ELQDYSAAND TDAGQVAFLL SDGGDNSSAA TEPQTVQAYQ DANVPIIAFG YGSFAPTGVL
RRLADNTGGE FFASPTTLAE IQEAFLAANA AVSDAVNLSQ ESQPVAAGAN ERLTFTVDPT
LGSMTVLLNF TGSADQLSPT LLDSDGNDTG IPFSCDESAD EVSCLATVDR DAVSAGGVGD
WTVDTGENTS GGEVEVLLNV VANPADGRTF DVRVSTLGGS TVEYPSPALI SAAISAGRMV
SGVNVVAELT DPDGNVTTVP LNDEGQNGDA EAGDGIYSAV VNYRQGGTHE LRVRVDNQAG
TGQFVYGGVA PAPDINGMEV QAPDPEPIPE NFQRVATTQF TVDGFQDDDH ADDPALPGAC
TPLEADNTTI PGRIDAASDR DCFRLVGTSV PDEGDVSLRV ASFGLGMQPI VTIYTGDGSE
VLLTFSLDDE DFVARNGYLY TVLDRDWLET ANDEGMVGGA ELQDLVVTVE HEDETADEGT
YKVSAGSVIS SDQPAEPDRI TDEDEEEFEM TRRGSACSVA GNSAGGPADP TLPLLAILAL
LGVILGRRRH RA
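
The sequences above are printed in space-separated blocks of ten, so they need to be cleaned before any downstream use. The sketch below shows one way to do that and to compute a GC fraction comparable to the reported GC content. The helper names are hypothetical, the start codon set lists only the common starts for bacterial translation table 11, and the sample rows are the first and last rows of the gene sequence rather than the full gene.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Common start codons for translation table 11 (not the complete set).
COMMON_START_CODONS = {"ATG", "GTG", "TTG"}


def clean_sequence(raw: str) -> str:
    """Remove all whitespace (block separators, line breaks) and uppercase."""
    return "".join(raw.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C; returns 0.0 for an empty sequence."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def has_start_and_stop(seq: str) -> bool:
    """True if the sequence begins with a common start codon and ends with a stop."""
    return seq[:3] in COMMON_START_CODONS and seq[-3:] in STOP_CODONS


# First and last rows of the gene sequence, copied from the record.
first_row = "ATGAACATCA GACACGTCGA CCGCCATGTC GAGAAACGGC GGTCCTGGTA CACCGGGCTG"
last_row = "CTCGGCGTCA TCCTGGGCCG TCGTCGCCAC CGCGCCTGA"

print(clean_sequence(first_row)[:3])   # ATG
print(clean_sequence(last_row)[-3:])   # TGA
print(round(gc_fraction(clean_sequence(first_row)), 2))
```

Applying `clean_sequence` to the complete gene sequence and passing the result to `gc_fraction` would give a value to compare against the 67% listed in the record.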