Gene Hlac_0123 details

Gene Information

Locus tag: Hlac_0123
Symbol:
ID: 7401644
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Halorubrum lacusprofundi ATCC 49239
Kingdom: Archaea
Replicon accession: NC_012029
Strand:
Start bp: 129671
End bp: 131146
Gene Length: 1476 bp
Protein Length: 491 aa
Translation table: 11
GC content: 65%
IMG OID: 643707187
Product: von Willebrand factor type A
Protein accession: YP_002564799
Protein GI: 222478562
COG category: [R] General function prediction only
COG ID: [COG2425] Uncharacterized protein containing a von Willebrand factor type A (vWA) domain
TIGRFAM ID: [TIGR01409] Tat (twin-arginine translocation) pathway signal sequence
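
The coordinate and length fields above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive (which is consistent with the listed gene length) and that the unspliced CDS ends in a stop codon. The function names are hypothetical and are not part of the database.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS whose last codon is a stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per residue, minus the terminal stop codon.
    return gene_len_bp // 3 - 1


length = gene_length_bp(129671, 131146)
print(length, protein_length_aa(length))  # prints: 1476 491
```

Both values agree with the Gene Length and Protein Length fields of this record.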


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.0626455
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 30
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAACCCGG ACACAAACGA CACAATCGGA CTCTCGCGAC GCACGCTGCT CGCCGGCCTC 
GGTGCCGTCG GCGTGGCCTC TGCGGGCGCG GGACTCGGGA CGACGGCGTA TTTCAACGAC
ACCGAGTCGT TTGAGGGTAA CACGCTCACC GCGGGCGAGC TCGATCTCAA GCTCGACTAC
CGCGCCACGT ACGCGGGCGG CCCTGGGCGC ATCGATGAGA TCAACGGCTG GTATCCGGAC
TTCGAGGTTG AAGAAGAGGA GGACGGCGTC TACCTCATCG GCGAGGTCCC GAATATCGAC
GAGGGCGAGT GGCCCGATAT CGTGCAGGAG CGCGACTTCT GTGCGCCCGA CGTGGGCCTG
ATCAACGGCG ACCAGATCCC GGTCTTCACG CTCGACGACG TGAAACCCGG TGATTGCGGC
GAGGTGACGA TCAGCCTCCA CATCTGTGAC AACCCCTCGT GGGTGTGGAT GAACGGTGAG
CTAACCGCGA ACGACGAGAA CACCGTCTCC GAGCCCGAGG CGGGCGCGGA CGGTGAGGGT
AACGCGCTCG GCGACGACAG TGACGGCCCG ATCAGCGGTG AGGGCGAGCT GGCCGACGCC
ATCGACGTGA CGGTATGGTA CGACGAGAAC TGCAACAATA TCCTGGACGC CGACGCCGAG
GAAGCAGGGG ATTCGGTGTG CGTTCAGCTC GTGATCGACA CGTCCGGATC GATGGGTGGC
AGCCGTATCG CCAACACGAA AAGCGGCGCG AAACAGCTTG CGGAAACTAT CCTCGATGCG
AATCCCGACA ACCAGGTCGG CGTCACGCGA TTCAACAACG GAGCCAGCAC GCCGCAACAG
CTGACTGATG ACCTGGACGA CGTCGAGGCC GCGATCGACG GGCTCAGCGC GAGCGGAGGC
ACCAACGCTC AGGCCGGCGT GGACGCCGGA CAGGCTGAAC TCGAGAACTG CCCCCACGAC
AACCGCGTGA TGGTCGTCTT CGGCGACGGC GACATCAACA CCGACGGCAG CGCGGCGAAA
GTCGCTGGGA CCGAGATCTT CGCGATCGGC GTCGGCGGCG CGAGCTTCAG CGACCTCGAG
GACCTCGCCA GCGATCCGGC CGACGAGCAC GTGTTCTTCG CGATCGACGA CGGGGCCATC
GAGCAGATCT TCGGACAGGT CGCCGAGACG ATCACAGTCG GCGAGGAGGT CATCTTCGAG
GGATCGCTCG CCGGGGCGAT GGCTGCGCTC TCGGATGGGA TTGCGCTTGA CGGCAACCGC
TCTGTCGAGG GCCGACAGCC CTACGAGGGC GGCCTCACCC AGTGTATCGG CTTCGAGTGG
TGTCTGCCTG CAGAGGTCGG CAACGAGGTC CAGACCGACT CCGTGGCCTT CGATCTCGGC
TTCTACGCCG AACAGTCGCG GCACAACGAC GCGCCGGACG TGTCGTTCAA CGGAACGGCC
CCGACCAGCA ACAGCACCGC AAACGTGTCG CAGTAG
 
Protein sequence
MNPDTNDTIG LSRRTLLAGL GAVGVASAGA GLGTTAYFND TESFEGNTLT AGELDLKLDY 
RATYAGGPGR IDEINGWYPD FEVEEEEDGV YLIGEVPNID EGEWPDIVQE RDFCAPDVGL
INGDQIPVFT LDDVKPGDCG EVTISLHICD NPSWVWMNGE LTANDENTVS EPEAGADGEG
NALGDDSDGP ISGEGELADA IDVTVWYDEN CNNILDADAE EAGDSVCVQL VIDTSGSMGG
SRIANTKSGA KQLAETILDA NPDNQVGVTR FNNGASTPQQ LTDDLDDVEA AIDGLSASGG
TNAQAGVDAG QAELENCPHD NRVMVVFGDG DINTDGSAAK VAGTEIFAIG VGGASFSDLE
DLASDPADEH VFFAIDDGAI EQIFGQVAET ITVGEEVIFE GSLAGAMAAL SDGIALDGNR
SVEGRQPYEG GLTQCIGFEW CLPAEVGNEV QTDSVAFDLG FYAEQSRHND APDVSFNGTA
PTSNSTANVS Q
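
The relationship between the gene sequence and the protein sequence can be illustrated with a short translation routine. This is a minimal sketch, not the database's own pipeline: it assumes the blocks of ten characters are purely cosmetic and that translation table 11 uses the same codon-to-residue assignments as the standard code (the two tables differ in permitted start codons, not in residue assignments). The helper names `clean`, `gc_percent`, and `translate` are hypothetical.

```python
BASES = "TCAG"
# Residues for all 64 codons in TCAG order (standard assignments, shared by table 11).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Drop spaces and line breaks used for display and upper-case the rest."""
    return "".join(seq.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    s = clean(seq)
    return 100.0 * sum(base in "GC" for base in s) / len(s)


def translate(seq: str) -> str:
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    s = clean(seq)
    residues = []
    for pos in range(0, len(s) - len(s) % 3, 3):
        aa = CODON_TABLE[s[pos:pos + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First 30 bases of the gene sequence listed above.
first_block = "ATGAACCCGG ACACAAACGA CACAATCGGA"
print(translate(first_block))  # prints: MNPDTNDTIG
```

The ten residues obtained match the start of the listed protein sequence. Passing the full gene sequence to `translate` and `gc_percent` would allow the same comparison against the complete protein sequence and the GC content field.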