Gene Haur_4043 details


Gene Information

Locus tag: Haur_4043
Symbol:
ID: 5735905
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Herpetosiphon aurantiacus ATCC 23779
Kingdom: Bacteria
Replicon accession: NC_009972
Strand:
Start bp: 5162033
End bp: 5163304
Gene Length: 1272 bp
Protein Length: 423 aa
Translation table: 11
GC content: 53%
IMG OID: 641281194
Product: von Willebrand factor type A
Protein accession: YP_001546803
Protein GI: 159900556
COG category: [R] General function prediction only
COG ID: [COG2304] Uncharacterized protein containing a von Willebrand factor type A (vWA) domain
TIGRFAM ID:
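
The coordinates and lengths listed above are consistent with one another, which can be checked with a little arithmetic. The following is a minimal Python sketch, not part of the original record. The function names are hypothetical, and it assumes 1-based inclusive coordinates and a single terminal stop codon that is not counted in the protein length.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature given 1-based inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for a CDS, assuming one terminal stop codon is excluded."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length_bp = gene_length(5162033, 5163304)
print(length_bp)                  # 1272
print(protein_length(length_bp))  # 423
```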


Plasmid Coverage information

Num covering plasmid clones: 14
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAGTACCC AAGCAATGGT TCAACTGCGC ATCACCCCTG GGCGGCCAGC AGTTGCCCAA 
AGTAACGATC CGCAAATTGT TTATTTGTTG GTCGAAGCCT CGCCTGCTGG CATTCCCGAT
GCCGATTTGG CGATTCCGGT CAACCTTGGT TTTATTGTTG ATCGTAGCTC TTCGATGCGC
GGCGAACGGC TTTACCAAGT CAAAGAGGCT TGTAATAACG TCGTTAATCA GCTCAATCGC
CAAGATTATT TCTCGGTGGT GTCGTTCAAC GATCGGGCTG AGGTGGTTGT ACCCTGCCAA
CGCCCCAACG ATAAAGACCA AATTAAACGC GCGATTGGCA TGATCGAGGC CAAAGGTGGC
ACTGAGATGG CCACAGGCAT GATGATGGGC TTACAGGAAA TTTCACGCCC TATGATGAGC
CGCGGCATCA GCCGTATGGT CTTGTTGACC GATGGCCGTA CTTATGGCGA TGAAAGCCGC
TGTGTCGAAA TTGCGCGGCG TGCTCAATCC AAAGGCATCG GCATTACAGC CTTGGGCATT
GGCGATGAGT GGAATGAAGA TCTGCTCGAA ACAATCGCCT CAGCCGAAAA CAGCCGCACC
GAATATATCA CCAATGCTCA GCAAATTGTC AACGTTTTCT CTGACGAGAT CAAACGTTTG
CAAAATGTGA TGGCGCATAA AGTTGAAATG CGTTTCCATC TGCACCCGCA GGCTGAAATT
CGTTCGCTGT TTCGGGTGCG CCCATTTATT GCCGCGCTCA CCCCACAATT GCATAACGAA
ACGCTGTGGC GTATGCCACT GGGCGAGTGG GTTGGCCGCG AAGATCAAAT CTTTTTGTTA
GAGCTGGTCG TGCCGCCGCT GCCCGCAGGC AATCAAACGA TCTGTCGGAT CGAGATGTTT
TACGAAGTGC CCAGCATCAG TAGCCAAGCC TTACAAACCA AGGTCGATGT CCAACTGCCG
GTACGGCCTG CCGAGCAAAT TCGGCCTGAT GTTGATGGCG TGGTTAAACA TTGGCTCGAA
CGCACCGTGG CCTATCGTTT GCAAGCCTCG GCGTGGCAAC ATGTTGAGCA AGGCAACATC
GAGGAAGCGA CCAAAAAGTT ACGCATGGCT GGCACACGCT TGCTCGAATC GGGCCAAACT
GAGCTTGCCC AAACCGTTCA AGAAGAGGCC ACCCGCCTGT TGCGTAGCGG CACAACCAGC
GATGAAGGTC GCAAACGGAT TAAATACGGC ACCCGTGGCT TGGTCGCTCG CGAACGGGGC
GGAGAGCAAT AG
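
The gene sequence is printed in blocks of ten bases, so the whitespace has to be removed before any computation. The sketch below shows one way to do that in Python and to compute a GC fraction. The helper names are hypothetical, and GC content is assumed to mean the share of G and C among all bases; the record does not state how its 53% figure was calculated or rounded.

```python
def clean_sequence(text: str) -> str:
    """Remove spaces and line breaks from a blocked sequence and uppercase it."""
    return "".join(text.split()).upper()


def gc_fraction(text: str) -> float:
    """Fraction of G and C bases (assumed definition of GC content)."""
    seq = clean_sequence(text)
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


# First row of the gene sequence, used only as a short example input.
first_row = "ATGAGTACCC AAGCAATGGT TCAACTGCGC ATCACCCCTG GGCGGCCAGC AGTTGCCCAA"
print(len(clean_sequence(first_row)))  # 60
print(f"{gc_fraction(first_row):.1%}")
```

Passing the full gene sequence in place of `first_row` gives the value to compare with the GC content field.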
 
Protein sequence
MSTQAMVQLR ITPGRPAVAQ SNDPQIVYLL VEASPAGIPD ADLAIPVNLG FIVDRSSSMR 
GERLYQVKEA CNNVVNQLNR QDYFSVVSFN DRAEVVVPCQ RPNDKDQIKR AIGMIEAKGG
TEMATGMMMG LQEISRPMMS RGISRMVLLT DGRTYGDESR CVEIARRAQS KGIGITALGI
GDEWNEDLLE TIASAENSRT EYITNAQQIV NVFSDEIKRL QNVMAHKVEM RFHLHPQAEI
RSLFRVRPFI AALTPQLHNE TLWRMPLGEW VGREDQIFLL ELVVPPLPAG NQTICRIEMF
YEVPSISSQA LQTKVDVQLP VRPAEQIRPD VDGVVKHWLE RTVAYRLQAS AWQHVEQGNI
EEATKKLRMA GTRLLESGQT ELAQTVQEEA TRLLRSGTTS DEGRKRIKYG TRGLVARERG
GEQ
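
The protein sequence can be cross-checked against the gene sequence by translating it codon by codon. The following Python sketch is an illustration under stated assumptions, not the database's own method. It builds the codon table from the compact amino-acid string used for the standard code, whose codon assignments translation table 11 shares. Table 11 differs in its set of permitted start codons, which this sketch does not handle; that has no effect here because the gene begins with ATG. All names are hypothetical.

```python
BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_sequence(text: str) -> str:
    """Remove spaces and line breaks from a blocked sequence and uppercase it."""
    return "".join(text.split()).upper()


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First three blocks of the gene sequence give the first ten residues.
print(translate(clean_sequence("ATGAGTACCC AAGCAATGGT TCAACTGCGC")))  # MSTQAMVQLR
# The last four codons end in the TAG stop codon.
print(translate(clean_sequence("GGAGAGCAAT AG")))  # GEQ
```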