Gene Haur_2138 details

Gene Information

Locus tag: Haur_2138
Symbol:
ID: 5734040
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Herpetosiphon aurantiacus ATCC 23779
Kingdom: Bacteria
Replicon accession: NC_009972
Strand:
Start bp: 2690675
End bp: 2693503
Gene Length: 2829 bp
Protein Length: 942 aa
Translation table: 11
GC content: 53%
IMG OID: 641279279
Product: cellulose-binding family II protein
Protein accession: YP_001544906
Protein GI: 159898659
COG category: [R] General function prediction only
COG ID: [COG3889] Predicted solute binding protein
TIGRFAM ID:
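
The coordinate and length fields above are mutually consistent, which a short check can confirm. The sketch below assumes that the start and end positions are 1-based and inclusive, and that the gene length counts the terminal stop codon (the gene sequence in this record ends in TAG). The function names are illustrative and are not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def expected_protein_length(gene_len_bp: int) -> int:
    """Residue count for a CDS whose length includes one stop codon (assumed)."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # Every codon encodes one residue except the final stop codon.
    return gene_len_bp // 3 - 1


def gc_fraction(dna: str) -> float:
    """Fraction of G and C bases, ignoring whitespace between blocks."""
    bases = "".join(dna.split()).upper()
    return (bases.count("G") + bases.count("C")) / len(bases)


length = gene_length(2690675, 2693503)
print(length)                           # 2829
print(expected_protein_length(length))  # 942
```

Passing the full gene sequence from the Sequence section to `gc_fraction` gives a value that can be compared with the GC content reported above.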


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.368421
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGCGAATGC TAAAGATGGC CGCAGCAGTA GTGGTGCTGT TTGCAGGCGT AATTCAAGCG 
ATGGTGGCCG TGCCGCAATC GCGGGCGGCA AGTACTCAGC CATACAATTG GTCGAATGTC
GAGATTGTTG GCGGTGGTTT TGTCCCAGGC ATTATTTACA ATCCAACTGA ACGCGATTTG
GTGTATGCCC GCACCGATAT TGGTGGCGCA TACCGCTGGA ACCCCACAAC CAAGCGCTGG
ATTCCCCTAA CCGACTGGAT TGGCGCAACC GATTGGAACT GGACAGGGAT CGATAGCCTC
GCGACCGACC CAGTTGATCC TAACCGCCTC TATTTGGCCG CTGGCACTTA TACCAACGAT
TGGACTTCGG CCAACGGCGC GATTTTGCGC TCGACCAACA AAGGCAATTC GTGGGCACGC
ACCGACCTAC CCTTTAAACT TGGCGGTAAT ATGCCAGGCC GCTCGATGGG CGAACGTTTG
GCGATTGACC CCAACAAAAA TAGCGTGCTT TACCTTGGTA CACGCGGCCA AGGCCTCTGG
CGCAGCACTG ATTATGGCGC AACCTGGGCG CAAGTTACCA GTTTTCCGGC GATTGGCAAC
TATACCGATC CCTACTTCAA TGACCGAATT GGGGTAGTAT GGGTAACGTT TGATCCACGG
ACTGGCACTA GTGGCAGCGC TTCGCAAACG ATTTATGTTG GGGTTGCTGA TACGACCACC
AGTATTTATC GCAGCACTGA TGGTGGGGCA ACTTGGGCAG CATTACCCGG CCAACCAACT
GCTGGCTATT TACCCCATCA TGCTACCCTC GCCTCAAATG GCATGCTCTA CATTACCTAC
AGCGATACAC CTGGCCCCTA CGATGGTGGC AAGGGTGATG TTTGGAAATA TAACACCGCC
ACCCAAGCTT GGACGAACAT TAGCCCAATT CCTTCGAGCA GCGCCGACGA TTATTTTGGC
TATGGCGGCT TGGCTGTCGA TGCGCAAAAT CCCAATACGT TGTTGGTAAC TAGCCTCAAT
TCATGGTGGC CTGATGCCTT GATTTATCGC AGCACCGATG GCGGCGCAAC ATGGAGCGGT
ATTTGGGAAT GGGCCAACTA TCCCGAACGC ACCTTCCGCT ATACCCAAGA TATTAGTACA
TCGCCATGGC TCGACTTTGG TGGTAATGCA ACCGCACCTG AAGTTGCACC CAAATTAGGC
TGGATGATCG GCGACATCGA GATTGATCCA TTCAACTCAG ATCGCATGCT CTATGGCACT
GGCGCAACGA TTTATGGCAC CGATAATTTG ACCGCTTGGG ATACTGGCGG CAAAGTTGCG
CTTGATGTGC GTGCCCATGG CCTTGAAGAA ACCGCAGTTC TCGAATTGGT CAGCCCACCA
GAAGGCGTGC ACTTGGTGAG TGCCTTGGGT GATATTGCTG GCTTCTATCA TACCGATCTG
ACTAAGGCTC CACCATGGTC GGCTAGCCTC AAATTTGGCT CATCGACCAG TATCGATTTT
GCTGGAACCC GCCCGAATGT GATGGCTCGC ACAGGCTATG TTGGCGAAGG CAGCACGGTC
AAACGGGTGG CGTGTTCATG GGATGGCGGC CAAAATTGGT CGCACGCCAG CAGTGAACCA
ACTGGCTCGG TTGGTGGCGG AAAAATCGCC ATGGCTTCCG ATGCTTCGAT GATTGTTTGG
AGCGGTACGA CTGCTCCAGT CAGTTACTCG ATGGGCTGTG GTAATAGTTG GTCGGCGAGT
GCTGGTATTC CGGCTGGGGC AGTCGTGATT GCTGATCGGG TGAAATCGAC GACTTTCTAT
GGCATCGCCA ATGGTACGCT CTATCGCAGC ACCGATGCAG CCCGCAACTT TACCGCTGTT
GCAACGGGCT TGCCCACCAA GAGCGCCAAA ATCGAGGCTG TTTTGGGCAT CGAAGGCGAT
TTGTGGTTGG CTGGCGGCCA AGATGGCTTG TTCCATTCAA CCGATGGTGG CACAACATTC
AGCAAAATTG CTGGAGTGCA GGTTGCCAAT GTGGTTGGTT TTGGTAAGGC AGCACCCAGC
CAAAGCTATC AAGCAATCTA CATTACTGGA GCAATTGATG GCGTAGAAGG CTTCTTCCGC
TCAGATGATG GCGGCACAAC ATGGGTGCGC ATCAACGACG ATCAACATCA ATATGGCTCG
ACCAACGAAA CTATTACTGG CGACCCACGG ATTTATGGGC GGGTTTATAT TGGTACGAAT
GGCCGTGGGA TTATTTATGG TGATATTGCT GGCTCAACTC CGACCAATAC ACCAGCCCCA
GCCACGGCGA CCGCAACTCG CATACCAACC GCAACTTCAA CCAGCGTACC ACCGACGGCA
ACGAGTGTTC CGCCAACCGC GACCTCGACT CGCGTGCCAA CGGCGACTCC GCTGACTTCA
ACTGCGATTC CCTATACGCC AACGCCAACC TCAACCCGTG TACCAACCGC AACCCCAATT
ACGCCAACCA GTTTGCCATA CACGCCAACG TCAACTCCAG CGCCTGGTGG CTGCCGGATT
AACTATGTGG TTAATCAGTG GAATACCGCA TTTACTGGCG GTGTGACAAT TACTAATTTG
GGAGCGCCGA TTAGTGGTGG TTGGACGTTG ACTTGGAATT TTGCCAATGG TCAAACGGTG
ACGAGCAGTT GGAATACTGT GCTGACCCAA ACTGGTTCAG CTGTGACTGC CAAGCATAGC
GCCGATTGGA ATGCCAATAT TCCAACCAAT GGCACACAAA GCTTTGGCTT TATTGCTAGT
CATAATGGCA CAAATGCCGT GCCAACCAAC TTTGTTTTGA ATGGCGTTGC TTGTAGCGTT
GCGCCATAG
 
Protein sequence
MRMLKMAAAV VVLFAGVIQA MVAVPQSRAA STQPYNWSNV EIVGGGFVPG IIYNPTERDL 
VYARTDIGGA YRWNPTTKRW IPLTDWIGAT DWNWTGIDSL ATDPVDPNRL YLAAGTYTND
WTSANGAILR STNKGNSWAR TDLPFKLGGN MPGRSMGERL AIDPNKNSVL YLGTRGQGLW
RSTDYGATWA QVTSFPAIGN YTDPYFNDRI GVVWVTFDPR TGTSGSASQT IYVGVADTTT
SIYRSTDGGA TWAALPGQPT AGYLPHHATL ASNGMLYITY SDTPGPYDGG KGDVWKYNTA
TQAWTNISPI PSSSADDYFG YGGLAVDAQN PNTLLVTSLN SWWPDALIYR STDGGATWSG
IWEWANYPER TFRYTQDIST SPWLDFGGNA TAPEVAPKLG WMIGDIEIDP FNSDRMLYGT
GATIYGTDNL TAWDTGGKVA LDVRAHGLEE TAVLELVSPP EGVHLVSALG DIAGFYHTDL
TKAPPWSASL KFGSSTSIDF AGTRPNVMAR TGYVGEGSTV KRVACSWDGG QNWSHASSEP
TGSVGGGKIA MASDASMIVW SGTTAPVSYS MGCGNSWSAS AGIPAGAVVI ADRVKSTTFY
GIANGTLYRS TDAARNFTAV ATGLPTKSAK IEAVLGIEGD LWLAGGQDGL FHSTDGGTTF
SKIAGVQVAN VVGFGKAAPS QSYQAIYITG AIDGVEGFFR SDDGGTTWVR INDDQHQYGS
TNETITGDPR IYGRVYIGTN GRGIIYGDIA GSTPTNTPAP ATATATRIPT ATSTSVPPTA
TSVPPTATST RVPTATPLTS TAIPYTPTPT STRVPTATPI TPTSLPYTPT STPAPGGCRI
NYVVNQWNTA FTGGVTITNL GAPISGGWTL TWNFANGQTV TSSWNTVLTQ TGSAVTAKHS
ADWNANIPTN GTQSFGFIAS HNGTNAVPTN FVLNGVACSV AP
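
The protein sequence can be reproduced from the gene sequence by removing the spacing between the 10-character blocks and translating codon by codon. The sketch below builds the codon table from the compact NCBI layout (bases ordered T, C, A, G). Translation table 11 assigns the same amino acids to codons as the standard code and differs only in which start codons it permits, so the table applies here because this gene begins with ATG. The helper names are illustrative.

```python
BASES = "TCAG"
# Amino acids for all 64 codons in NCBI order (first base varies slowest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean_sequence(text: str) -> str:
    """Remove the spaces and line breaks used to lay out sequence blocks."""
    return "".join(text.split()).upper()


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    dna = clean_sequence(dna)
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


first_row = "ATGCGAATGC TAAAGATGGC CGCAGCAGTA GTGGTGCTGT TTGCAGGCGT AATTCAAGCG"
print(translate(first_row))  # MRMLKMAAAVVVLFAGVIQA
```

Applied to the complete gene sequence, the same function can be used to check the result against the protein sequence listed above.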