Gene Sare_2021 details

Gene Information

Locus tag: Sare_2021
Symbol:
ID: 5704458
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Salinispora arenicola CNS-205
Kingdom: Bacteria
Replicon accession: NC_009953
Strand:
Start bp: 2311854
End bp: 2313062
Gene Length: 1209 bp
Protein Length: 402 aa
Translation table: 11
GC content: 67%
IMG OID: 641271512
Product: glycosyl transferase family protein
Protein accession: YP_001536883
Protein GI: 159037630
COG category: [C] Energy production and conversion
              [G] Carbohydrate transport and metabolism
COG ID: [COG1819] Glycosyl transferases, related to UDP-glucuronosyltransferase
TIGRFAM ID: [TIGR01426] glycosyltransferase, MGT family
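
The listed gene and protein lengths are consistent with the start and end coordinates. A minimal sketch of that arithmetic, assuming the coordinates are 1-based and inclusive and that the final codon is a stop codon not counted in the protein (both are assumptions, not stated in the record):

```python
# Coordinates as listed in the Gene Information table.
# Assumption: 1-based, inclusive positions on the replicon.
start_bp = 2311854
end_bp = 2313062

# An inclusive interval covers end - start + 1 bases.
gene_length = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon, so it adds no amino acid.
protein_length = gene_length // 3 - 1

print(gene_length, protein_length)  # 1209 402
```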


Plasmid Coverage information

Num covering plasmid clones: 17
Plasmid unclonability p-value: 0.887952
Plasmid hitchhiking: No
Plasmid clonability: normal

Fosmid Coverage information

Num covering fosmid clones:
Fosmid unclonability p-value: 0.0195486
Fosmid Hitchhiker: No
Fosmid clonability: normal

Sequence

Gene sequence
ATGCACCAGC GGCACATTCT CTTCGCCAAC GTCCAGGGAC ACGGACACAT CTACCCCTCG 
CTCGGCCTGG TGAGCGAGTT GGTTCGGCGG GGCCACCGGA TCAGCTACGT CACGACACCG
CTCTTCGCCG ACGTGGTGAC CACGGCCGGC GCCACCGTGG TGCCGTACAA GTCCGAGTTC
GACACTTTCC ACGTACCCGA CGCCGTCACG AAGGCGGACG CGGAAACCCA GTTACACCTG
GTGTACGTCC GGGAGAACGA GGCGATCCTG CGCGCCGCCG AGGACGGGCT CGGCGACGAC
ATCCCGGATC TCGTGGCCTA CGACGTCTTC CCGTTCATCG CCGGACGATT GCTGGCGACC
CGGTGGCGTC GTCCGGCCGT CCGCCTCAGC GCCGGCTTCA CCGCCAACGA GCACTACTCC
CTCTTCGAAG AGCTGTGGAA GAATCACGGT CAGCGCCACC CTGCGGACGT GCCGGAGGTC
AACGACGTGA TCGTCGATCT GCTCGCCCGC TACGGGGTAC ACACCCCGGT CCGACAATTC
TGGAACGAAA TCGAAGACCT CAACATCGCA TTCCTGCCGA AGTCCTTCCA GCCCTTCTCG
GAAACCTTCG ACAACGACCG TTTCGCCTTC GTCGGGCCGA CGCTGACCGA TCGGCCGGTA
CGCACCGGCT GGCAGCCCCC CGCTCCGGAC ACTCCGGTGA TCCTGGTGTC ACTTGGCAAC
CAGTTCAACG AGCATCCAGA GTTCTTCCGG ACCTGCGCCG AGGCGTTCGC CGACACCCGG
TGGCAGACAG TGTTGGCGAT CGGCACCTTC CTCGACCCCG CCGCTCTGGA TCCACTGCCG
CCGAACGCAG AGGCCCATCC GTGGATCCCG TTCCACGAGG TCCTGCCACA CGCGGACGCC
TGCGTCACCC ACGGTACGAC CGGCGCCGTG CTGGACTCCC TGGCAGCCGG CGTACCACTG
GTCCTGGTGC CGCACTTCGC GACCGAGGCC GCACCATCGG CGCGCCGCGT CGTCGAGCTG
GGGCTCGGCT ACGAGCTGAC GCCGGAGCAG GTCGAGCCGG CGACCATCCG AGCTACCGTG
CAACGGTTCC ACGACGACAC GGCGCTGCGC GAGCGGGTCG ATCGGATGCG ACACGAGATC
CAGACGGCCG GTGGCCCGCC GGCGGCAGCC GACCGGATCG AGAGGTACCT GAACCAGTCC
CGGGGATGA
 
Protein sequence
MHQRHILFAN VQGHGHIYPS LGLVSELVRR GHRISYVTTP LFADVVTTAG ATVVPYKSEF 
DTFHVPDAVT KADAETQLHL VYVRENEAIL RAAEDGLGDD IPDLVAYDVF PFIAGRLLAT
RWRRPAVRLS AGFTANEHYS LFEELWKNHG QRHPADVPEV NDVIVDLLAR YGVHTPVRQF
WNEIEDLNIA FLPKSFQPFS ETFDNDRFAF VGPTLTDRPV RTGWQPPAPD TPVILVSLGN
QFNEHPEFFR TCAEAFADTR WQTVLAIGTF LDPAALDPLP PNAEAHPWIP FHEVLPHADA
CVTHGTTGAV LDSLAAGVPL VLVPHFATEA APSARRVVEL GLGYELTPEQ VEPATIRATV
QRFHDDTALR ERVDRMRHEI QTAGGPPAAA DRIERYLNQS RG
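
The sequences above are printed in space-separated blocks of ten characters, so they need to be joined before any analysis. The sketch below strips that whitespace, computes the GC fraction, and translates a gene sequence into protein. The helper names `clean`, `gc_content` and `translate` are hypothetical, introduced only for illustration. The codon table is built from the standard codon-to-amino-acid assignments, which translation table 11 shares for codon meaning (it differs in which codons may act as starts); alternative start codons are not handled here.

```python
from itertools import product

# Codon assignments in NCBI order (first base varies slowest, order T, C, A, G).
# Assumption: table 11 is used only for codon meaning, not start-codon rules.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq):
    """Remove the block spacing and line breaks used in the listing."""
    return "".join(seq.split()).upper()


def gc_content(seq):
    """Return the fraction of G and C bases in the sequence."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq):
    """Translate codon by codon, stopping at the first stop codon."""
    seq = clean(seq)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First three blocks of the gene sequence listed above.
first_blocks = "ATGCACCAGC GGCACATTCT CTTCGCCAAC"
print(translate(first_blocks))  # MHQRHILFAN
print(gc_content(first_blocks))
```

Passing the full gene sequence to `gc_content` and `translate` gives values that can be compared with the GC content and protein sequence reported in the record.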