Gene Sare_4274 details

Gene Information

Locus tag: Sare_4274
Symbol:
ID: 5705779
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Salinispora arenicola CNS-205
Kingdom: Bacteria
Replicon accession: NC_009953
Strand:
Start bp: 4850449
End bp: 4852041
Gene Length: 1593 bp
Protein Length: 530 aa
Translation table: 11
GC content: 67%
IMG OID: 641273693
Product: thiamine biosynthesis protein ThiC
Protein accession: YP_001539046
Protein GI: 159039793
COG category: [H] Coenzyme transport and metabolism
COG ID: [COG0422] Thiamine biosynthesis protein ThiC
TIGRFAM ID: [TIGR00190] thiamine biosynthesis protein ThiC
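
The coordinate and length fields above are mutually consistent, which can be checked with a short sketch. It assumes 1-based inclusive coordinates and a gene length that includes the stop codon; neither convention is stated in the record itself, and the function names are illustrative only.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based inclusive coordinates (assumed convention)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming the final codon is a stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length(4850449, 4852041)
print(length)                  # 1593
print(protein_length(length))  # 530
```

Both results agree with the Gene Length and Protein Length fields listed for Sare_4274.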


Plasmid Coverage information

Num covering plasmid clones: 15
Plasmid unclonability p-value: 0.662161
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones:
Fosmid unclonability p-value: 0.0045509
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
ATGCAGAAAC GCCGCAAGGT CTACGTCGAG GGGTCGCGGC CGGATATCCA GGTGCCGTTC 
GCCGAGATCG ACCTGACCGG TGACAACCCG CCGGTCCGGC TCTACGACAC GTCGGGGCCG
GGTTCCGATC CGGAGGCGGG GCTGCCGCCG TTGCGTGGCC GGTGGATTGC GTCCCGCGGG
GACGTTGCTC CGGTGCGCGG TGCCGGTACG CCGTTGGCGG GGGTGGACGG CAGGCGGCCG
ACCCAGCTCG CGTACGCCCG TGCGGGTGTG GTGACGCCGG AGATGGAGTT CGTGGCGATC
CGGGAAGGAC TGGCGCCGGA CCTGGTTCGG GAGGAGATCG CGGCCGGTCG AGCCGTGCTG
CCGCTGAACG TCAACCACCC GGAGTGTGAG CCGGCGATCA TCGGCAAGGC GTTCCTGGTG
AAGATCAATG CGAACATCGG TACGTCGGCG GTCACGTCGT CGGTCGCCGA GGAGGTGGAG
AAGCTGACCT GGGCGACCCG GTGGGGCGCG GACGCGGTGA TGGACCTGTC GACCGGCAAG
CGGATTCACG AGACGCGCGA GGCGGTCGTG CGGAACTCGC CGGTGCCGAT CGGTACCGTA
CCGATCTACC AGGCGTTGGA GAAGGTCGGC GGTGATCCAG CGAAGCTGAG CTGGGAGGTT
TTCCGGGAGA CCGTCATCGA GCAGGCCGAG CAGGGCGTCG ACTACATGAC GGTGCACGCC
GGAGTGCTGC TGTCGTACGT GCCGCTCGCC GTCGAGCGGG TGACGGGGAT CGTTTCCCGG
GGCGGTTCGA TCATGGCAGC GTGGTGCCTG GCCCACCACG AGGAGAACTT CCTCTACACG
AACTTCCGGG AGCTGTGCGA GATCCTGGCT CGATACGACG TGACGTTCTC GCTCGGCGAC
GGGCTGCGTC CCGGCTCGAT CGCGGACGCC AACGACGAGG CGCAGTTCGC CGAGTTGAGG
ACTCTCGGCG AGTTGACGAA GGTCGCCTGG GAACACGACG TCCAGGTGAT GATCGAGGGC
CCCGGGCACG TGCCGATGCA CAAGATCAAG GAAAATGTTG ACCTCCAGCA GGAGTGGTGC
CACGAGGCGC CGTTCTATAC GCTCGGCCCG CTGACCACGG ACATCGCGCC CGCATACGAC
CACATCACCT CGGCCATCGG TGCGGCCATC ATCGGAATGT TCGGAACGGC GATGCTCTGC
TACGTCACTC CGAAGGAGCA CCTCGGGTTG CCGGATCGGG ACGATGTGAA GGCCGGCGTC
ATCGCGTACA AGATTGCCGC GCACGCCGCG GACCTGGCCA AGGGACACCC GGGGGCGCAG
GCGTGGGACG ACGCGCTCTC CAAGGCGCGG TTCGAGTTCC GCTGGGAGGA CCAGTTCAAC
CTCGCGTTGG ACCCGGAGAC GGCCCGCGCG TACCACGACG CCACGTTGCC CGCTGAGCCG
GCGAAGACGG CCCACTTCTG TTCGATGTGC GGCCCGAAGT TCTGCTCCAT GAAGATCACC
CAGGAACTCA AGGAGTACGC GGCGCGCGGC ATGAAGGACA AGTCGGAGGA GTTCGTCGCT
TCCGGTGGTC GCGTCTACCT ACCGCTGGCC TGA
 
Protein sequence
MQKRRKVYVE GSRPDIQVPF AEIDLTGDNP PVRLYDTSGP GSDPEAGLPP LRGRWIASRG 
DVAPVRGAGT PLAGVDGRRP TQLAYARAGV VTPEMEFVAI REGLAPDLVR EEIAAGRAVL
PLNVNHPECE PAIIGKAFLV KINANIGTSA VTSSVAEEVE KLTWATRWGA DAVMDLSTGK
RIHETREAVV RNSPVPIGTV PIYQALEKVG GDPAKLSWEV FRETVIEQAE QGVDYMTVHA
GVLLSYVPLA VERVTGIVSR GGSIMAAWCL AHHEENFLYT NFRELCEILA RYDVTFSLGD
GLRPGSIADA NDEAQFAELR TLGELTKVAW EHDVQVMIEG PGHVPMHKIK ENVDLQQEWC
HEAPFYTLGP LTTDIAPAYD HITSAIGAAI IGMFGTAMLC YVTPKEHLGL PDRDDVKAGV
IAYKIAAHAA DLAKGHPGAQ AWDDALSKAR FEFRWEDQFN LALDPETARA YHDATLPAEP
AKTAHFCSMC GPKFCSMKIT QELKEYAARG MKDKSEEFVA SGGRVYLPLA
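
The relationship between the gene sequence and the protein sequence can be illustrated with a minimal sketch. It builds the codon table by hand on the assumption that translation table 11 assigns the same amino acids to each codon as the standard code (the two differ in permitted start codons), and it strips the spaces used to group the sequence into blocks of ten. The helper names `clean`, `translate`, and `gc_content` are hypothetical, not part of any database tool.

```python
from itertools import product

BASES = "TCAG"
# Amino acids in TCAG order (first, second, third base); '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove whitespace (block spacing, line breaks) and upper-case the sequence."""
    return "".join(seq.split()).upper()


def translate(seq: str) -> str:
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    dna = clean(seq)
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    dna = clean(seq)
    return (dna.count("G") + dna.count("C")) / len(dna)


# First three blocks of the gene sequence listed above.
first_blocks = "ATGCAGAAAC GCCGCAAGGT CTACGTCGAG"
print(translate(first_blocks))  # MQKRRKVYVE
```

Passing the full gene sequence to `translate` and `gc_content` in the same way gives values to compare against the listed protein sequence and the GC content field.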