Gene Cag_1268 details

Gene Information

Locus tag: Cag_1268
Symbol: thiH
ID: 3748306
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Chlorobium chlorochromatii CaD3
Kingdom: Bacteria
Replicon accession: NC_007514
Strand:
Start bp: 1733993
End bp: 1735063
Gene Length: 1071 bp
Protein Length: 356 aa
Translation table: 11
GC content: 52%
IMG OID: 637773806
Product: thiamine biosynthesis protein ThiH
Protein accession: YP_379572
Protein GI: 78189234
COG category: [H] Coenzyme transport and metabolism
              [R] General function prediction only
COG ID: [COG1060] Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes
TIGRFAM ID: [TIGR02351] thiazole biosynthesis protein ThiH
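
The length fields above are consistent with the listed coordinates. The following is a minimal sketch of that arithmetic, assuming Start bp and End bp are 1-based inclusive positions and that the final codon is an untranslated stop codon. The variable names are illustrative and are not part of the record.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 1733993
end_bp = 1735063

# Assumption: coordinates are 1-based and inclusive, so add 1.
gene_length = end_bp - start_bp + 1

# Assumption: the CDS ends in one stop codon that is not translated.
codon_count = gene_length // 3
protein_length = codon_count - 1

print(gene_length)     # 1071
print(protein_length)  # 356
```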


Plasmid Coverage information

Num covering plasmid clones: 38
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGGCAGAAA TTCCAGCTTG GCTCCATACC ACAAACGATG CAAACGCCCT TGCTTCTTTA 
CTTGCACCCA ATGCAACACG ATCGCTTGAA TCGCTTGCAG CAGAAGCATC GGCTATCACA
CGCCGCCGTT TTGGACGCAC CATAACGCTC TACGCGCCGC TTTACCTTTC AAACCATTGC
TCTAACGGCT GCGCTTATTG CGGCTTTGCT TCTGACCGAA CAACGCCGCG CCGCCGCCTT
GAAATGGAGG AGATTCGCCG CGAAATAGCA GCTATGAAAG CACTCGGCAT TAGCGATATT
TTGCTCTTGA CGGGAGAGCG CACACCTGCG GCTGATTTCG ACTATTTGCG CCAAAGCGTA
GCACTTGCGG CTGAAGAAAT GCAGCGCGTT GCCGTTGAAG CCTTCCCCAT GAGCGTAGCC
GAATATCGAG CCTTAGCGGA GAGCGGCTGC ACCAGCGTTA CCATTTACCA AGAGACCTAC
AATCGCAAAC AATACGAAGC GCTTCACCGC TGGGGAGCAA AAAAAGATTT TCTCTATCGG
CTTGAAACGC CTGCCCGCGC ACTTGAAGCC GGCATTAAGC ATGTAGGGCT TGGCGTACTC
TTGGGACTTT CCGATCCAAT AGAAGATGCC CTTTGCCTCT ACCGCCATGT GCGCCATCTT
GAACGGCGCT ACTGGCGAGC TGGATTTTCC ATCTCCTTTC CCCGCTTGCG CCCCGAAAGC
GGCGGCTATC AACCACCATT TCCTGTTGAC GATCGCCAAC TTGCCCGCCT GATTATGGCG
TTCCGCATTG CACTGCCAAA CATCGAATTA GTACTTTCCA CCCGCGAAAG TGCTCGCTTT
CGCGATGGCA TGGCAACCCT CGGCATTACT CGCATGAGCG TTGAAAGCCG CACCACTGTT
GGAGGCTATG CAGAAAACGA AACCATTAAA AGCAGTGCAG GACAGTTTGA AATTTGCGAT
GACCGCAACG TTGAAGAGTT TTGTGCCGCT TTACGAACAC AGCAGATTGA GCCAATTTTT
AAGAATTGGG AACGCGCTTA CAATGCGCCA TCAATGAGCT GCTTTTTATA A
 
Protein sequence
MAEIPAWLHT TNDANALASL LAPNATRSLE SLAAEASAIT RRRFGRTITL YAPLYLSNHC 
SNGCAYCGFA SDRTTPRRRL EMEEIRREIA AMKALGISDI LLLTGERTPA ADFDYLRQSV
ALAAEEMQRV AVEAFPMSVA EYRALAESGC TSVTIYQETY NRKQYEALHR WGAKKDFLYR
LETPARALEA GIKHVGLGVL LGLSDPIEDA LCLYRHVRHL ERRYWRAGFS ISFPRLRPES
GGYQPPFPVD DRQLARLIMA FRIALPNIEL VLSTRESARF RDGMATLGIT RMSVESRTTV
GGYAENETIK SSAGQFEICD DRNVEEFCAA LRTQQIEPIF KNWERAYNAP SMSCFL
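
The gene and protein sequences above can be cross-checked by stripping the 10-character block spacing and translating the nucleotides codon by codon. The sketch below assumes the standard amino-acid assignments, which translation table 11 shares with the standard code (the two differ in which codons may act as start codons). The helper names `clean`, `gc_fraction`, and `translate` are hypothetical, and only the first 30 bases of the gene are used to keep the example short.

```python
from itertools import product

# Standard codon assignments in NCBI order (bases T, C, A, G).
# Assumption: table 11 uses these same amino-acid assignments.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean(seq):
    """Remove the spaces and line breaks used to format sequence blocks."""
    return "".join(seq.split()).upper()


def gc_fraction(seq):
    """Return the fraction of G and C bases in a non-empty sequence."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq):
    """Translate codon by codon, stopping at the first stop codon."""
    seq = clean(seq)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 30 bases of the gene sequence, as formatted in the record.
first_30 = "ATGGCAGAAA TTCCAGCTTG GCTCCATACC"
print(translate(first_30))  # MAEIPAWLHT
print(round(gc_fraction(first_30), 2))
```

Passing the full gene sequence to `translate` and `gc_fraction` in the same way allows a comparison against the listed protein sequence and the GC content field.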