Gene Franean1_1837 details


Gene Information

Locus tag: Franean1_1837
Symbol: thiH
ID: 5670239
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Frankia sp. EAN1pec
Kingdom: Bacteria
Replicon accession: NC_009921
Strand:
Start bp: 2206156
End bp: 2207322
Gene Length: 1167 bp
Protein Length: 388 aa
Translation table: 11
GC content: 75%
IMG OID: 641240758
Product: thiamine biosynthesis protein ThiH
Protein accession: YP_001506181
Protein GI: 158313673
COG category: [H] Coenzyme transport and metabolism; [R] General function prediction only
COG ID: [COG1060] Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes
TIGRFAM ID: [TIGR02351] thiazole biosynthesis protein ThiH
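
The coordinate and length fields in this table can be cross-checked with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that the record does not state: that Start bp and End bp are 1-based inclusive positions, and that the coding sequence ends in a single stop codon that is not counted in the protein length. The variable names are made up for this example.

```python
# Values copied from the Gene Information table.
start_bp = 2206156
end_bp = 2207322
reported_gene_length_bp = 1167
reported_protein_length_aa = 388

# Assumption: Start bp and End bp are 1-based and both inclusive.
gene_length_bp = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and is not part of the protein.
codon_count = gene_length_bp // 3
protein_length_aa = codon_count - 1

print(gene_length_bp, gene_length_bp == reported_gene_length_bp)
print(protein_length_aa, protein_length_aa == reported_protein_length_aa)
```

Under these assumptions both computed values agree with the reported 1167 bp and 388 aa.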


Plasmid Coverage information

Num covering plasmid clones: 10
Plasmid unclonability p-value: 0.151757
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones:
Fosmid unclonability p-value: 0.173638
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGCCAGCC CGGCAGGGCT GTTCGCCCGC GAGCTCGCCG CGCTCGACAT CCCGGCGCTC 
GCCCGTGTCT CGGTCGAGGC CGACGAGGCG CGGGTCGACG CCGTCCTGCG CCGGGCCGTC
GCTGCCGGGC GGCCCGACGC CGGCGGCCGG CTCGACCTCG CCGACCTCGC CGTCCTGCTG
TCGCCGGCCG CGACCGGGCG GCTGGAGGAG CTGGCGCAGG CGGCGCGGGA GACGACGCTG
CGCCGGTTCG GCCGGGCGGT GCGGCTGTTC GCCCCGCTGT ACGTGTCGAA CGCCTGCCTG
TCGTCCTGCA CCTACTGCGG GTTCGCCAAG GGGCTGGAGG TGGCCCGGCG CACCCTGACG
GTCGACGAAG CCGAGGCCGA GGCACGCCTG CTGGCCGACC GCGGCTTCCG GCACATCCTG
CTGGTCTCCG GGGAGCACCG CGTCGAGGTC TCCGCCGGGT ACCTGGTGGA CGTCGTCGAG
CGGCTGCGAC CGTTCGTCCC CTCGATCTCG GTGGAGACCC AGACCTGGTC GGACGACACC
TACAGCCGGC TGGTCGTGGC CGGGCTCGAC GGCGTCGTCC ACTACCAGGA GACCTACGAC
CGGGAGCGCT ACGCGCAGGT GCACGTGGCC GGGTGGAAGC GCGACTACGA CCGCCGGCTG
TCCTCCTTCG AGCGGGCGGC CCGCGCCGGC GCCCGCCGTC TGGGCCTCGG CGTCCTGCTC
GGCCTGGCGC CGGACTGGCG GGCCGACGTC CTCGCGCTCG CCGCGCACGC CTCGTTCCTC
GCCCGCCGCT TCTGGCGGAC GGAGGTCTCG GTGGCGCTGC CGAGGATCAA GCCGAGCGCC
AGTGGCTTCC CACCGACCGT CGTCGTCGGC GACGCCGAGT TCGTCCAGGC GCACGCGGCG
CTGCGGCTGT TCGAACCGGA CGCGGCGATC TCGCTGTCGA CCCGCGAGCC GGCGGCCCTG
CGTGACGGCC TGGTCCGCAT CGCGGTGACC ACGATGAGCG CCGGCTCGTC CACCGAGCCA
GGTGGGTACG GGCGGCCCGG GACGGCGCAG GAGCAGTTCT CCATCTCCGA CGAGCGATCC
CCGGCGGACG TCGCCGCGAT GCTCGTCGGC GCCGGCTACG AGCCTGTCTG GAAGGACGCG
TTCCCGCTGG TCGACGCCGC CGGCTGA
 
Protein sequence
MASPAGLFAR ELAALDIPAL ARVSVEADEA RVDAVLRRAV AAGRPDAGGR LDLADLAVLL 
SPAATGRLEE LAQAARETTL RRFGRAVRLF APLYVSNACL SSCTYCGFAK GLEVARRTLT
VDEAEAEARL LADRGFRHIL LVSGEHRVEV SAGYLVDVVE RLRPFVPSIS VETQTWSDDT
YSRLVVAGLD GVVHYQETYD RERYAQVHVA GWKRDYDRRL SSFERAARAG ARRLGLGVLL
GLAPDWRADV LALAAHASFL ARRFWRTEVS VALPRIKPSA SGFPPTVVVG DAEFVQAHAA
LRLFEPDAAI SLSTREPAAL RDGLVRIAVT TMSAGSSTEP GGYGRPGTAQ EQFSISDERS
PADVAAMLVG AGYEPVWKDA FPLVDAAG
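
The gene and protein sequences are printed in blocks of ten characters, so they need to be joined before any analysis. The following is a minimal sketch, not the database's own tooling, of how one might strip the spacing, compute a GC fraction, and translate the DNA with the amino acid assignments of translation table 11 (which are the same as those of the standard code). The function names are hypothetical, and the sketch does not handle the alternative start codons that table 11 permits; this gene starts with ATG, so that does not matter here.

```python
import itertools

# Codon table in the conventional T, C, A, G ordering. The amino acid
# assignments of translation table 11 match the standard genetic code.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}


def clean_sequence(text):
    """Remove spaces and line breaks from a blocked sequence listing."""
    return "".join(text.split()).upper()


def gc_fraction(text):
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = clean_sequence(text)
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(text):
    """Translate DNA codon by codon, stopping at the first stop codon."""
    seq = clean_sequence(text)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First three blocks and the final line of the gene sequence above.
print(translate("ATGGCCAGCC CGGCAGGGCT GTTCGCCCGC"))  # MASPAGLFAR
print(translate("TTCCCGCTGG TCGACGCCGC CGGCTGA"))     # FPLVDAAG
```

The two printed fragments match the start and end of the listed protein sequence. Passing the full gene sequence to gc_fraction and translate would allow the reported GC content and protein sequence to be checked the same way.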