Gene Cthe_1008 details

Gene Information

Locus tag: Cthe_1008
Symbol:
ID: 4811302
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 1205397
End bp: 1206518
Gene Length: 1122 bp
Protein Length: 373 aa
Translation table: 11
GC content: 40%
IMG OID: 640106426
Product: aminodeoxychorismate lyase
Protein accession: YP_001037433
Protein GI: 125973523
COG category: [R] General function prediction only
COG ID: [COG1559] Predicted periplasmic solute-binding protein
TIGRFAM ID: [TIGR00247] conserved hypothetical protein, YceG family
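
The coordinates and lengths listed above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the reported protein length excludes the stop codon; the function names are hypothetical and chosen only for illustration.

```python
# Values copied from the Gene Information record.
start_bp = 1205397
end_bp = 1206518
reported_gene_length = 1122    # bp
reported_protein_length = 373  # aa


def gene_length(start: int, end: int) -> int:
    """Feature length, assuming 1-based inclusive coordinates."""
    return end - start + 1


def protein_length(gene_len: int) -> int:
    """Residue count for an unspliced CDS, not counting the stop codon."""
    if gene_len % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len // 3 - 1


computed_gene_length = gene_length(start_bp, end_bp)
computed_protein_length = protein_length(computed_gene_length)

# Both comparisons hold for this record under the stated assumptions.
print(computed_gene_length == reported_gene_length)
print(computed_protein_length == reported_protein_length)
```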


Plasmid Coverage information

Num covering plasmid clones: 10
Plasmid unclonability p-value: 0.00821655
Plasmid hitchhiking: Yes
Plasmid clonability: hitchhiker
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAGCGGAA ACGGTGCGGC AGTATCTAAA AACAAAACAA AAAAGCGAAA AAACAGACTC 
GCGTCTTTAT TTTTGTACTT TCTTGTTTTC CTTATTATAT TTACAGTCAG CACCTTGGCT
TCTTATACAT ACTTCATAAA TGAGAAAGAG ATCAATTATG AAGAAGTTAT GGCAAAAATA
GACCCCGAAA ACGGTATTCA GGTTGAAATC CCCCGGGGAG CCAATACGGA CGACATTGCA
AACATCCTCA GGGAGCACGG AGTAATAAAA TATCCTTTTT GGTTTAAGTT TGTTTCCAAA
TTCAACGGCT ATGACGGCCG TTACAAATCA GGAAAACATA TTGTGAACAA AGACCTTAAA
TATAAAGAAA TTATGGAAAT ACTCTGCAGC AACCCTGTTA CAACCACCGT TACCATAATC
GAGGGTAAAA ATACGGATCA AATTGCTGAT ATTCTAAGCG AAAAAAAGGT TATCGACAAG
GAGGCCTTTC TTGAAGCCTG CAATACCGAA AAATTTGATT ATGAATTTTT AAAAGACATT
CCGGAAAATC CTCAAAGAGA AAACAAACTT GAGGGATACC TTTTCCCGGA TACCTATTTC
TTTGACCCAA AAGCAGGAGA ACGGGCAATA ATTGAAAAGT TTTTAGATAA CTTTGATGCA
AAATTCAAGC CGGAATTTTA TGAAAGAGCC AAAGAGCTGA ACATGACGGT GGACGAGGTA
ATTATTCTTG CCTCCATAAT AGAAAGGGAA ACAGCTCTCC CCGAAGAAAG GCCAATTGTT
TCCAGCGTAT TTCACAACAG GCTGAAGTCT TCGGACCCCA ATCTAAAAAA GCTTGAATCC
TGTGCCACCG TACAATATGT TTTGTACAAA ACTCAGGGAA AAATGAAGGA AAAGTTGTCC
GACGAGGATA CAAAAATAGA CCACCCGTAC AACACATACC TTTATGAGGG GCTTCCGCCG
GGACCCATAT GCTGTCCTGG TCTTGCCTCC ATAGAAGCTG CATTGTATCC GGACGAAGAA
TCAGAGTATC TGTACTTTGT GGCAAAAGGA GACGGAAGTC ACGAATTTTC AAGAACTTTG
GCCGAACATT TGGAAGCTGT TAAAAAGTAT CAATCCAATT GA
 
Protein sequence
MSGNGAAVSK NKTKKRKNRL ASLFLYFLVF LIIFTVSTLA SYTYFINEKE INYEEVMAKI 
DPENGIQVEI PRGANTDDIA NILREHGVIK YPFWFKFVSK FNGYDGRYKS GKHIVNKDLK
YKEIMEILCS NPVTTTVTII EGKNTDQIAD ILSEKKVIDK EAFLEACNTE KFDYEFLKDI
PENPQRENKL EGYLFPDTYF FDPKAGERAI IEKFLDNFDA KFKPEFYERA KELNMTVDEV
IILASIIERE TALPEERPIV SSVFHNRLKS SDPNLKKLES CATVQYVLYK TQGKMKEKLS
DEDTKIDHPY NTYLYEGLPP GPICCPGLAS IEAALYPDEE SEYLYFVAKG DGSHEFSRTL
AEHLEAVKKY QSN
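
The gene and protein sequences above can be compared by translating the nucleotide sequence, and the reported GC content can be recomputed from the same string. The sketch below is illustrative only: it assumes the sequence is given 5' to 3' on the coding strand, and it relies on the fact that translation table 11 assigns the same amino acids to codons as the standard code (the tables differ in permitted start codons, which does not matter here because the gene begins with ATG). The helper names `clean`, `translate`, and `gc_percent` are hypothetical.

```python
from itertools import product

# Codon-to-amino-acid assignments in T, C, A, G order (shared by the
# standard code and translation table 11); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used to group the sequence in blocks of 10."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    dna = clean(dna)
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


def gc_percent(dna: str) -> float:
    """Percentage of G and C bases in the sequence."""
    dna = clean(dna)
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


# First three blocks of the gene sequence, as printed in the record.
first_blocks = "ATGAGCGGAA ACGGTGCGGC AGTATCTAAA"
print(translate(first_blocks))  # MSGNGAAVSK, the first block of the protein sequence
```

Passing the full gene sequence to `translate` and `gc_percent` allows the result to be compared against the listed protein sequence and the reported GC content of 40%.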