Gene Cthe_3149 details

Gene Information

Locus tag: Cthe_3149
Symbol:
ID: 4809712
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 3721116
End bp: 3722546
Gene Length: 1431 bp
Protein Length: 476 aa
Translation table: 11
GC content: 43%
IMG OID: 640108582
Product: aminoacyl-histidine dipeptidase
Protein accession: YP_001039537
Protein GI: 125975627
COG category: [E] Amino acid transport and metabolism
COG ID: [COG2195] Di- and tripeptidases
TIGRFAM ID: [TIGR01893] aminoacyl-histidine dipeptidase
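
The start, end, gene length and protein length listed above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the protein length excludes a single terminal stop codon; the function names are illustrative and not part of the database.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residues encoded by a CDS, assuming one terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length(3721116, 3722546)
print(length)                  # 1431
print(protein_length(length))  # 476
```

Both results agree with the Gene Length and Protein Length fields of this record.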


Plasmid Coverage information

Num covering plasmid clones: 14
Plasmid unclonability p-value: 0.35499
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGTCAACAA TTCACACAAT CGAGCCGCGC AAAGTGTTTC ACTGGTTTTA TCAAATTAAC 
CAGATTCCGC GATGTTCCGG CAATGAAAAA AGAATCAGCG ATTTTTTGGT GAATTTCGCC
AGAGAAAGAA ATCTGGAAGT TTATCAGGAC GAACTTTACA ACGTAATCAT CAAAAAGCCT
GCAACTCCGG GCTATGAAAA TGCGCCTGCC GTCATTATAC AAGGCCACAG TGACATGGTC
TGCATAAAAG GTGAAGGTTC CAACCATAAT TTTGACACGG ATCCTATCGA AATGATTGTG
GAAGGTGACA TTTTAAGAGC GAACAACACA ACCCTTGGCG GAGACGATGG GATTGCCGTT
GCTTACGGTT TGGCAATTTT GGATTCCGAT GATTTAAAAC ATCCTGCCAT TGAACTTTTG
GTCACGACGA GGGAAGAGAC GGGCATGGAC GGGGCGATGG CTCTGACCGG TGAACATTTA
AGCGGGAAAA TACTGCTTAA CATTGATTCA GACGAGGAAG GTGTTTTTTT AGTCAGCTGT
GCCGGCGGTG CAAACCAGAT TGTTACCTTC CCGTTAAAAA AGGAGAAGAA AAGAGGCACG
GGCCTTAAAA TTAAAGTTTC CGGCCTTAAA GGCGGCCACT CCGGAATGGA GATTGTCAAG
CAAAGGGCAA ACGCAATTAA ACTTTTGGCC CGTATTTTGG ACCAATGCAG GGACAAGGTT
ACTTTGGCAA AGATTACGGG TGGCAGCAAA CATAATGCAA TTCCAAAGGA AGCGGAAGCG
GTTGTTTTGA CAGAAGATTT GGAAGGCACG GTACGTATAG TAGAGTCCCT GGCAAAGGAA
TTGAAAGAAG AATACCGGGT GGAAGACAGC GGACTTACTG TTACCGTAAC GGAAGTTGGA
GTTGAAGAAG TTTTCTTAAA GCAAATATCC AACGATGTAA TTGATTTTCT GATGATGACG
CCGGACGGCG TTCAGTATAT GTCAAAGGAT ATCGAAGGTT TGGTTCAGAC GAGTGTCAAC
AATGCCGTGG TGGAAGAAAA AGAAGGGCGG CTCGTTGTTA CAATATCTCT CCGTTCTTCA
TCGGAAAGTT CTTTAAGAGA AATGTTAAAC CGTGTAGCGC TGATTGCAAA AAGGACAAAC
GGGATGGCCA AAGAAAGCAA TTTTTATCCG GCATGGGAGT ATGATGACAA GTCTGAAATA
AGAAAAACGG CAGTCAGGGT TTATGAAGAG ATGTTTGACA AAAAAGCAAA ATTGACTGCG
GTTCATGCGG GACTTGAATG CGGTGTGCTC AAGAAAAAAT TGCCTGATGT TGATATGATT
AGTTTCGGGC CAAATTTATA TGACATTCAT ACGGAAAAAG AGCATCTTAG CATCTCATCG
GTGGAAAGAG TTTGGAGATT TCTAATTCGT TTGCTTGAAG AGATAAAATA A
 
Protein sequence
MSTIHTIEPR KVFHWFYQIN QIPRCSGNEK RISDFLVNFA RERNLEVYQD ELYNVIIKKP 
ATPGYENAPA VIIQGHSDMV CIKGEGSNHN FDTDPIEMIV EGDILRANNT TLGGDDGIAV
AYGLAILDSD DLKHPAIELL VTTREETGMD GAMALTGEHL SGKILLNIDS DEEGVFLVSC
AGGANQIVTF PLKKEKKRGT GLKIKVSGLK GGHSGMEIVK QRANAIKLLA RILDQCRDKV
TLAKITGGSK HNAIPKEAEA VVLTEDLEGT VRIVESLAKE LKEEYRVEDS GLTVTVTEVG
VEEVFLKQIS NDVIDFLMMT PDGVQYMSKD IEGLVQTSVN NAVVEEKEGR LVVTISLRSS
SESSLREMLN RVALIAKRTN GMAKESNFYP AWEYDDKSEI RKTAVRVYEE MFDKKAKLTA
VHAGLECGVL KKKLPDVDMI SFGPNLYDIH TEKEHLSISS VERVWRFLIR LLEEIK
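
The sequences above are printed in blocks of 10 characters separated by spaces, so they need cleaning before any computation. The sketch below, with hypothetical helper names, strips the whitespace and computes a GC fraction of the kind reported in the GC content field. It assumes the sequence contains only unambiguous bases and is demonstrated on the first line of the gene sequence only.

```python
def clean_sequence(blocked: str) -> str:
    """Remove spaces and line breaks from a blocked sequence listing."""
    return "".join(blocked.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a cleaned nucleotide sequence."""
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


# First line of the gene sequence, as printed in the record.
first_line = "ATGTCAACAA TTCACACAAT CGAGCCGCGC AAAGTGTTTC ACTGGTTTTA TCAAATTAAC"
seq = clean_sequence(first_line)
print(len(seq))  # 60
print(round(gc_fraction(seq), 2))
```

Passing the full gene sequence through the same two functions gives a value that can be compared against the GC content field.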