Gene EcSMS35_4442 details


Gene Information

Locus tag: EcSMS35_4442
Symbol: thiC
ID: 6144890
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4538735
End bp: 4540630
Gene Length: 1896 bp
Protein Length: 631 aa
Translation table: 11
GC content: 56%
IMG OID: 641619262
Product: thiamine biosynthesis protein ThiC
Protein accession: YP_001746378
Protein GI: 170680038
COG category: [H] Coenzyme transport and metabolism
COG ID: [COG0422] Thiamine biosynthesis protein ThiC
TIGRFAM ID: [TIGR00190] thiamine biosynthesis protein ThiC
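
The coordinate and length fields above are mutually consistent, which can be checked with a small sketch. It assumes 1-based inclusive coordinates and that the gene length includes the stop codon; neither convention is stated in the record, and the function names are illustrative only.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature given 1-based inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS whose length includes the stop codon (assumed)."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # drop the terminal stop codon


length_bp = gene_length(4538735, 4540630)
print(length_bp)                  # 1896
print(protein_length(length_bp))  # 631
```

Under these assumptions the result matches the reported Gene Length (1896 bp) and Protein Length (631 aa).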


Plasmid Coverage information

Num covering plasmid clones: 21
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 26
Fosmid unclonability p-value: 0.000546852
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
ATGTCTGCAA CAAAACTGAC CCGCCGCGAA CAACGCGCCC AGGCCCAACA TTTTATCGAC 
ACCCTGGAAG GCACCGCCTT TCCAAACTCA AAACGCATTT ACATCACCGG CACACAACCC
GGCGTGCGCG TGCCGATGCG TGAGATCCAG CTTAGCCCGA CGCTAATCGG CGGTAGCAAA
GAACAGCCGC AGTACGAAGA AAACGAAGCG ATTCCGGTCT ACGACGCCTC CGGCCCGTAT
GGCGATCCGC AGATTGCCAT TAACGTGCAG CAAGGGCTGG CAAAACTACG CCAGCCGTGG
ATCGATGCGC GCGGCGATAC CGAAGAACTT ACCGTGCGCA GTTCCGATTA CACTAAAGCG
CGACTGGCAG ATGATGGCCT CGACGAGCTA CGTTTTAGCG GCGTATTAAC CCCAAAGCGC
GCCAAAGCAG GACGCCGCGT CACCCAACTG CACTACGCCC GCCGGGGCAT CATCACGCCC
GAAATGGAAT TCATCGCCAT TCGCGAGAAT ATGGGTCGCG AGCGCATACG TAGCGAAGTT
TTGCGCCACC AGCATCCGGG AATGAGCTTT GGCGCACGTC TGCCGGAAAA TATCACTGCG
GAATTTGTCC GTGATGAAGT TGCTGCCGGA CGTGCGATTA TCCCGGCCAA CATTAATCAT
CCGGAATCGG AGCCGATGAT TATTGGTCGC AATTTCCTGG TAAAAGTTAA CGCCAATATC
GGCAACTCGG CGGTCACCTC TTCCATCGAA GAAGAAGTGG AAAAGCTGGT ATGGTCCACG
CGCTGGGGAG CGGATACGGT GATGGATCTC TCCACCGGTC GCTATATTCA CGAAACCCGC
GAGTGGATTT TGCGTAACAG CCCGGTGCCG ATCGGTACAG TGCCGATCTA CCAGGCGCTG
GAGAAGGTTA ACGGGATCGC CGAAGATCTT ACCTGGGAAG CGTTCCGCGA CACGCTGCTG
GAACAAGCCG AGCAAGGTGT GGATTACTTC ACTATCCATG CGGGCGTACT GCTGCGCTAT
GTGCCGATGA CCGCGAAACG CCTGACTGGT ATCGTCTCTC GCGGCGGTTC GATTATGGCG
AAATGGTGCC TCTCCCATCA TCAGGAAAAT TTCCTCTATC AACACTTCCG CGAAATTTGT
GAAATCTGTG CCGCTTATGA CGTTTCGCTG TCGCTGGGCG ACGGTCTGCG CCCCGGTTCT
GTTCAGGACG CCAACGATGA AGCGCAATTT GCCGAGCTGC ATACGCTGGG TGAACTGACC
AAAATCGCCT GGGAATATGA CGTACAGGTG ATGATTGAAG GCCCAGGCCA CGTGCCGATG
CAGATGATCC GCCGCAATAT GACCGAGGAG TTAGAGCAAT GCCACGAAGC GCCGTTTTAC
ACTCTGGGGC CGCTAACTAC CGATATCGCG CCGGGTTATG ACCACTTCAC GTCGGGGATT
GGTGCGGCGA TGATTGGCTG GTTTGGCTGC GCGATGCTCT GTTACGTAAC GCCGAAAGAG
CATCTGGGCC TGCCAAATAA AGAAGATGTT AAACAAGGGC TTATCACCTA TAAGATTGCC
GCTCACGCCG CCGACCTGGC GAAAGGGCAT CCGGGCGCGC AAATTCGCGA TAACGCCATG
TCGAAAGCCC GCTTCGAATT TCGCTGGGAA GACCAGTTTA ATCTGGCCCT CGACCCGTTT
ACTGCCCGCG CTTATCACGA TGAAACCCTG CCGCAAGAGT CCGGTAAAGT CGCCCATTTT
TGCTCCATGT GTGGACCGAA ATTCTGCTCG ATGAAAATCA GCCAGGAAGT GCGTGATTAC
GCCGCCGCGC AAACCATTGA AGTGGGAATG GCGGATATGT CGGAGAACTT CCGCGCCAGA
GGCGGAGAAA TCTACCTGCG TAAGGAGGAA GCGTGA
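
A GC content figure like the one in the Gene Information section can be computed from a sequence printed in this 10-base block layout. The sketch below is a minimal version that assumes GC content means the percentage of G and C among all bases; the record does not say how its value was derived or rounded, and `gc_percent` is an illustrative name.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases; whitespace between blocks is ignored."""
    bases = "".join(seq.split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# First block of the gene sequence above, used only as a small demonstration.
print(gc_percent("ATGTCTGCAA"))  # 40.0
```

Passing the full gene sequence, pasted as a single string, gives the gene-wide value to compare with the reported 56%.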
 
Protein sequence
MSATKLTRRE QRAQAQHFID TLEGTAFPNS KRIYITGTQP GVRVPMREIQ LSPTLIGGSK 
EQPQYEENEA IPVYDASGPY GDPQIAINVQ QGLAKLRQPW IDARGDTEEL TVRSSDYTKA
RLADDGLDEL RFSGVLTPKR AKAGRRVTQL HYARRGIITP EMEFIAIREN MGRERIRSEV
LRHQHPGMSF GARLPENITA EFVRDEVAAG RAIIPANINH PESEPMIIGR NFLVKVNANI
GNSAVTSSIE EEVEKLVWST RWGADTVMDL STGRYIHETR EWILRNSPVP IGTVPIYQAL
EKVNGIAEDL TWEAFRDTLL EQAEQGVDYF TIHAGVLLRY VPMTAKRLTG IVSRGGSIMA
KWCLSHHQEN FLYQHFREIC EICAAYDVSL SLGDGLRPGS VQDANDEAQF AELHTLGELT
KIAWEYDVQV MIEGPGHVPM QMIRRNMTEE LEQCHEAPFY TLGPLTTDIA PGYDHFTSGI
GAAMIGWFGC AMLCYVTPKE HLGLPNKEDV KQGLITYKIA AHAADLAKGH PGAQIRDNAM
SKARFEFRWE DQFNLALDPF TARAYHDETL PQESGKVAHF CSMCGPKFCS MKISQEVRDY
AAAQTIEVGM ADMSENFRAR GGEIYLRKEE A
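
The protein sequence corresponds to a codon-by-codon translation of the gene sequence. The sketch below illustrates this for the first and last codons. It uses the standard amino acid assignments, which translation table 11 shares with table 1 (the two differ in permitted start codons, which this sketch does not model). The `translate` helper is illustrative and is not the database's own tool.

```python
BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES), AMINO
    )
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First three 10-base blocks of the gene sequence above.
print(translate("ATGTCTGCAA CAAAACTGAC CCGCCGCGAA"))  # MSATKLTRRE
# Final 15 bases of the gene sequence, ending in the TGA stop codon.
print(translate("AAGGAGGAAGCGTGA"))  # KEEA
```

Both results agree with the start (MSATKLTRRE) and end (KEEA) of the protein sequence listed above.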