Gene EcSMS35_1870 details

Gene Information

Locus tag: EcSMS35_1870
Symbol: trpC
ID: 6143856
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 1894624
End bp: 1895985
Gene Length: 1362 bp
Protein Length: 453 aa
Translation table: 11
GC content: 53%
IMG OID: 641616746
Product: bifunctional indole-3-glycerol phosphate synthase/phosphoribosylanthranilate isomerase
Protein accession: YP_001743924
Protein GI: 170683763
COG category: [E] Amino acid transport and metabolism
COG ID: [COG0134] Indole-3-glycerol phosphate synthase; [COG0135] Phosphoribosylanthranilate isomerase
TIGRFAM ID:
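
The coordinate and length fields above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original record: it assumes the start and end positions are 1-based and inclusive, and that the coding sequence ends in a single stop codon that is not counted in the protein length. The function names `span_length` and `coding_codons` are hypothetical helpers introduced for illustration.

```python
# Values copied from the Gene Information fields above.
START_BP = 1894624
END_BP = 1895985
GENE_LENGTH_BP = 1362
PROTEIN_LENGTH_AA = 453


def span_length(start: int, end: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


def coding_codons(gene_length_bp: int) -> int:
    """Number of amino-acid codons, assuming one terminal stop codon."""
    if gene_length_bp % 3 != 0:
        raise ValueError("gene length is not a multiple of 3")
    return gene_length_bp // 3 - 1


if __name__ == "__main__":
    # Both comparisons hold for the values listed in this record.
    print(span_length(START_BP, END_BP) == GENE_LENGTH_BP)
    print(coding_codons(GENE_LENGTH_BP) == PROTEIN_LENGTH_AA)
```

Under these assumptions, 1895985 - 1894624 + 1 gives 1362 bp, and 1362 / 3 - 1 gives 453 aa, which agrees with the listed gene and protein lengths.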


Plasmid Coverage information

Num covering plasmid clones: 27
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 15
Fosmid unclonability p-value: 0.000000113165
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
ATGATGCAAA CCGTTTTAGC GAAAATCGTC GCAGACAAGG CGATTTGGGT AGAAACCCGC 
AAAGAGCAGC AACCGCTGGC CAGTTTTCAG AATGAGGTTC AGCCGAGCAC GCGACATTTT
TATGATGCAC TTCAGGGCGC ACGCACGGCG TTTATTCTGG AGTGTAAAAA AGCGTCGCCG
TCAAAAGGCG TGATCCGTGA TGATTTCGAT CCGGCACGCA TTGCAACCGT CTACAGAAAT
TACGCTTCGG CAATTTCGGT GCTGACTGAC GAGAAATATT TTCAGGGGAG CTTTGATTTC
CTCCCCATTG TCAGCCAAAT CGCCCCACAG CCGATTTTAT GTAAAGACTT TATTATCGAC
CCTTACCAGA TCTATCTGGC GCGCTATTAC CAGGCTGATG CCTGCTTATT AATGCTTTCA
GTACTGGATG ACGAACAATA TCGCCAGCTT GCCGCCGTCG CCCACAGTCT GGAGATGGGT
GTGCTGACCG AAGTCAGTAA TGAAGAGGAA CTGGAGCGCG CCATTGCATT GGGGGCAAAG
GTCGTTGGCA TCAACAACCG CGATCTGCGC GATTTGTCGA TTGATCTCAA CCGTACCCGC
GAGCTTGCAC CGAAACTGGG GCACAACGTG ACGGTAATCA GCGAATCCGG CATCAATACT
TACGCTCAGG TGCGCGAGTT AAGCCACTTC GCTAACGGTT TTCTGATTGG TTCGGCGTTG
ATGGCCCATG ACGATTTGAA CGCCGCCGTG CGCCGGGTGT TGCTGGGTGA GAATAAAGTG
TGTGGCCTGA CGCGTGGGCA AGATGCTAAA GCAGCTTATG ACGCGGGCGC GATTTACGGT
GGGTTGATTT TTGTTGCGAC ATCACCGCGT TGCGTCAACG TTGAACAGGC GCAGGAAGTG
ATGGCTGCGG CACCGTTGCA GTATGTTGGC GTGTTCCGCA ATCACGATAT TGCCGATGTG
GTGGACAAAG CTAAGGTGTT ATCGCTGGCG GCAGTGCAAC TGCATGGTAA AGAAGATCAG
CTGTATATCG ACACTCTGCG TGAGGCTCTG CCAGCACACG TCGCCATCTG GAAGGCTTTA
AGTGTCGGTG AAACTCTTCC CGCGCGCGAT TTTCAGCACA TCGATAAATA TGTATTCGAC
AACGGTCAGG GCGGGAGCGG ACAACGTTTC GACTGGTCAC TATTAAATGG TCAATCGCTT
GGCAACGTTC TGCTGGCGGG GGGCTTAGGC GCAGATAACT GCGTGGAAGC GGCACAAACC
GGCTGCGCCG GACTTGATTT TAATTCTGCT GTAGAGTCGC AACCGGGCAT CAAAGACGCA
CGTCTTTTGG CCTCGGTTTT CCAGACGCTG CGCGCATATT AA
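
The gene sequence above is printed in blocks of 10 bases, so it has to be joined before any computation. The sketch below shows one way to do that and to compute a GC percentage comparable to the "GC content" field. The helper names `clean_sequence` and `gc_percent` are hypothetical, and only the first row of the sequence is used as sample input, so its result is not expected to equal the 53% reported for the full gene.

```python
def clean_sequence(text: str) -> str:
    """Join a blocked, multi-line sequence into one uppercase string."""
    return "".join(text.split()).upper()


def gc_percent(text: str) -> float:
    """Percentage of G and C bases in a (possibly blocked) DNA sequence."""
    seq = clean_sequence(text)
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# First row of the gene sequence, copied as printed above.
FIRST_ROW = "ATGATGCAAA CCGTTTTAGC GAAAATCGTC GCAGACAAGG CGATTTGGGT AGAAACCCGC"

if __name__ == "__main__":
    print(len(clean_sequence(FIRST_ROW)))  # bases in the first row
    print(round(gc_percent(FIRST_ROW), 1))  # GC% of the first row only
```

To compare against the listed GC content, pass the whole gene sequence (all rows) to `gc_percent` rather than a single row.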
 
Protein sequence
MMQTVLAKIV ADKAIWVETR KEQQPLASFQ NEVQPSTRHF YDALQGARTA FILECKKASP 
SKGVIRDDFD PARIATVYRN YASAISVLTD EKYFQGSFDF LPIVSQIAPQ PILCKDFIID
PYQIYLARYY QADACLLMLS VLDDEQYRQL AAVAHSLEMG VLTEVSNEEE LERAIALGAK
VVGINNRDLR DLSIDLNRTR ELAPKLGHNV TVISESGINT YAQVRELSHF ANGFLIGSAL
MAHDDLNAAV RRVLLGENKV CGLTRGQDAK AAYDAGAIYG GLIFVATSPR CVNVEQAQEV
MAAAPLQYVG VFRNHDIADV VDKAKVLSLA AVQLHGKEDQ LYIDTLREAL PAHVAIWKAL
SVGETLPARD FQHIDKYVFD NGQGGSGQRF DWSLLNGQSL GNVLLAGGLG ADNCVEAAQT
GCAGLDFNSA VESQPGIKDA RLLASVFQTL RAY
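
The protein sequence can be reproduced from the gene sequence by codon translation. The record lists translation table 11 (bacterial, archaeal and plant plastid code), whose codon-to-amino-acid assignments are the same as the standard code; it differs in which codons may serve as start codons. The sketch below is a minimal illustration under that assumption: it does not handle alternative start codons (not needed here, since this gene begins with ATG), and the `translate` function and `CODON_TABLE` name are hypothetical.

```python
# Build the codon table from the compact NCBI-style amino-acid string,
# with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate(text: str) -> str:
    """Translate a DNA sequence, stopping at the first stop codon.

    Whitespace (block separators, line breaks) is ignored. Codons with
    unexpected characters become 'X'; a trailing partial codon is dropped.
    """
    seq = "".join(text.split()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        residue = CODON_TABLE.get(seq[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # First three blocks of the gene sequence as printed above.
    print(translate("ATGATGCAAA CCGTTTTAGC GAAAATCGTC"))  # MMQTVLAKIV
    # Last twelve bases of the gene sequence, ending in the TAA stop codon.
    print(translate("CGCGCATATT AA"))  # RAY
```

The two sample calls correspond to the first ten residues (MMQTVLAKIV) and the last three residues (RAY) of the protein sequence shown above.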