Gene EcSMS35_1868 details

Gene Information

Locus tag: EcSMS35_1868
Symbol: trpE
ID: 6145204
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 1891466
End bp: 1893028
Gene Length: 1563 bp
Protein Length: 520 aa
Translation table: 11
GC content: 54%
IMG OID: 641616744
Product: anthranilate synthase component I
Protein accession: YP_001743922
Protein GI: 170680165
COG category: [E] Amino acid transport and metabolism; [H] Coenzyme transport and metabolism
COG ID: [COG0147] Anthranilate/para-aminobenzoate synthases component I
TIGRFAM ID: [TIGR00565] anthranilate synthase component I, proteobacterial subset
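
The coordinate and length fields above are mutually consistent, which can be checked with a little arithmetic. The following is a minimal sketch, assuming the Start bp and End bp values are 1-based inclusive coordinates and that the coding sequence ends in a single stop codon; the function names are hypothetical and not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature given 1-based inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Number of amino acids, assuming one terminal stop codon that is not translated."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length(1891466, 1893028)
print(length)                  # 1563
print(protein_length(length))  # 520
```

Both results match the Gene Length and Protein Length fields of this record.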


Plasmid Coverage information

Num covering plasmid clones: 25
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 11
Fosmid unclonability p-value: 0.00000000112472
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
ATGCAAACAC AAAAACCGAC TCTCGAACTG CTAACTTGCA AAGGCGCTTA TCGCGACAAC 
CCGACTGCGC TTTTTCACCA GTTGTGTGGG GATCGTCCGG CAACGCTGCT GCTGGAATCC
GCAGATATCG ACAGCAAAGA TGATTTAAAA AGTCTATTGC TAGTGGACAG TGCGCTGCGC
ATTACAGCAT TAGGTGACAC TGTCACTATT CAGGCACTTT CCGGCAATGG CGAAGCCCTG
TTGGCATTAC TGGATAACGC CCTGCCTGCG GGTGTGGAAA ATGAACAATC ACCAAACTGC
CGCGTGCTGC GCTTCCCTCC TGTCAGTCCA TTGCTGGATG AAGACGCCCG CTTATGCTCC
CTTTCGGTTT TTGACGCTTT CCGCTTATTG CAGAATCTGT TGAATGTACC GAAGGAAGAA
CGGGAAGCGA TGTTCTTTGG TGGCCTGTTC TCTTACGACC TGGTCGCAGG ATTTGAAGAT
TTACCGCAAC TGGCAGCGGA AAATAACTGC CCTGATTTCT GTTTTTATCT CGCTGAAACG
CTGATGGTGA TTGACCATCA GAAAAAAAGC ACCCGTATTC AGGCCAGCCT GTTTGCTCCC
AATGAAAAAG AAAAACAACG TCTCACTGCT CGCCTGAACG AACTTCGCCA GCAACTGACC
GAAACCGCGC CACCGCTGCC GGTGGTTTCC GTGCCGCATA TGCGTTGTGA ATGTAATCAG
AGCGATGAAG AGTTCGGTGG CGTGGTGCGT TCGTTGCAAA AAGCGATTCG CGCCGGAGAA
ATTTTCCAGG TGGTGCCATC TCGCCGTTTC TCTCTACCCT GCCCGTCACC GCTGGCGGCT
TATTACGTGC TGAAAAAGAG TAATCCCAGC CCGTACATGT TTTTTATGCA GGATAATGAT
TTCACCCTGT TTGGCGCGTC GCCGGAAAGT TCGCTCAAGT ATGACGCCAC CAGCCGCCAG
ATTGAGATCT ACCCGATTGC CGGAACACGC CCACGCGGTC GTCGTGCCGA TGGTTCACTG
GACAGAGACC TCGACAGCCG CATCGAACTC GAAATGCGTA CCGACCATAA AGAGCTTTCT
GAACATCTGA TGCTGGTGGA TCTCGCCCGT AATGACCTGG CTCGCATTTG CACACCCGGC
AGCCGCTACG TTGCCGATCT CACCAAAGTT GACCGTTACT CTTACGTGAT GCACCTGGTC
TCCCGCGTTG TTGGTGAGCT GCGCCACGAT CTCGACGCCC TGCACGCTTA CCGCGCCTGT
ATGAATATGG GGACGTTAAG CGGTGCGCCG AAAGTACGCG CTATGCAGTT AATTGCCGAA
GCAGAAGGTC GTCGACGCGG CAGCTACGGC GGCGCGGTAG GTTATTTTAC CGCGCATGGC
GATCTCGACA CCTGCATTGT GATCCGCTCG GCGCTGGTGG AAAACGGTAT CGCCACCGTG
CAAGCCGGTG CTGGCGTAGT CCTTGATTCT GTTCCGCAGT CGGAAGCCGA CGAAACCCGT
AATAAAGCCC GCGCTGTACT GCGCGCTATT GCCACCGCGC ATCATGCACA GGAGACCTTC
TGA
 
Protein sequence
MQTQKPTLEL LTCKGAYRDN PTALFHQLCG DRPATLLLES ADIDSKDDLK SLLLVDSALR 
ITALGDTVTI QALSGNGEAL LALLDNALPA GVENEQSPNC RVLRFPPVSP LLDEDARLCS
LSVFDAFRLL QNLLNVPKEE REAMFFGGLF SYDLVAGFED LPQLAAENNC PDFCFYLAET
LMVIDHQKKS TRIQASLFAP NEKEKQRLTA RLNELRQQLT ETAPPLPVVS VPHMRCECNQ
SDEEFGGVVR SLQKAIRAGE IFQVVPSRRF SLPCPSPLAA YYVLKKSNPS PYMFFMQDND
FTLFGASPES SLKYDATSRQ IEIYPIAGTR PRGRRADGSL DRDLDSRIEL EMRTDHKELS
EHLMLVDLAR NDLARICTPG SRYVADLTKV DRYSYVMHLV SRVVGELRHD LDALHAYRAC
MNMGTLSGAP KVRAMQLIAE AEGRRRGSYG GAVGYFTAHG DLDTCIVIRS ALVENGIATV
QAGAGVVLDS VPQSEADETR NKARAVLRAI ATAHHAQETF
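
The sequences above are laid out in space-separated blocks of ten characters. As a minimal sketch of how such a listing might be processed, the code below strips the whitespace, computes GC content, and translates codons into amino acids. It assumes the amino acid assignments of translation table 11, which are the same as those of the standard code; alternative start codons are not handled, which does not matter here because the gene begins with ATG. All function names are hypothetical.

```python
# Codon table built from the conventional TCAG ordering; translation table 11
# shares these amino acid assignments with the standard genetic code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(blocked: str) -> str:
    """Remove the spaces and line breaks used in the blocked listing."""
    return "".join(blocked.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = clean(seq)
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def translate(seq: str) -> str:
    """Translate complete codons from position 0, stopping at the first stop codon."""
    seq = clean(seq)
    protein = []
    for pos in range(0, len(seq) - len(seq) % 3, 3):
        amino_acid = CODON_TABLE[seq[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First line of the gene sequence listed above.
first_line = "ATGCAAACAC AAAAACCGAC TCTCGAACTG CTAACTTGCA AAGGCGCTTA TCGCGACAAC"
print(translate(first_line))  # MQTQKPTLELLTCKGAYRDN
print(round(gc_percent(first_line), 1))
```

The translated fragment matches the first two blocks of the protein sequence listing. Passing the full gene sequence to `gc_percent` gives a value that can be compared against the GC content field of the record.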