Gene EcSMS35_3997 details


Gene Information

Locus tag: EcSMS35_3997
Symbol:
ID: 6142601
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4075489
End bp: 4077042
Gene Length: 1554 bp
Protein Length: 517 aa
Translation table: 11
GC content: 41%
IMG OID: 641618822
Product: putative PTS regulatory protein
Protein accession: YP_001745961
Protein GI: 170680552
COG category: [K] Transcription
COG ID: [COG3711] Transcriptional antiterminator
TIGRFAM ID:
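
The coordinate and length fields above are related by simple arithmetic. The sketch below recomputes the gene and protein lengths from the start and end positions, assuming the coordinates are 1-based and inclusive and that the final codon of the CDS is a stop codon. The record does not state either assumption, but both are consistent with the values it lists. The function names are hypothetical and chosen for illustration only.

```python
# Coordinates copied from the record above.
# Assumption: positions are 1-based and inclusive.
START_BP = 4075489
END_BP = 4077042


def gene_length(start: int, end: int) -> int:
    """Length in bp of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def protein_length(gene_len: int) -> int:
    """Amino-acid count for a CDS whose last codon is a stop codon (assumption)."""
    if gene_len % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len // 3 - 1  # drop the terminal stop codon


length_bp = gene_length(START_BP, END_BP)
length_aa = protein_length(length_bp)
print(length_bp, length_aa)  # 1554 517
```

Both results agree with the Gene Length and Protein Length fields of the record.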


Plasmid Coverage information

Num covering plasmid clones: 24
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 64
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGCAAATGA TTACCTCGCG ACAAAACAGA TTATTACGAT TCCTTCTACC ACGAAGGGAA 
TATACGACTA TTGTTACAAT TGCCGGCTAT TTAAATGTTT CGGAAAAAAC CATTCAACGT
GATTTACGTT TACTTGAGCA ATGGCTGGGG CAATGGAGAA TAAATGTTGA GAAGCGTGCT
GGCGCGGGTG TGATGTTAAG CGCGGAGAAT ATTGCTGATT TGCTGCATCT TGATCATTTG
TTGGGAGCAG AATGTGAAGA GATTGATGGT GTAATGAATA ATGCCAGGCG CGTTAAAATA
GCGTCGCAGT TATTAAGTGA AACACCGAAT GAAACGTCGA TCAGTAAATT GTCAGAACGC
TACTTTATCA GCGGAGCCTC TATTGTTAAT GACCTGAGAG TAATTGAGTC CTGGCTTGCG
CCGTTGGGGT TATCATTGAT CCGCAGCCCA AGTGGTACGC ATATTGAAGG TAGTGAAGGG
CAAGTCCGAC AGGCAATGGC ATTACTGATT AACGGCATTA TTAACCATAA TGAGCCGCAA
GGTGTCGTGT ATTCACGTCT GGATCCCGGA AGCTATAAAG CATTAGTCCA TTATTTTGGA
GAGGAAGAAG TGTTATTTGT CCAGTCATTG TTACTGGATA TGGAAAATGA ATTAAGTTGG
TCTTTGGGAG AACCTTATTA CGTTAACATT TTTACTCACA TCCTTATTAT GATGTATCGC
AACACGCACG GGAATGCGTT ATCAAGAGAA GAAGATCAAA CCAGGCAATA TGATGAAAAT
ATCTTTAATG TTGCCAGTCA GATGATTCAT AAGATAGAAC AACGAATTGC ACATACATTG
CCCGATGATG AAGTCTGGTT TATTTATCAA TATATCATTT CATCAGGTGT GGCGATTGAT
GGACAAAAAG ATGTGAGCAT TATTTCACAT ATGCAGGCCA GCAATGAAGC GCGTCTGATT
ACCTGGCGTT TAATTACGGT ATTCAGTGAC ATCGTGGACT GCGATTTTAG TGAAGACAGC
GCATTATATG ATGGCTTAAT GGTGCATATT AAACCGCTGA TTAACCGACT AAATTATCGT
ATTCATATCC GTAATCCATT GTTGGAAGAT ATTAAAGCAG AACTAGCGGA TGTCTGGCGG
TTGACGCAAT ATGTGGTGAA TCAGGTATTT AAAACCTGGG GTGAGAATGC AGTGAGCGAG
GATGAAGTGG GTTACCTGAC CGTTCATTTT CAGGCTGCGA TGGAGCGGCA AATTGCCCGT
AAACGTGTAT TACTGGTCTG TTCAACCGGA ATCGGAACTT CGCATCTACT GAAAAGCCGT
ATTCTGCGAG CATTTCCTGA ATGGACGATT GTTGATGTTA TTTCAGCAGC GAATTTATCA
CAGGTTTTGC CTGACAATAT CGAACTGATT ATTTCGACAA TTAATTTGCC TACAGTCACT
ATGCCGGTCG CTTATGTTAC CGCTTTTTTT AATGATGCCG ATATTAAGCG GGTCACTGAA
ATGGTGATTA CGGAAAAATT ACATCATGCG ACGTCTCGGG TCGTTGAAAT TTAA
 
Protein sequence
MQMITSRQNR LLRFLLPRRE YTTIVTIAGY LNVSEKTIQR DLRLLEQWLG QWRINVEKRA 
GAGVMLSAEN IADLLHLDHL LGAECEEIDG VMNNARRVKI ASQLLSETPN ETSISKLSER
YFISGASIVN DLRVIESWLA PLGLSLIRSP SGTHIEGSEG QVRQAMALLI NGIINHNEPQ
GVVYSRLDPG SYKALVHYFG EEEVLFVQSL LLDMENELSW SLGEPYYVNI FTHILIMMYR
NTHGNALSRE EDQTRQYDEN IFNVASQMIH KIEQRIAHTL PDDEVWFIYQ YIISSGVAID
GQKDVSIISH MQASNEARLI TWRLITVFSD IVDCDFSEDS ALYDGLMVHI KPLINRLNYR
IHIRNPLLED IKAELADVWR LTQYVVNQVF KTWGENAVSE DEVGYLTVHF QAAMERQIAR
KRVLLVCSTG IGTSHLLKSR ILRAFPEWTI VDVISAANLS QVLPDNIELI ISTINLPTVT
MPVAYVTAFF NDADIKRVTE MVITEKLHHA TSRVVEI
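
The protein sequence follows from the gene sequence by codon-by-codon translation under translation table 11. The sketch below is a minimal, dependency-free illustration, not the database's own pipeline. It builds the codon table from the standard amino-acid assignment, which table 11 shares with the standard code for internal codons, and it does not model alternative start codons, since this gene begins with ATG. The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Codon order follows the usual T, C, A, G convention (first base varies slowest).
BASES = "TCAG"
# Amino-acid assignment shared by the standard code and table 11; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon.

    Whitespace (such as the 10-base blocks used in the listing above) is ignored.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First three 10-base blocks of the gene sequence listed above.
print(translate("ATGCAAATGA TTACCTCGCG ACAAAACAGA"))  # MQMITSRQNR
```

The output matches the first ten residues of the protein sequence in the record. Passing the full gene sequence to `translate` in the same way gives the complete translation for comparison.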