Gene EcSMS35_4023 details

Gene Information

Locus tag: EcSMS35_4023
Symbol:
ID: 6147192
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4102441
End bp: 4104138
Gene Length: 1698 bp
Protein Length: 565 aa
Translation table: 11
GC content: 36%
IMG OID: 641618848
Product: invasion protein regulator
Protein accession: YP_001745986
Protein GI: 170680759
COG category: [K] Transcription
COG ID: [COG3710] DNA-binding winged-HTH domains
TIGRFAM ID:
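
The coordinate and length fields above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original record. It assumes the start and end coordinates are 1-based and inclusive, and that the final codon is a stop codon that is not counted in the protein length. All variable names are illustrative.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 4102441
end_bp = 4104138
gene_length_bp = 1698
protein_length_aa = 565

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span_bp = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and does not contribute a residue.
codon_count = span_bp // 3
expected_protein_aa = codon_count - 1

lengths_consistent = (
    span_bp == gene_length_bp
    and span_bp % 3 == 0
    and expected_protein_aa == protein_length_aa
)

print("span (bp):", span_bp)
print("expected protein length (aa):", expected_protein_aa)
print("lengths consistent:", lengths_consistent)
```

Under these assumptions the span, the gene length, and the protein length reported in the record agree with one another.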


Plasmid Coverage information

Num covering plasmid clones: 25
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 48
Fosmid unclonability p-value: 0.876946
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGCAACTGC AACATGGAAA AAATAATACA CAACAATATA TATTTGGCGA CTTTGTATTA 
AAAAACAATG GAATATTATT ATTCAAAAAC AAAGAATATC ATATTCCCCC GAAAGAACTT
GGCGTTATCA TTCTATTACT TAATGCTGAC GGTGAAATTG TTAGCAAAGA AGAGATTATC
GATAAGGTCT GGAGCGCAAG TGTTGCCAGT GATGAATCTC TAACACGATG TATTTATGCA
TTGCGCAAAC TTTTGCATGA GAATAAACAG TGTAAGTATA TCGAAACGGT CTATGGTCGT
GGATACCGTT TTACTGTACC TATTGTGATT GTGACCGATA ATGAACCGGT AAAGTCGACC
ACCACTCTTG CCGTTTTTCC CTTTCGTACT GAAGGTTCGA TCAATGTTCT AAAACTCCAT
TACGAACTTG TCCAGGGCCT TTCAAAATAT GCTTTTTGTG GTTTAGATAT TTTACCCGCT
TCGGTAACAA ATGAGGCCAG TGATTTCTCC ACCATCCATC AGTTTATCAA TAAAACAGGT
CCAGAATATT ATATTATGGG TCAGGTCGTT CACTATGGTA AAAACTGGCG TTTATTTATT
GAGTTAATCC ATGCCAGAAC GCATAAATTG ATTGATCATC AAAGTATTGA TTTCAACCCC
GATAATCCTC TTTCCGTATT ATTATCACAA GTAATTAATA TTCTTATTGA AAAAATACCC
AACATTAATT TTAAATCTAT TAATACACAA CAAATGCCAT CTCTGGATTC TGCTGTTATG
TATATGAATG GCAGGATGGA AATGTACCGT TATACACCAG AAAGTTTGAA ACGAGCATTA
GTAATATTTA GAGATTGTGT GAGTACGCAA CCGCAAAATA CCTTGCCCTA CTGTTGCCTG
GCGGAATGCC ATATATCATT AGCACTCCTT GGAATCAGTG AACAAAAACA AGCAATAACT
ACGGCAGTAA CCTCCATTGA AAAGGCTCTG GAAATTAACC CTTCAAATTC TCAGGCATTG
GGTTTACTGG GTTTAATCAG TGGGCTAAAA GATGAACATT CAGTGTCTAA TGTGCTATTT
AAGCAAGCAC ATTTACTCAA ACCTAATTCG CCAGACGTTT ATTATTATCA GTCGCTGCTA
TATTTTTTGA ACGGCGACCT TGCAAGAGCA TTTAATTTAA TAGAAAAAAG CATTGCTCTT
GAACCCAATA AAATGGGCAT AAGCATTCTT AAATTAATTA TACTTTATTA CACATCCCCC
CTCGATAACG CGATATCTTT TGCATTAAAT CTTAACAGTC AAAACACCTG TAATAATCCA
ATTATTACCT CTATTCTGGC TATGTTCATG GCACTGAAAG GACATAATGA CAAAGCCAAA
AGTTTATTGC TTAAACTTGA GCCTGAACAT GGTCTTGATT ACACCTGCGT TAACTCGCTT
TATACGAAGT TTCTGATCTA TGGCGCATCT ATAAAAAATG ACATTATGAA GTTACTGGCA
AATATTAACA CTAACAAAAT CAATGGTGTG ATTTTGCCGT TAATTTACAC TGTTTATGGT
AAGAAAGAAT ATGAAAAGCG CTGGCAACAA TTAATTAAAG AGAATGATTT ATGGTCAAAT
GTTTTGCTTC ATGACCCCCG TTTGCTCAGT GTAAAAAATG AATTTAATAC TATTGGTGTG
ATGCGTACTT CTGCGTAA
 
Protein sequence
MQLQHGKNNT QQYIFGDFVL KNNGILLFKN KEYHIPPKEL GVIILLLNAD GEIVSKEEII 
DKVWSASVAS DESLTRCIYA LRKLLHENKQ CKYIETVYGR GYRFTVPIVI VTDNEPVKST
TTLAVFPFRT EGSINVLKLH YELVQGLSKY AFCGLDILPA SVTNEASDFS TIHQFINKTG
PEYYIMGQVV HYGKNWRLFI ELIHARTHKL IDHQSIDFNP DNPLSVLLSQ VINILIEKIP
NINFKSINTQ QMPSLDSAVM YMNGRMEMYR YTPESLKRAL VIFRDCVSTQ PQNTLPYCCL
AECHISLALL GISEQKQAIT TAVTSIEKAL EINPSNSQAL GLLGLISGLK DEHSVSNVLF
KQAHLLKPNS PDVYYYQSLL YFLNGDLARA FNLIEKSIAL EPNKMGISIL KLIILYYTSP
LDNAISFALN LNSQNTCNNP IITSILAMFM ALKGHNDKAK SLLLKLEPEH GLDYTCVNSL
YTKFLIYGAS IKNDIMKLLA NINTNKINGV ILPLIYTVYG KKEYEKRWQQ LIKENDLWSN
VLLHDPRLLS VKNEFNTIGV MRTSA