Gene EcSMS35_0033 details

Gene Information

Locus tag: EcSMS35_0033
Symbol: carB
ID: 6142597
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 35964
End bp: 39185
Gene Length: 3222 bp
Protein Length: 1073 aa
Translation table: 11
GC content: 57%
IMG OID: 641614934
Product: carbamoyl phosphate synthase large subunit
Protein accession: YP_001742150
Protein GI: 170680299
COG category:
    [F] Nucleotide transport and metabolism
    [I] Lipid transport and metabolism
    [E] Amino acid transport and metabolism
COG ID:
    [COG0439] Biotin carboxylase
    [COG0458] Carbamoylphosphate synthase large subunit (split gene in MJ)
TIGRFAM ID: [TIGR01369] carbamoyl-phosphate synthase, large subunit
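
The coordinate and length fields above are mutually consistent, which can be checked with a little arithmetic. The sketch below assumes that the start and end positions are 1-based and inclusive, and that the coding sequence ends in a single stop codon that is not counted in the protein length. The variable names are illustrative only.

```python
# Values copied from the record above.
start_bp = 35964
end_bp = 39185

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length_bp = end_bp - start_bp + 1

# Assumption: the final codon is a stop codon and encodes no residue.
codon_count = gene_length_bp // 3
protein_length_aa = codon_count - 1

print(gene_length_bp, protein_length_aa)  # prints "3222 1073"
```

Both results agree with the Gene Length and Protein Length fields.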


Plasmid Coverage information

Num covering plasmid clones: 14
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 53
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGCCAAAAC GTACAGATAT AAAAAGTATC CTGATTCTGG GTGCGGGCCC GATTGTTATC 
GGTCAGGCGT GTGAGTTTGA CTACTCTGGC GCGCAAGCGT GTAAAGCCCT GCGTGAAGAG
GGCTACCGCG TCATTCTGGT GAACTCCAAC CCGGCGACCA TCATGACCGA CCCGGAAATG
GCTGATGCAA CCTACATCGA GCCGATTCAC TGGGAAGTGG TACGCAAAAT TATTGAAAAA
GAGCGCCCGG ACGCGGTGCT GCCAACCATG GGGGGGCAGA CGGCGCTGAA CTGTGCGCTG
GAGCTTGAGC GTCAGGGCGT GCTGGAAGAG TTCGGCGTCA CCATGATTGG TGCCACTGCC
GATGCGATTG ATAAAGCCGA AGACCGCCGC CGTTTCGACG TAGCGATGAA GAAAATTGGT
CTGGAAACCG CGCGTTCCGG TATCGCACAC ACGATGGAAG AAGCGCTAGC GGTTGCCGCT
GAAGTTGGCT TCCCGTGCAT TATTCGCCCA TCCTTTACCA TGGGCGGTAG CGGTGGCGGT
ATCGCTTATA ACCGTGAAGA GTTTGAAGAA ATTTGCGCCC GCGGTCTGGA TCTCTCTCCG
ACCAAAGAGT TGCTGATTGA CGAGTCGCTG ATCGGCTGGA AAGAGTACGA GATGGAAGTG
GTGCGTGATA AAAACGACAA CTGCATCATC GTCTGCTCTA TCGAAAACTT TGATGCGATG
GGCATCCACA CCGGTGACTC CATCACTGTC GCGCCAGCCC AAACGCTGAC CGACAAAGAA
TATCAAATCA TGCGTAACGC CTCGATGGCG GTGCTGCGTG AAATCGGCGT TGAAACCGGT
GGTTCTAACG TCCAGTTTGC GGTGAACCCG AAAAACGGTC GCCTGATTGT TATCGAAATG
AACCCGCGCG TGTCGCGTTC TTCGGCGCTG GCGTCGAAAG CCACTGGTTT CCCGATTGCT
AAAGTGGCGG CGAAACTGGC GGTGGGTTAC ACCCTCGACG AACTGATGAA CGACATCACC
GGCGGGCGTA CTCCGGCCTC CTTCGAGCCG TCCATCGACT ACGTGGTCAC CAAAATTCCT
CGCTTCAACT TCGAGAAATT CGCCGGGGCG AACGACCGTC TGACCACTCA GATGAAATCG
GTTGGCGAAG TGATGGCGAT TGGGCGCACG CAGCAGGAAT CCCTGCAAAA AGCGCTGCGC
GGCCTGGAAG TGGGCGCAAC TGGCTTCGAC CCGAAAGTCA GCCTCGATGA TCCGGAAGCG
CTGACCAAAA TTCGCCGCGA ACTGAAAGAT GCAGGCGCAG AGCGTATCTG GTATATCGCC
GATGCTTTCC GCGCGGGCCT GTCCGTAGAC GGCGTGTTCA ACCTGACCAA CATTGACCGC
TGGTTCCTGG TACAAATTGA AGAGCTGGTG CGTCTGGAAG AGAAAGTCGC AGAAGTGGGC
ATCACTGGCC TGAACGCTGA CTTCCTGCGC CAGCTGAAAC GCAAAGGCTT TGCCGACGCG
CGTCTGGCAA AACTCGCGGG CGTGCGTGAA GCAGAAATCC GCAAGCTGCG CGACCAGTAT
GACCTGCACC CGGTTTATAA GCGCGTGGAT ACCTGTGCGG CAGAGTTCGC CACCGACACC
GCTTACATGT ACTCCACTTA TGAAGAAGAG TGCGAAGCGA ATCCGTCTAC CGACCGTGAA
AAAATCATGG TGCTCGGCGG CGGCCCGAAC CGTATCGGTC AGGGTATCGA ATTCGACTAC
TGCTGCGTAC ACGCCTCGCT GGCGCTGCGC GAAGACGGTT ACGAAACCAT TATGGTTAAC
TGTAACCCGG AAACCGTCTC TACCGACTAC GACACTTCCG ATCGCCTCTA CTTCGAGCCG
GTAACTCTGG AAGACGTGCT GGAAATTGTG CGTATTGAGA AGCCGAAAGG CGTTATCGTC
CAGTACGGCG GTCAGACCCC GCTGAAACTG GCGCGCGCGC TGGAAGCGGC TGGCGTACCG
GTTATCGGTA CCAGCCCGGA TGCCATCGAC CGTGCGGAAG ACCGTGAACG CTTCCAGCAT
GCGGTTGAGC GTCTGAAACT GAAACAACCG GCGAACGCCA CCGTTACCAC CATCGAAATG
GCGGTTGAGA AAGCGAAAGA GATTGGCTAC CCGCTGGTGG TGCGTCCGTC TTACGTTCTC
GGCGGTCGGG CGATGGAAAT CGTCTATGAC GAAGCTGACC TGCGTCGCTA CTTCCAGACA
GCGGTCAGCG TATCTAACGA TGCGCCAGTG TTGCTGGACC ACTTCCTTGA TGACGCAGTA
GAAGTTGACG TGGATGCCAT CTGCGACGGC GAAATGGTGC TGATTGGCGG CATCATGGAG
CATATTGAGC AGGCGGGCGT GCACTCCGGT GACTCCGCAT GTTCTCTGCC AGCCTACACC
TTAAGTCAGG AAATTCAGGA TGTGATGCGC CAGCAGGTGC AGAAACTGGC CTTCGAATTG
CAGGTGCGCG GCCTGATGAA CGTGCAGTTT GCGGTGAAAA ACAACGAAGT CTACCTGATT
GAAGTTAACC CGCGTGCGGC GCGTACCGTT CCGTTCGTCT CCAAAGCCAC CGGCGTACCG
CTGGCGAAAG TGGCGGCGCG CGTGATGGCA GGCAAAACGC TGGCCGAGCA GGGCGTAACC
AAAGAAGTTA TCCCGCCGTA CTACTCGGTG AAAGAAGTGG TGCTGCCGTT CAATAAATTC
CCGGGCGTTG ACCCGCTGTT AGGGCCAGAA ATGCGCTCTA CCGGGGAAGT CATGGGCGTG
GGCCGCACCT TCGCTGAAGC GTTTGCCAAA GCGCAGCTGG GCAGCAACTC CACCATGAAG
AAACACGGTC GTGCGCTGCT TTCCGTGCGC GAAGGCGACA AAGAACGCGT GGTGGACCTG
GCGGCAAAAC TGCTGAAACA GGGCTTCGAG CTGGATGCGA CCCACGGCAC GGCGATTGTG
CTGGGCGAAG CGGGTATCAA TCCGCGTCTG GTAAACAAGG TGCATGAGGG CCGTCCGCAC
ATTCAGGACC GCATCAAGAA TGGCGAATAT ACCTACATCA TCAACACCAC CTCAGGCCGT
CGTGCGATTG AAGACTCCCG CGTGATCCGT CGCAGTGCGC TGCAATATAA AGTGCATTAT
GACACCACCC TGAACGGTGG TTTCGCTACC GCGATGGCGC TGAATGCCGA TGCGACTGAA
AAAGTAATTT CGGTGCAGGA AATGCACGCG CAGATCAAAT AA
 
Protein sequence
MPKRTDIKSI LILGAGPIVI GQACEFDYSG AQACKALREE GYRVILVNSN PATIMTDPEM 
ADATYIEPIH WEVVRKIIEK ERPDAVLPTM GGQTALNCAL ELERQGVLEE FGVTMIGATA
DAIDKAEDRR RFDVAMKKIG LETARSGIAH TMEEALAVAA EVGFPCIIRP SFTMGGSGGG
IAYNREEFEE ICARGLDLSP TKELLIDESL IGWKEYEMEV VRDKNDNCII VCSIENFDAM
GIHTGDSITV APAQTLTDKE YQIMRNASMA VLREIGVETG GSNVQFAVNP KNGRLIVIEM
NPRVSRSSAL ASKATGFPIA KVAAKLAVGY TLDELMNDIT GGRTPASFEP SIDYVVTKIP
RFNFEKFAGA NDRLTTQMKS VGEVMAIGRT QQESLQKALR GLEVGATGFD PKVSLDDPEA
LTKIRRELKD AGAERIWYIA DAFRAGLSVD GVFNLTNIDR WFLVQIEELV RLEEKVAEVG
ITGLNADFLR QLKRKGFADA RLAKLAGVRE AEIRKLRDQY DLHPVYKRVD TCAAEFATDT
AYMYSTYEEE CEANPSTDRE KIMVLGGGPN RIGQGIEFDY CCVHASLALR EDGYETIMVN
CNPETVSTDY DTSDRLYFEP VTLEDVLEIV RIEKPKGVIV QYGGQTPLKL ARALEAAGVP
VIGTSPDAID RAEDRERFQH AVERLKLKQP ANATVTTIEM AVEKAKEIGY PLVVRPSYVL
GGRAMEIVYD EADLRRYFQT AVSVSNDAPV LLDHFLDDAV EVDVDAICDG EMVLIGGIME
HIEQAGVHSG DSACSLPAYT LSQEIQDVMR QQVQKLAFEL QVRGLMNVQF AVKNNEVYLI
EVNPRAARTV PFVSKATGVP LAKVAARVMA GKTLAEQGVT KEVIPPYYSV KEVVLPFNKF
PGVDPLLGPE MRSTGEVMGV GRTFAEAFAK AQLGSNSTMK KHGRALLSVR EGDKERVVDL
AAKLLKQGFE LDATHGTAIV LGEAGINPRL VNKVHEGRPH IQDRIKNGEY TYIINTTSGR
RAIEDSRVIR RSALQYKVHY DTTLNGGFAT AMALNADATE KVISVQEMHA QIK