Gene HS_0004 details


Gene Information

Locus tag: HS_0004
Symbol: pepD
ID: 4239511
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Haemophilus somnus 129PT
Kingdom: Bacteria
Replicon accession: NC_008309
Strand:
Start bp: 3990
End bp: 5450
Gene Length: 1461 bp
Protein Length: 486 aa
Translation table: 11
GC content: 38%
IMG OID: 638103534
Product: M20C family Xaa-His dipeptidase
Protein accession: YP_718209
Protein GI: 113460153
COG category: [E] Amino acid transport and metabolism
COG ID: [COG2195] Di- and tripeptidases
TIGRFAM ID: [TIGR01893] aminoacyl-histidine dipeptidase
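
The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the start and end positions are 1-based and inclusive, and that the CDS ends in a single stop codon that is not counted in the protein length. The function names are hypothetical and not part of any database API.

```python
# Values copied from the Gene Information fields of this record.
START_BP = 3990
END_BP = 5450
GENE_LENGTH_BP = 1461
PROTEIN_LENGTH_AA = 486


def span_length(start: int, end: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


def coding_codons(length_bp: int) -> int:
    """Number of amino-acid codons, assuming one terminal stop codon."""
    if length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return length_bp // 3 - 1


# Both comparisons hold for this record under the stated assumptions.
print(span_length(START_BP, END_BP) == GENE_LENGTH_BP)
print(coding_codons(GENE_LENGTH_BP) == PROTEIN_LENGTH_AA)
```

Under those assumptions, 5450 - 3990 + 1 gives 1461 bp, and 1461 / 3 - 1 gives 486 aa, matching the reported lengths.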


Plasmid Coverage information

Num covering plasmid clones: 21
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGTCAGATC TTCAATCTCT GCAACCTAAA CTACTTTGGC AATGGTTTGA TCAAATTTGT 
GCTATTCCAC ATCCTTCTTA CAAAGAAGAG CAGTTAGCAC AATTCATTAT TAATTGGGCA
AAAACAAAAG GTTTTTTTGC GGAACGTGAT GAAGTCGGTA ATGTATTAAT TCGTAAACCG
GCAACAGTAG GAATGGAAAA TCGTAAACCT GTAGTACTAC AAGCACACTT AGATATGGTT
CCACAAGCTA ATGAAGGGAC AAATCATAAT TTTGATCAGG ATCCTATTTT GCCATATATC
GATGGTGATT GGGTTAAGGC TAAAGGTACA ACGTTAGGTG CTGATAACGG TATTGGTATG
GCATCTGCAC TTGCTGTCTT AGAAAGTAAT GATATTGCAC ATCCGGAGTT GGAAGTATTG
TTAACCATGA CTGAAGAAAG GGGGATGGAG GGGGCGATTG GGCTACGTCC AAATTGGCTT
CGTTCTGAGA TCTTAATTAA TACCGATACT GAAGAAAATG GAGAAATTTA TATAGGTTGT
GCCGGCGGTG AAAATGCAGA TCTTGAATTA CCTATTGAGT ATCAAGTCAA TAATTTCGAA
CATTGTTATC AAGTTGTGCT AAAAGGGTTA CGAGGCGGGC ATTCTGGTGT GGATATTCAT
ACCGGACGAG CTAATGCAAT CAAAGTGTTG CTTCGTTTTT TAGCAGAACT TCAACAAAAC
CAACCGCACT TTGACTTCAC TTTAGCAAAT ATCCGTGGCG GTTCTATTCG CAATGCTATT
CCAAGAGAAA GTGTTGCCAC TTTGGTCTTT AATGGTGATA TTACCGTATT ACAAAGTGCG
GTACAAAAAT TTGCAGATGT AATCAAAGCC GAATTGGCAC TAACTGAGCC AAACTTGATA
TTTACACTTG AAAAAGTTGA AAAACCTCAA CAAGTATTTT CCAGTCAATG CACGAAAAAT
ATTATCCATT GTTTGAATGT TTTGCCAAAT GGCGTAGTAC GTAACAGTGA TGTCATTGAA
AATGTGGTAG AAACATCATT AAGCATTGGC GTGTTAAAAA CTGAAGATAA TTTTGTTAGA
AGTACCATGT TAGTGCGGTC ATTAATTGAA AGTGGCAAAT CCTATGTTGC TTCTTTATTA
AAATCTTTAG CCTCATTAGC ACAAGGTAAT ATCAATTTAT CAGGCGATTA TCCGGGTTGG
GAACCACAAA GTCATAGTGA TATTTTGGAC TTAACTAAGA CAATTTATGC ACAAGTTTTA
GGTACAGATC CTGAAATCAA GGTAATTCAT GCGGGACTTG AATGTGGGTT ATTGAAAAAA
ATCTATCCAA CGATCGATAT GGTATCTATC GGACCGACAA TTAGAAATGC ACATTCTCCG
GATGAAAAAG TACATATTCC GGCTGTGGAA ACTTATTGGA AAGTATTAAC CGGTATACTT
GCTCATATTC CATCACGTTA A
 
Protein sequence
MSDLQSLQPK LLWQWFDQIC AIPHPSYKEE QLAQFIINWA KTKGFFAERD EVGNVLIRKP 
ATVGMENRKP VVLQAHLDMV PQANEGTNHN FDQDPILPYI DGDWVKAKGT TLGADNGIGM
ASALAVLESN DIAHPELEVL LTMTEERGME GAIGLRPNWL RSEILINTDT EENGEIYIGC
AGGENADLEL PIEYQVNNFE HCYQVVLKGL RGGHSGVDIH TGRANAIKVL LRFLAELQQN
QPHFDFTLAN IRGGSIRNAI PRESVATLVF NGDITVLQSA VQKFADVIKA ELALTEPNLI
FTLEKVEKPQ QVFSSQCTKN IIHCLNVLPN GVVRNSDVIE NVVETSLSIG VLKTEDNFVR
STMLVRSLIE SGKSYVASLL KSLASLAQGN INLSGDYPGW EPQSHSDILD LTKTIYAQVL
GTDPEIKVIH AGLECGLLKK IYPTIDMVSI GPTIRNAHSP DEKVHIPAVE TYWKVLTGIL
AHIPSR
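
The relationship between the gene sequence and the protein sequence can be reproduced with a short translation routine. This is a minimal sketch, not the method used to generate this record: it uses only the Python standard library, builds the codon table by hand, and relies on the fact that translation table 11 assigns the same amino acids to codons as the standard code (the two differ in permitted start codons, which does not matter here because the gene starts with ATG). The helper names `clean`, `translate`, and `gc_fraction` are hypothetical.

```python
from itertools import product

# Codon table in NCBI order (bases T, C, A, G; third position varies fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used to group residues in blocks of 10."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def gc_fraction(dna: str) -> float:
    """Fraction of G and C bases in the cleaned sequence."""
    dna = clean(dna)
    return (dna.count("G") + dna.count("C")) / len(dna)


# First 30 bases of the gene sequence, as printed in the record.
print(translate("ATGTCAGATC TTCAATCTCT GCAACCTAAA"))
```

Passing the full gene sequence to `translate` should yield the protein sequence listed above once its spaces are removed with `clean`, and `gc_fraction` can be compared against the reported GC content.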