We are interested in developing and applying new technologies in the fields of mass spectrometry and proteomics. The impressive amount of data generated by the genomics revolution is being organized and made accessible in a variety of databases and libraries. These include genomic and expressed sequence tag databases, transcriptome maps, and protein databases that describe the identity of some of the proteins expressed by a tissue or cell, as well as other relevant properties, including their structure, function and macromolecular interactions. Many of these databases describe, in a static manner, the situation encountered at the time of the measurements. However, many biological processes are dynamic responses to external perturbations, be they environmental, pharmacological, pathological, genetic or otherwise. The ability to accurately detect and quantify all of the changes induced by a specific perturbation is therefore an essential part of the study of dynamic biological processes.

At the heart of all of our lab's work is protein sequencing by mass spectrometry. Greatly simplified, a tandem mass spectrometer can "sequence" a peptide ion by first measuring the mass of the peptide, then selectively isolating that peptide and gently fragmenting it at peptide bonds, and finally measuring the masses of the fragment ions. The resulting tandem mass spectrum contains the sequence information for a single peptide. The power of the technique becomes clear when one compares traditional peptide sequencing by Edman degradation with peptide sequencing by mass spectrometry. A decapeptide can be sequenced by Edman degradation in about 12 hours. That same peptide can be sequenced by a tandem mass spectrometer in about 1 second at 10 to 100 times the sensitivity.
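As a rough illustration of the fragment-ion measurement described above, the sketch below computes theoretical singly charged b- and y-ion m/z values for an unmodified peptide. The function name `fragment_ions` is hypothetical, the monoisotopic masses are standard values rounded to five decimals, and none of this is taken from the lab's own software.

```python
# Monoisotopic residue masses in daltons (standard values, rounded).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O
PROTON = 1.007276   # mass of the charge-carrying proton


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    Hypothetical helper: assumes an unmodified peptide and 1+ fragments only.
    b_ions runs b1..b(n-1); y_ions runs y(n-1)..y1, so that b_ions[i] and
    y_ions[i] come from cleavage of the same peptide bond.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        # b ion: N-terminal residues plus one proton.
        b_ions.append(sum(masses[:i]) + PROTON)
        # y ion: C-terminal residues plus water plus one proton.
        y_ions.append(sum(masses[i:]) + WATER + PROTON)
    return b_ions, y_ions
```

Calling `fragment_ions("GAS")` returns two b ions and two y ions, one pair per peptide bond. Reading the mass differences between consecutive ions in either series gives the residue masses, which is how a spectrum encodes the sequence.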
"Search algorithms like Sequest or Mascot often successfully identify the proper peptide sequence, but fail to provide information about the presence or absence of site-determining ions. As a result, users must manually inspect each spectrum to confirm proper site localization. Here, we present a probability-based score, named the Ascore, which measures the probability of correct phosphorylation site localization based on the presence and intensity of site-determining ions in MS/MS spectra"
Evin (Evolution Index) calculates a residue-specific conservation score for peptides with post-translational modifications.
"Motif-x (short for motif extractor) is a software tool designed to extract over-represented patterns from any sequence data set. The algorithm is an iterative strategy which builds successive motifs through comparison to a dynamic statistical background. "
"The proper way of reversing your database. PHP script that takes a fasta formated file as an argument and creates a reverse database out of it."
"scan-x is a software tool designed to find motifs (identified using motif-x) within any sequence data set. The first large scale scan was performed using all available human, mouse, fly and yeast phosphorylation and acetylation data to perform a scan for undiscovered sites. "