While gas chromatography mass spectrometry (GC-MS) has long been used to identify compounds in complex mixtures, this process is often subjective and time-consuming and leaves a large fraction of seemingly good-quality spectra unidentified. In this work, we describe a set of new mass spectral library-based methods to assist compound identification in complex mixtures. These methods employ mass spectral uniqueness and compound ubiquity of library entries alongside noise reduction and automated comparison of retention indices to library compounds.
This work presents a detailed determination of site-specific N-glycan distributions of the recombinant influenza glycoproteins hemagglutinin (HA) and neuraminidase. Variation in glycosylation among recombinant glycoproteins is not predictable and can depend on details of the biomanufacturing process as well as details of protein structure. In this study, recombinant influenza proteins were analyzed from eight strains of four different suppliers.
We report the comparison of mass-spectral-based abundances of tryptic glycopeptides to fluorescence abundances of released labeled glycans and the effects of mass and charge state and in-source fragmentation on glycopeptide abundances. The primary glycoforms derived from Rituximab, NISTmAb, Evolocumab, and Infliximab were high-mannose and biantennary complex galactosylated and fucosylated N-glycans. Except for Evolocumab, in-source ions derived from the loss of HexNAc or HexNAc-Hex sugars are prominent for other therapeutic IgGs.
Glycopeptide Abundance Distribution Spectra (GADS) were recently introduced as a means of representing, storing, and comparing glycan profiles of intact glycopeptides. Here, using that representation, an extensive analysis is made of multiple commercial sources of the recombinant SARS-CoV-2 spike protein, each containing 22 N-linked glycan sites (sequons). Multiple proteases are used along with variable energy fragmentation followed by ion trap confirmation.
We present a mass spectral library-based method for analyzing site-specific N-linked protein glycosylation. Its operation and utility are illustrated by applying it to both newly measured and available proteomics data of human milk glycoproteins. It generates two varieties of mass spectral libraries.
A method for representing and comparing distributions of N-linked glycans located at specific sites on proteins is presented. The representation takes the form of a simple mass spectrum for a given peptide sequence, with each peak corresponding to a different glycopeptide. The mass (in place of m/z) of each peak is that of the glycan mass, and its abundance corresponds to its relative abundance in the electrospray MS spectrum.
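The representation described above can be sketched as a small data structure: one "spectrum" per peptide sequence, with glycan mass as the peak position and relative abundance as the peak height. The sketch below is only an illustration under stated assumptions: the function names `build_gads` and `compare_gads`, the normalization to a sum of 1, and the use of cosine similarity for comparison are hypothetical choices, not taken from the abstract, and the masses in the usage example are illustrative values rather than measurements.

```python
import math


def build_gads(glycan_abundances, decimals=2):
    """Build a glycan abundance distribution for one peptide sequence.

    glycan_abundances maps glycan mass -> raw abundance. Masses are rounded
    so that peaks from different runs can be aligned by key (an assumption),
    and abundances are scaled so that they sum to 1.
    """
    total = sum(glycan_abundances.values())
    if total <= 0:
        raise ValueError("total abundance must be positive")
    spectrum = {}
    for mass, abundance in glycan_abundances.items():
        key = round(mass, decimals)
        spectrum[key] = spectrum.get(key, 0.0) + abundance / total
    return spectrum


def compare_gads(a, b):
    """Cosine similarity between two distributions (hypothetical metric)."""
    masses = set(a) | set(b)
    dot = sum(a.get(m, 0.0) * b.get(m, 0.0) for m in masses)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative values only: two runs with the same relative profile.
run_1 = build_gads({1216.42: 50.0, 1378.48: 150.0})
run_2 = build_gads({1216.42: 5.0, 1378.48: 15.0})
similarity = compare_gads(run_1, run_2)
```

Because each distribution is normalized before comparison, two runs with the same relative glycan profile score close to 1 regardless of absolute signal.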
Annotating product ion peaks in tandem mass spectra is essential for evaluating spectral quality and validating peptide identification. This task is more complex for glycopeptides and is crucial for the confident determination of glycosylation sites in glycoproteins. MS_Piano (Mass Spectrum Peptide Annotation) software was developed for reliable annotation of peaks in collision-induced dissociation (CID) tandem mass spectra of peptides or N-glycopeptides for given peptide sequences, charge states, and optional modifications.
Recent reports have demonstrated that genetically variant peptides (GVPs) derived from human hair shaft proteins can be used to differentiate individuals of different biogeographic origins. We report a method involving direct extraction of hair shaft proteins that is more sensitive for GVP detection than previously published methods. It involves one step for protein extraction and was found to provide reproducible results.
Metabolomics has a critical need for better tools for mass spectral identification. Common metabolites may be identified by searching libraries of tandem mass spectra, which offers important advantages over other approaches to identification. But tandem libraries are not nearly complete enough to represent the full molecular diversity present in complex biological samples.
Spectral library searching (SLS) is an attractive alternative to sequence database searching (SDS) for peptide identification due to its speed, sensitivity, and ability to include any selected mass spectra. While decoy methods for SLS have been developed for low mass accuracy peptide spectral libraries, it is not clear that they are optimal or directly applicable to high mass accuracy spectra. Therefore, we report the development and validation of methods for high mass accuracy decoy libraries.
A mass spectral library search algorithm that identifies compounds that differ from library compounds by a single "inert" structural component is described. This algorithm, the Hybrid Similarity Search, generates a similarity score based on matching both fragment ions and neutral losses. It employs the parameter DeltaMass, defined as the mass difference between query and library compounds, to shift neutral loss peaks in the library spectrum to match corresponding neutral loss peaks in the query spectrum.
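The idea of matching both fragment ions and DeltaMass-shifted neutral-loss peaks can be illustrated with a minimal sketch. This is not the published scoring function: the name `hybrid_score`, the fixed m/z tolerance, the greedy peak assignment, and the plain cosine-style normalization are all assumptions made for illustration, and singly charged ions are assumed so that a mass shift equals an m/z shift.

```python
import math


def hybrid_score(query, library, delta_mass, tol=0.01):
    """Cosine-style score where each library peak may match a query peak
    either at its own m/z (fragment-ion match) or at m/z + delta_mass
    (neutral-loss match). Peaks are (mz, intensity) tuples.

    delta_mass = query compound mass - library compound mass.
    """
    used = set()  # indices of query peaks already assigned
    dot = 0.0
    for lib_mz, lib_int in library:
        best = None
        for candidate_mz in (lib_mz, lib_mz + delta_mass):
            for i, (q_mz, q_int) in enumerate(query):
                if i in used or abs(q_mz - candidate_mz) > tol:
                    continue
                # Keep the most intense unassigned query peak in tolerance.
                if best is None or q_int > query[best][1]:
                    best = i
        if best is not None:
            used.add(best)
            dot += lib_int * query[best][1]
    norm_q = math.sqrt(sum(i * i for _, i in query))
    norm_l = math.sqrt(sum(i * i for _, i in library))
    if norm_q == 0 or norm_l == 0:
        return 0.0
    return dot / (norm_q * norm_l)


# Hypothetical spectra: the query differs from the library compound by 14 Da,
# and one fragment retains the modification (80 -> 94).
library_peaks = [(50.0, 100.0), (80.0, 50.0)]
query_peaks = [(50.0, 100.0), (94.0, 50.0)]
plain = hybrid_score(query_peaks, library_peaks, delta_mass=0.0)
hybrid = hybrid_score(query_peaks, library_peaks, delta_mass=14.0)
```

In this toy case the shifted peak is recovered only when `delta_mass` is supplied, so `hybrid` is higher than `plain`.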
We present a mass spectral library-based method to identify tandem mass spectra of peptides that contain unanticipated modifications and amino acid variants. We describe this as a "hybrid" method because it combines matching both ion m/z and mass losses. The mass loss is the difference between the mass of an ion peak and the mass of its precursor.
J Am Soc Mass Spectrom
April 2017
A method to discover and correct errors in mass spectral libraries is described. Comparing entries for compounds that have the same chemical structure across a set of highly curated reference libraries quickly identifies entries that are outliers. In cases where three or more entries for the same compound are compared, the outlier as determined by visual inspection was almost always found to contain the error.
Derivatization of peptides with isobaric tags such as iTRAQ and TMT is widely employed in proteomics due to their compatibility with multiplex quantitative measurements. We recently made publicly available a large peptide library derived from iTRAQ 4-plex labeled spectra. This resource has not been used for identifying peptides labeled with related tags with different masses, because values for virtually all masses of precursor and most product ions would differ for ions containing the different tags as well as containing different tag-specific peaks.
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics data sets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e.
This paper documents the design, layout and algorithms of the IUPAC International Chemical Identifier, InChI.
Anal Chem
December 2013
Recent progress in metabolomics and the development of increasingly sensitive analytical techniques have renewed interest in global profiling, i.e., semiquantitative monitoring of all chemical constituents of biological fluids.
Since its public introduction in 2005, the IUPAC InChI chemical structure identifier standard has become the international standard for defined chemical structures. This article describes the extensive use and dissemination of the InChI and InChIKey structure representations by and for the worldwide chemistry community, the chemical information community, and major publishers and disseminators of chemical and related scientific offerings in manuscripts and databases.
InChIKey is a 27-character compacted (hashed) version of InChI which is intended for Internet and database searching/indexing and is based on an SHA-256 hash of the InChI character string. The first block of InChIKey encodes molecular skeleton while the second block represents various kinds of isomerism (stereo, tautomeric, etc.).
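The block structure mentioned above can be made concrete with a short sketch that splits an InChIKey into its hyphen-separated parts. The 14-10-1 layout of uppercase letters used here comes from the general InChI documentation rather than from the abstract itself, and the function name `split_inchikey` and the dictionary keys are hypothetical. The sketch only checks the format; it does not compute the SHA-256-based hash or verify that a key corresponds to a real structure.

```python
import re

# Assumed layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter (27 chars).
_INCHIKEY_PATTERN = re.compile(r"^([A-Z]{14})-([A-Z]{10})-([A-Z])$")


def split_inchikey(key):
    """Split an InChIKey into its three hyphen-separated blocks.

    Raises ValueError if the string does not follow the assumed layout.
    """
    match = _INCHIKEY_PATTERN.match(key)
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    return {
        "first_block": match.group(1),   # molecular skeleton (connectivity)
        "second_block": match.group(2),  # isomerism layers plus flag characters
        "last_char": match.group(3),     # final single-character block
    }


# Ethanol's standard InChIKey, used as an example input.
blocks = split_inchikey("LFQSCWFLJHTTHZ-UHFFFAOYSA-N")
```

Two keys that share the first block but differ in the second would, under this layout, describe the same skeleton with different isomerism layers.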
Standard Reference Materials (SRMs) offer the scientific community a stable and homogeneous source of material that holds countless application possibilities. Traditionally, the National Institute of Standards and Technology (NIST) has provided SRMs with associated quantitative information (certified values) for a select group of targeted analytes as measured in a solution or complex matrix. While the current needs of the SRM community are expanding to include non-quantitative data, NIST is attempting to broaden the scope of how and what information is offered to the SRM community by providing qualitative information about biomaterials, such as chromatographic fingerprints and profiles of untargeted identifications.
A major unmet need in LC-MS/MS-based proteomics analyses is a set of tools for quantitative assessment of system performance and evaluation of technical variability. Here we describe 46 system performance metrics for monitoring chromatographic performance, electrospray source stability, MS1 and MS2 signals, dynamic sampling of ions for MS/MS, and peptide identification. Applied to data sets from replicate LC-MS/MS analyses, these metrics displayed consistent, reasonable responses to controlled perturbations.