In this Letter, we consider a two-term formalism for constructing protein solubility curves in thermal proteome profiling (TPP) that takes into account the efficiency of the drug-protein binding reaction. An incomplete reaction distorts the otherwise sigmoidal shape of the curve after drug treatment, an effect often observed in experiments. This distortion may be significant enough to disqualify the corresponding protein from the list of drug target candidates, thus negatively affecting the results of TPP data analysis.
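As a rough illustration of why incomplete binding can distort a sigmoidal curve, the sketch below models the soluble fraction as a weighted sum of two sigmoids, one for the unbound and one for the drug-bound protein population. This is an assumed toy form, not the formalism of the Letter: the function names, the parameters `bound_fraction`, `t_m_free`, `t_m_bound`, and `slope`, and the logistic shape itself are all hypothetical.

```python
import math


def soluble_fraction(temp_c, t_m, slope):
    """Logistic melting curve for one protein population (assumed form)."""
    return 1.0 / (1.0 + math.exp((temp_c - t_m) / slope))


def two_term_curve(temp_c, bound_fraction, t_m_free, t_m_bound, slope=2.0):
    """Weighted sum of the unbound and drug-bound melting curves.

    bound_fraction is a hypothetical stand-in for the efficiency of the
    drug-protein binding reaction: 1.0 means complete binding, 0.0 means none.
    """
    free_term = (1.0 - bound_fraction) * soluble_fraction(temp_c, t_m_free, slope)
    bound_term = bound_fraction * soluble_fraction(temp_c, t_m_bound, slope)
    return free_term + bound_term


if __name__ == "__main__":
    # Illustrative values only: melting points of 50 and 60 degrees Celsius.
    for temp in range(40, 71, 5):
        complete = two_term_curve(temp, 1.0, 50.0, 60.0)
        partial = two_term_curve(temp, 0.5, 50.0, 60.0)
        print(f"{temp} C  complete={complete:.3f}  partial={partial:.3f}")
```

With `bound_fraction` equal to 0 or 1 the curve reduces to a single sigmoid; intermediate values give a two-step shape that a single-sigmoid fit would describe poorly.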
Data-independent acquisition (DIA) with shortened data acquisition times is becoming a method of choice for quantitative proteomic applications requiring high-throughput analysis of large cohorts of samples. With the advent of the combination of high-resolution mass spectrometry with an asymmetric track lossless analyzer, these DIA capabilities were further extended, with the recent demonstration of quantitative analyses at a speed of up to hundreds of samples per day. In particular, the proteomic data for the brain samples related to multiple system atrophy disease were acquired using 7 and 28 min chromatography gradients (Guzman et al.
Advances in high-throughput high-resolution mass spectrometry and the development of the thermal proteome profiling (TPP) approach have made it possible to accelerate the search for drug targets. Since its introduction in 2014, TPP has quickly become a method of choice in chemical proteomics for identifying drug-to-protein interactions on a proteome-wide scale and mapping the pathways of these interactions, thus further elucidating the unknown mechanisms of action of a drug under study. However, the current TPP implementations based on tandem mass spectrometry (MS/MS), which rely on lengthy peptide separation protocols and expensive labeling techniques for sample multiplexing, limit the scaling of this approach for the ever-growing variety of drug-to-proteomes.
One of the key steps in data-dependent acquisition (DDA) proteomics is the detection of peptide isotopic clusters, also called "features", in MS1 spectra and matching them to MS/MS-based peptide identifications. A number of peptide feature detection tools have become available in recent years, each relying on its own matching algorithm. Here, we provide an integrated solution, the intensity-based Quantitative Mix and Match Approach (IQMMA), which integrates a number of untargeted peptide feature detection algorithms and returns the most probable intensity values for the MS/MS-based identifications.
The coefficient of variation (CV) is often used in proteomics as a proxy to characterize the performance of a quantitation method and/or the related software. In this note, we question the excessive reliance on this metric in quantitative proteomics that may result in erroneous conclusions. We support this note using a ground-truth Human-Yeast-E.
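For reference, the CV is the sample standard deviation divided by the mean, so it measures the spread of replicate measurements and says nothing about whether their mean is correct. The sketch below computes it with the Python standard library; the function name and the replicate intensities are made up for illustration and are not taken from the note.

```python
import statistics


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


if __name__ == "__main__":
    # Hypothetical replicate intensities for one protein from two methods.
    # Method B is less scattered (lower CV), but the CV alone cannot reveal
    # whether its values are closer to the true abundance.
    method_a = [98.0, 105.0, 101.0, 96.0]
    method_b = [60.0, 61.0, 60.5, 59.5]
    print(f"CV, method A: {coefficient_of_variation(method_a):.3f}")
    print(f"CV, method B: {coefficient_of_variation(method_b):.3f}")
```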
Current proteomics approaches rely almost exclusively on using the positive ionization mode, resulting in inefficient ionization of many acidic peptides. This study investigates protein identification efficiency in the negative ionization mode using the DirectMS1 method. DirectMS1 is an ultrafast data acquisition method based on accurate peptide mass measurements and predicted retention times.
The proteogenomic search pipeline developed in this work has been applied for reanalysis of 40 publicly available shotgun proteomic datasets from various human tissues comprising more than 8000 individual LC-MS/MS runs, of which 5442 .raw data files were processed in total. This reanalysis was focused on searching for ADAR-mediated RNA editing events, their clustering across samples of different origins, and classification.
Protein quantitation in tissue cells or physiological fluids based on liquid chromatography/mass spectrometry is one of the key sources of information on the mechanisms of cell functioning during chemotherapeutic treatment. Information on significant changes in protein expression upon treatment can be obtained by chemical proteomics and requires analysis of the cellular proteomes, as well as development of experimental and bioinformatic methods for identification of the drug targets. Low throughput of whole proteome analysis based on liquid chromatography and tandem mass spectrometry is one of the main factors limiting the scale of these studies.
Recently, we presented the DirectMS1 method of ultrafast proteome-wide analysis based on minute-long LC gradients and MS1-only mass spectra acquisition. Currently, the method covers the human cell proteome to a depth of 2500 proteins at a 1% false discovery rate (FDR) when using 5 min LC gradients and a total runtime of 7.3 min.
Int J Mol Sci
May 2022
Cancer cell lines responded differentially to type I interferon treatment in models of oncolytic therapy using vesicular stomatitis virus (VSV). Two opposite cases were considered in this study, glioblastoma DBTRG-05MG and osteosarcoma HOS cell lines exhibiting resistance and sensitivity to VSV after the treatment, respectively. Type I interferon responses were compared for these cell lines by integrative analysis of the transcriptome, proteome, and RNA editome to identify molecular factors determining differential effects observed.
Spectrum clustering is a powerful strategy to minimize redundant mass spectra by grouping them based on similarity, with the aim of forming groups of mass spectra from the same repeatedly measured analytes. Each such group of near-identical spectra can be represented by its so-called consensus spectrum for downstream processing. Although several algorithms for spectrum clustering have been adequately benchmarked and tested, the influence of the consensus spectrum generation step is rarely evaluated.
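To make the notion of a consensus spectrum concrete, the sketch below shows one simple way to build it: peaks from all spectra in a cluster are grouped into fixed-width m/z bins, and each bin reports an intensity-weighted m/z and the intensity averaged over the cluster. This is only one of several possible strategies and is not claimed to be any of the methods evaluated in the study; the function name, the `bin_width` default, and the peak-list representation are assumptions.

```python
from collections import defaultdict


def consensus_spectrum(spectra, bin_width=0.02):
    """Build a binned-average consensus from a cluster of spectra.

    spectra: list of peak lists, each peak being an (mz, intensity) tuple.
    Returns a list of (mz, intensity) tuples sorted by m/z.
    """
    if not spectra:
        return []
    # bin index -> [sum of mz * intensity, sum of intensity]
    bins = defaultdict(lambda: [0.0, 0.0])
    for peaks in spectra:
        for mz, intensity in peaks:
            index = round(mz / bin_width)
            bins[index][0] += mz * intensity
            bins[index][1] += intensity
    n_spectra = len(spectra)
    consensus = []
    for _, (weighted_mz, total) in sorted(bins.items()):
        if total > 0:
            # Intensity-weighted m/z; intensity averaged over all spectra.
            consensus.append((weighted_mz / total, total / n_spectra))
    return consensus


if __name__ == "__main__":
    # Two hypothetical near-identical spectra from the same analyte.
    cluster = [
        [(100.0, 10.0), (200.0, 20.0)],
        [(100.0, 30.0), (200.0, 20.0)],
    ]
    for mz, intensity in consensus_spectrum(cluster):
        print(f"{mz:.4f}\t{intensity:.1f}")
```

Fixed-width binning can split a peak that falls on a bin boundary, which is one reason the choice of consensus method may affect downstream results.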
Mass spectrometry-based proteome analysis implies matching the mass spectra of proteolytic peptides to amino acid sequences predicted from genomic sequences. Reliability of peptide variant identification in proteogenomic studies is often lacking. We propose a way to interpret shotgun proteomics results, specifically in the data-dependent acquisition mode, as protein sequence coverage by multiple reads as it is done in nucleic acid sequencing for calling of single nucleotide variants.
Protein inference is one of the crucial steps in proteome characterization using a bottom-up approach. Multiple algorithms to solve the problem are focused on extensive analysis of shared peptides identified from fragmentation mass spectra (MS/MS). However, many protein homologues with a similar amino acid sequence typically have identical lists of identified peptides due to the problem of proteome undersampling in a bottom-up approach and, thus, cannot be distinguished by existing protein inference methods.
Proteome-wide analyses rely on tandem mass spectrometry and the extensive separation of proteolytic mixtures. This imposes considerable instrumental time consumption, which is one of the main obstacles in the broader acceptance of proteomics in biomedical and clinical research. Recently, we presented a fast proteomic method termed DirectMS1 based on ultrashort LC gradients as well as MS1-only mass spectra acquisition and data processing.
Rationale: One of the important steps in initial data processing of peptide mass spectra is the detection of peptide features in full-range mass spectra. Ion mobility offers advantages over previous methods performing this detection by providing an additional structure-specific separation dimension. However, there is a lack of open-source software that utilizes these advantages and detects peptide features in mass spectra acquired along with ion mobility data using new instruments such as timsTOF and/or FAIMS-Orbitrap.
Adenosine-to-inosine RNA editing is an enzymatic post-transcriptional modification which modulates immunity and neural transmission in multicellular organisms. In particular, it involves editing of mRNA codons with the resulting amino acid substitutions. We identified such sites at the protein level for developmental proteomes of Drosophila melanogaster using available data for 15 stages of fruit fly development from egg to imago and 14 time points of embryogenesis.
Proteome characterization relies heavily on tandem mass spectrometry (MS/MS) and is thus associated with instrumentation complexity, lengthy analysis time, and limited duty cycle. It was always tempting to implement approaches that do not require MS/MS, yet they were constantly failing to achieve a meaningful depth of quantitative proteome coverage within short experimental times, which is particularly important for clinical or biomarker-discovery applications. Here, we report on the first successful attempt to develop a truly MS/MS-free method, DirectMS1, for bottom-up proteomics.
Proteogenomics is based on the use of customized genome or RNA sequencing databases for interrogation of shotgun proteomics data in search for proteome-level evidence of genome variations or RNA editing. In this work, the products of adenosine-to-inosine RNA editing in human and murine brain proteomes are identified using publicly available brain proteome LC-MS/MS datasets and an RNA editome database compiled from several sources. After filtering of false-positive results, 20 and 37 sites of editing in proteins belonging to 14 and 32 genes are identified for murine and human brain proteomes, respectively.
Many of the novel ideas that drive today's proteomic technologies are focused essentially on experimental or data-processing workflows. The latter are implemented and published in a number of ways, from custom scripts and programs, to projects built using general-purpose or specialized workflow engines; a large part of routine data processing is performed manually or with custom scripts that remain unpublished. Facilitating the development of reproducible data-processing workflows becomes essential for increasing the efficiency of proteomic research.
Shotgun proteomics workflows for database protein identification typically include a combination of search engines and postsearch validation software based mostly on machine learning algorithms. Here, we present a new postsearch validation tool called Scavager, which employs CatBoost, an open-source gradient boosting library, and shows improved efficiency compared with other popular algorithms, such as Percolator, PeptideProphet, and Q-ranker. The comparison is done using multiple data sets and search engines, including MSGF+, MSFragger, X!Tandem, Comet, and the recently introduced IdentiPy.
The efficiency of proteome analysis depends strongly on the configuration parameters of the search engine. One of the murkiest and least trivial among them is the list of amino acid modifications included in the search. Here, an approach called AA_stat is presented for uncovering the unexpected modifications of amino acid residues in the protein sequences, as well as possible artifacts of data acquisition or processing, in the results of proteome analyses.
Adenosine-to-inosine RNA editing is one of the most common types of RNA editing, a posttranscriptional modification made by special enzymes. We present a proteomic study on this phenomenon for Drosophila melanogaster. Three proteome data sets were used in the study: two taken from a public repository and a third obtained here.
We present an open-source, extensible search engine for shotgun proteomics. Implemented in the Python programming language, IdentiPy shows competitive processing speed and sensitivity compared with the state-of-the-art search engines. It is equipped with a user-friendly web interface, IdentiPy Server, enabling the use of a single server installation accessed from multiple workstations.
The identification of genetically encoded variants at the proteome level is an important problem in cancer proteogenomics. The generation of customized protein databases from DNA or RNA sequencing data is a crucial stage of the identification workflow. Genomic data filtering applied at this stage may significantly modify variant search results, yet its effect is generally left out of the scope of proteogenomic studies.
The acquisition of increased sensitivity to viruses by cancer cells is a common outcome of malignant progression that justifies the development of oncolytic viruses as anticancer therapeutics. Studying molecular changes that underlie the sensitivity to viruses would help to identify cases where oncolytic virus therapy would be most effective. We quantified changes in protein abundances in two glioblastoma multiforme (GBM) cell lines that differ in the ability to induce resistance to vesicular stomatitis virus (VSV) infection in response to type I interferon (IFN) treatment.