Publications by authors named "Mark Shifman"

Access to deep-seated brain lesions (e.g., tumors, aneurysms, hematomas, and other malformations) is challenging due to the potential for retraction-induced injury.

This paper describes a natural language processing (NLP)-based clinical decision support (CDS) system that is geared towards colon cancer care coordinators as the end users. The system is implemented using a metadata-driven Structured Query Language (SQL) function (discriminant function). For our pilot study, we have developed a training corpus consisting of 2,085 pathology reports from the VA Connecticut Health Care System (VACHS).

We report a significantly enhanced bioinformatics suite and database for proteomics research called the Yale Protein Expression Database (YPED), which is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of high-throughput mass spectrometry-based proteomics research, ranging from a single laboratory, to groups of laboratories within and beyond an institution, to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography-tandem mass spectrometry (LC-MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization.

We present a comprehensive workflow for large-scale (>1000 transitions/run) label-free LC-MRM proteome assays. Innovations include automated MRM transition selection, intelligent retention time scheduling that improves S/N twofold, and automatic peak modeling. Improvements to data analysis include a novel Q/C metric, normalized group area ratio, MLR normalization, weighted regression analysis, and data dissemination through the Yale Protein Expression Database.

The genetic code can be manipulated to reassign codons for the incorporation of non-standard amino acids (NSAA). Deletion of release factor 1 in Escherichia coli enhances translation of UAG (Stop) codons, yet may also extend protein synthesis at natural UAG-terminated messenger RNAs. The fidelity of protein synthesis at reassigned UAG codons and the purity of the NSAA-containing proteins produced require careful examination.

The widespread use of mass spectrometry for protein identification has created a demand for computationally efficient methods of matching mass spectrometry data to protein databases. A search using X!Tandem, a popular and representative program, can require hours or days to complete, particularly when missed cleavages and post-translational modifications are considered. Existing techniques for accelerating X!Tandem by employing parallelism are unsatisfactory for a variety of reasons.
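
To illustrate why allowing missed cleavages enlarges the search, the following is a minimal sketch that enumerates candidate tryptic peptides from one protein sequence. It is not X!Tandem's implementation: the function names are hypothetical, and the cleavage rule (cut after K or R unless the next residue is P) is the commonly used simplification of trypsin specificity, assumed here for illustration.

```python
def cleavage_fragments(sequence: str) -> list[str]:
    """Split a protein sequence at assumed tryptic sites (after K/R, not before P)."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Trailing residues after the last cleavage site form the C-terminal fragment.
        fragments.append(sequence[start:])
    return fragments


def tryptic_peptides(sequence: str, missed_cleavages: int = 0) -> list[str]:
    """List peptides formed by joining up to `missed_cleavages` + 1 adjacent fragments."""
    fragments = cleavage_fragments(sequence)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


if __name__ == "__main__":
    demo = "AAKCCRDDKEE"  # made-up sequence, for illustration only
    for allowed in range(3):
        print(allowed, len(tryptic_peptides(demo, allowed)))
```

Each additional allowed missed cleavage adds more candidate peptides per protein, and every candidate must be scored against the spectra, so the cost compounds further once variable modifications are considered.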

We have developed an integrated web-accessible software system called the Yale Protein Expression Database (YPED) to address the need for storage, retrieval, and integrated analysis of large amounts of data from high throughput proteomic technologies. YPED is an open source system which integrates gel analysis results with protein identifications from DIGE experiments. The system associates the DIGE gel spots and image, analyzed with DeCyder, with mass spectrometric protein identifications from selected gel spots.

The mass defect, that is, the difference between the nominal and actual monoisotopic masses, of a phosphorus in a phosphate group is greater than for most other atoms present in proteins. When the mass defects of tryptic peptides derived from the human proteome are plotted against their masses, phosphopeptides tend to fall off the regression line. By calculating the masses of all potential tryptic peptides from the human proteome, we show that regions of higher phosphorylation probability exist on such a plot.
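
The mass-defect idea can be sketched numerically. The snippet below is an illustrative calculation, not the authors' code: the function names are hypothetical, the sign convention (monoisotopic mass minus nominal mass) is an assumption, and the element masses are standard monoisotopic values rounded to eight decimals.

```python
# Monoisotopic masses (Da) of the most abundant isotopes, rounded.
MONOISOTOPIC = {
    "H": 1.00782503,
    "C": 12.0,
    "N": 14.00307400,
    "O": 15.99491462,
    "P": 30.97376163,
    "S": 31.97207100,
}


def monoisotopic_mass(formula: dict[str, int]) -> float:
    """Sum the monoisotopic masses for an elemental composition such as {"C": 2, "H": 3}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def nominal_mass(formula: dict[str, int]) -> int:
    """Sum the integer (nucleon-number) masses of the same composition."""
    return sum(round(MONOISOTOPIC[element]) * count for element, count in formula.items())


def mass_defect(formula: dict[str, int]) -> float:
    """Mass defect under the assumed convention: monoisotopic minus nominal mass."""
    return monoisotopic_mass(formula) - nominal_mass(formula)


if __name__ == "__main__":
    glycine_residue = {"C": 2, "H": 3, "N": 1, "O": 1}
    phospho_group = {"H": 1, "P": 1, "O": 3}  # net addition on phosphorylation
    print(round(mass_defect(glycine_residue), 4))
    print(round(mass_defect(phospho_group), 4))
```

Under this convention an ordinary residue such as glycine has a small positive defect, while the added HPO3 group contributes a negative one, which is consistent with phosphopeptides falling below the trend of unmodified peptides.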

We have developed the Yale Protein Expression Database (YPED) to address the storage, retrieval, and integrated analysis of proteomics data generated by Yale's Keck Protein Chemistry and Mass Spectrometry Facility. YPED is Web-accessible and currently handles sample requisition, result reporting and sample comparison for ICAT, DIGE and MUDPIT samples. Sample descriptions are compatible with the evolving MIAPE standards.

This report describes XDesc (eXperiment Description), a pilot project that serves as a case study exploring the degree to which an informatics capability developed in a clinical application can be ported for use in the biosciences. In particular, XDesc uses the Entity-Attribute-Value database implementation (including a great deal of metadata-based functionality) developed in TrialDB, a clinical research database, for use in describing the samples used in microarray experiments stored in the Yale Microarray Database (YMD). XDesc was linked successfully to both TrialDB and YMD, and was used to describe the data in three different microarray research projects involving Drosophila.
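
For readers unfamiliar with the Entity-Attribute-Value (EAV) approach, here is a minimal sketch using an in-memory SQLite database. It does not reproduce the TrialDB or XDesc schema: the table names, column names, and sample attributes are all hypothetical, chosen only to show how attribute metadata lets new descriptors be added without altering table structure.

```python
import sqlite3


def build_eav_demo() -> sqlite3.Connection:
    """Create a toy EAV store: one metadata table plus one generic value table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE attribute (
            attr_id  INTEGER PRIMARY KEY,
            name     TEXT UNIQUE NOT NULL,
            datatype TEXT NOT NULL          -- metadata that drives interpretation
        );
        CREATE TABLE eav (
            entity_id INTEGER NOT NULL,     -- e.g. one sample
            attr_id   INTEGER NOT NULL REFERENCES attribute(attr_id),
            value     TEXT NOT NULL         -- stored as text, cast via metadata
        );
        """
    )
    conn.executemany(
        "INSERT INTO attribute (name, datatype) VALUES (?, ?)",
        [("tissue", "text"), ("age_days", "integer")],
    )
    conn.executemany(
        "INSERT INTO eav VALUES (?, (SELECT attr_id FROM attribute WHERE name = ?), ?)",
        [(1, "tissue", "wing disc"), (1, "age_days", "3"), (2, "tissue", "embryo")],
    )
    return conn


def describe(conn: sqlite3.Connection, entity_id: int) -> dict:
    """Return one entity's attributes as a dict, cast according to the metadata."""
    casts = {"text": str, "integer": int}
    rows = conn.execute(
        "SELECT a.name, a.datatype, e.value "
        "FROM eav e JOIN attribute a ON a.attr_id = e.attr_id "
        "WHERE e.entity_id = ? ORDER BY a.name",
        (entity_id,),
    ).fetchall()
    return {name: casts[datatype](value) for name, datatype, value in rows}


if __name__ == "__main__":
    connection = build_eav_demo()
    print(describe(connection, 1))
    print(describe(connection, 2))
```

Note that the second sample simply has no `age_days` row. Sparse and heterogeneous descriptions are handled without NULL-filled columns, which is the usual motivation for an EAV design.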

We devised an algorithm for integrating similar clinical research data collection instruments to create a common measurement instrument. We tested this algorithm using questions from several similar surveys. We encountered differing levels of granularity among questions and responses across surveys, resulting in the loss of either granularity or data.

This case study describes a project that explores issues of quality of service (QoS) relevant to the next-generation Internet (NGI), using the PathMaster application in a testbed environment. PathMaster is a prototype computer system that analyzes digitized cell images from cytology specimens and compares those images against an image database, returning a ranked set of "similar" cell images from the database. To perform NGI testbed evaluations, we used a cluster of nine parallel computation workstations configured as three subclusters using Cisco routers.
