Publications by authors named "Omenn G"

Background: Patient notes contain substantial information but are difficult for computers to analyse due to their unstructured format. Large language models (LLMs), such as Generative Pre-trained Transformer 4 (GPT-4), have changed our ability to process text, but we do not know how effectively they handle medical notes. We aimed to assess the ability of GPT-4 to answer predefined questions after reading medical notes in three different languages.

The human body contains trillions of cells, classified into specific cell types, with diverse morphologies and functions. In addition, cells of the same type can assume different states within an individual's body during their lifetime. Understanding the complexities of the proteome in the context of a human organism and its many potential states is necessary for understanding human biology, but these complexities can neither be predicted from the genome, nor have they been systematically measurable with available technologies.

Article Synopsis
  • The Human Proteome Project (HPP) aims to identify every protein-coding gene’s isoform and integrate proteomics into studies of human health and disease.
  • Major updates include the retirement of neXtProt as the knowledge base, with UniProtKB now serving as the reference proteome, and GENCODE providing the target protein list.
  • Recent data show protein-level expression evidence for 93% of protein-coding genes, leaving 1,273 proteins without such evidence; the update also introduces a new scoring system for the functional annotation of proteins.

Over the past few years, artificial intelligence (AI) has emerged as a transformative force in drug discovery and development (DDD), revolutionizing many aspects of the process. This survey provides a comprehensive review of recent advancements in AI applications within early drug discovery and post-market drug assessment. It addresses the identification and prioritization of new therapeutic targets, prediction of drug-target interaction (DTI), design of novel drug-like molecules, and assessment of the clinical efficacy of new medications.

Recent improvements in proteomics technologies have fundamentally altered our capacities to characterize human biology. There is an ever-growing interest in using these novel methods for studying the circulating proteome, as blood offers an accessible window into human health. However, every methodological innovation and analytical progress calls for reassessing our existing approaches and routines to ensure that the new data will add value to the greater biomedical research community and avoid previous errors.

Despite implementing hundreds of strategies, cancer drug development suffers from a 95% failure rate over 30 years, with only 30% of approved cancer drugs extending patient survival beyond 2.5 months. Adding more criteria without eliminating nonessential ones is impractical and may fall into the "survivorship bias" trap.

The quick Sequential Organ Failure Assessment (qSOFA) system identifies an individual's risk of progressing to poor sepsis-related outcomes using minimal variables. We used Support Vector Machine, Learning Using Concave and Convex Kernels, and Random Forest to predict an increase in qSOFA score using electronic health record (EHR) data, electrocardiograms (ECG), and arterial line signals. We structured the physiological signal data in a tensor format and used Canonical Polyadic/Parallel Factors (CP) decomposition for feature reduction.

Non-clear cell renal cell carcinomas (non-ccRCCs) encompass diverse malignant and benign tumors. Refinement of differential diagnosis biomarkers, markers for early prognosis of aggressive disease, and therapeutic targets to complement immunotherapy are current clinical needs. Multi-omics analyses of 48 non-ccRCCs compared with 103 ccRCCs reveal proteogenomic, phosphorylation, glycosylation, and metabolic aberrations in RCC subtypes.

Few studies examining the patient outcomes of concurrent neurological manifestations during acute COVID-19 leveraged multinational cohorts of adults and children or distinguished between central and peripheral nervous system (CNS vs. PNS) involvement. Using a federated multinational network in which local clinicians and informatics experts curated the electronic health records data, we evaluated the risk of prolonged hospitalization and mortality in hospitalized COVID-19 patients from 21 healthcare systems across 7 countries.

Article Synopsis
  • Despite notable advancements in cancer immunotherapy, only a small percentage of patients (less than 20%) show lasting responses to immune checkpoint blockade, leading researchers to consider combination therapies that target multiple immune evasion strategies.
  • Researchers analyzed data from over 1,000 tumors across ten cancers to identify seven distinct immune subtypes, examining their unique genomic, epigenetic, transcriptomic, and proteomic characteristics.
  • By investigating kinase activities linked to these immune subtypes, the study uncovered potential therapeutic targets that could improve future immunotherapy approaches and precision medicine.

Background: Omics characterization of pancreatic adenocarcinoma tissue is complicated by the highly heterogeneous and mixed populations of cells. We evaluate the feasibility and potential benefit of using a coring method to enrich specific regions from bulk tissue and then perform proteogenomic analyses.

Methods: We used the Biopsy Trifecta Extraction (BioTExt) technique to isolate cores of epithelial-enriched and stroma-enriched tissue from pancreatic tumor and adjacent tissue blocks.

Article Synopsis
  • The Human Proteome Project (HPP), launched in 2010 by the Human Proteome Organization (HUPO), aims to identify all human proteins and integrate proteomics into studies of health and disease.
  • As of April 2023, 93% of predicted proteins from the human genome have been detected, demonstrating significant advancements in the creation of a comprehensive protein parts list.
  • The project is now transitioning to a Grand Challenge Project that focuses on understanding the functions of these proteins and their roles within biological networks and pathways.

Electronic health records (EHRs) contain a wealth of information that can be used to further precision health. One particular data element in EHRs that is not only under-utilized but oftentimes unaccounted for is missing data. However, missingness can provide valuable information about comorbidities and best practices for monitoring patients, which could save lives and reduce burden on the healthcare system.

Protein structure prediction with neural networks is a powerful new method for linking protein sequence, structure, and function, but structures have generally been predicted for only a single isoform of each gene, neglecting splice variants. To investigate the structural implications of alternative splicing, we used AlphaFold2 to predict the structures of more than 11,000 human isoforms. We employed multiple metrics to identify splicing-induced structural alterations, including template matching score, secondary structure composition, surface charge distribution, radius of gyration, accessibility of post-translational modification sites, and structure-based function prediction.

Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal cancer types, partly because it is frequently identified at an advanced stage, when surgery is no longer feasible. Therefore, early detection using minimally invasive methods such as blood tests may improve outcomes. However, studies to discover molecular signatures for the early detection of PDAC using blood tests have only been marginally successful.

Background: Multisystem inflammatory syndrome in children (MIS-C) is a severe complication of SARS-CoV-2 infection. It remains unclear how MIS-C phenotypes vary across SARS-CoV-2 variants. We aimed to investigate clinical characteristics and outcomes of MIS-C across SARS-CoV-2 eras.

Background: Characterizing Post-Acute Sequelae of COVID (SARS-CoV-2 Infection), or PASC, has been challenging due to the multitude of sub-phenotypes, temporal attributes, and definitions. Scalable characterization of PASC sub-phenotypes can enhance screening capacities, disease management, and treatment planning.

Methods: We conducted a retrospective multi-centre observational cohort study, leveraging longitudinal electronic health record (EHR) data of 30,422 patients from three healthcare systems in the Consortium for the Clinical Characterization of COVID-19 by EHR (4CE).

We introduce a pioneering approach that integrates pathology imaging with transcriptomics and proteomics to identify predictive histology features associated with critical clinical outcomes in cancer. We utilize 2,755 H&E-stained histopathological slides from 657 patients across 6 cancer types from CPTAC. Our models effectively recapitulate distinctions readily made by human pathologists: tumor vs.

We characterized a prospective endometrial carcinoma (EC) cohort containing 138 tumors and 20 enriched normal tissues using 10 different omics platforms. Targeted quantitation of two peptides can predict antigen processing and presentation machinery activity, and may inform patient selection for immunotherapy. Association analysis between MYC activity and metformin treatment in both patients and cell lines suggests a potential role for metformin treatment in non-diabetic patients with elevated MYC activity.

Physical and psychological symptoms lasting months following an acute COVID-19 infection are now recognized as post-acute sequelae of COVID-19 (PASC). Accurate tools for identifying such patients could enhance screening capabilities for the recruitment for clinical trials, improve the reliability of disease estimates, and allow for more accurate downstream cohort analysis. In this retrospective cohort study, we analyzed the EHR of hospitalized COVID-19 patients across three healthcare systems to develop a pipeline for better identifying patients with persistent PASC symptoms (dyspnea, fatigue, or joint pain) after their SARS-CoV-2 infection.

Background: In electronic health records, patterns of missing laboratory test results could capture patients' course of disease as well as reflect clinicians' concerns or worries about possible conditions. These patterns are often understudied and overlooked. This study aims to identify informative patterns of missingness among laboratory data collected across 15 healthcare system sites in three countries for COVID-19 inpatients.

De novo protein design generally consists of two steps: structure design and sequence design. Many protein design studies have focused on sequence design with scaffolds adapted from native structures in the PDB, which leaves novel areas of protein structure and function space unexplored. We developed FoldDesign to create novel protein folds from specific secondary structure (SS) assignments through sequence-independent replica-exchange Monte Carlo (REMC) simulations.
