Publications by authors named "Ben O Petrazzini"

Article Synopsis
  • Mode of inheritance (MOI) is crucial for understanding pathogenic variants, yet most variants lack this information, particularly impacting recessive diseases.
  • MOI-Pred and ConMOI are new tools developed to predict variant pathogenicity by incorporating MOI, with MOI-Pred focusing on both dominant and recessive variants through evolutionary and functional data.
  • Both tools have shown high accuracy in benchmarks and real-world evaluations, with ConMOI outperforming individual methods, underscoring the benefits of using a consensus approach for variant predictions.
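The synopsis does not say how ConMOI combines its inputs. As a rough illustration of what a consensus approach can look like, the sketch below takes a simple majority vote over labels from several predictors. The function name, the label strings, and the tie-handling rule are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def consensus_label(predictions):
    """Return the majority label among predictor outputs, or None on a tie.

    `predictions` is a list of labels (hypothetical strings such as
    "dominant" or "recessive"), one per individual predictor.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    ranked = Counter(predictions).most_common()
    top_label, top_count = ranked[0]
    # A tie for first place means there is no consensus.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None
    return top_label


if __name__ == "__main__":
    print(consensus_label(["recessive", "recessive", "dominant"]))
```

Returning `None` on a tie is one possible design choice; a real consensus method might instead weight predictors by their benchmark accuracy.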
Article Synopsis
  • Researchers created a machine learning tool called ML-GPS to help find genetic factors linked to chronic diseases, aiding in drug development.
  • This tool combines genetic data from the UK Biobank with advanced modeling techniques to predict disease phenotypes and their associations with various genetic variants.
  • ML-GPS significantly increases the number of potential drug targets and can identify both established and promising target-disease relationships, including those related to Parkinson's and cardiovascular diseases.
Article Synopsis
  • The study investigates the effectiveness of the LiverRisk score and SAFE score in predicting liver fibrosis compared to established indices (FIB-4 and APRI) across various populations, including those with metabolic dysfunction-associated steatotic liver disease (MASLD).
  • Data from over 32,000 participants across different cohorts were analyzed using statistical methods to evaluate the scores' performance in forecasting liver-related outcomes and identifying fibrosis.
  • Results indicate that while the LiverRisk and SAFE scores show promise, they do not consistently outperform existing methods, particularly in diverse populations, highlighting the need for further validation before widespread clinical adoption.
Article Synopsis
  • Coronary artery disease (CAD) involves a mix of risk factors and processes, and a new machine learning-based score can help track its progression and severity.
  • Researchers tested this score against rare gene variants in different biobanks and found significant associations in 17 genes, 14 of which had prior evidence linking them to CAD.
  • The study suggests that there are likely more ultrarare gene variants associated with CAD, highlighting how digital tools can improve genetic research in complex diseases.

Background: Diet is a key modifiable risk factor for coronary artery disease (CAD). However, the causal effects of specific dietary traits on CAD risk remain unclear. With the expansion of dietary data in population biobanks, Mendelian randomization (MR) could enable efficient estimation of causality in diet-disease associations.

Studies have shown that drug targets with human genetic support are more likely to succeed in clinical trials. Hence, a tool integrating genetic evidence to prioritize drug target genes is beneficial for drug discovery. We built a genetic priority score (GPS) by integrating eight genetic features with drug indications from the Open Targets and SIDER databases.

Background: Venous thromboembolism (VTE) is a major cause of morbidity and mortality worldwide. Current risk assessment tools, such as the Caprini and Padua scores and Wells criteria, have limitations in their applicability and accuracy. This study aimed to develop machine learning models using structured electronic health record data to predict diagnosis and 1-year risk of VTE.

Metabolic dysfunction-associated steatotic liver disease (MASLD) affects 30% of the global population but is often underdiagnosed. To fill this diagnostic gap, we developed a digital score reflecting presence and severity of MASLD. We fitted a machine learning model to electronic health records from 37,212 UK Biobank participants with proton density fat fraction measurements and/or a MASLD diagnosis to generate a "MASLD score".

Systemic autoimmune rheumatic diseases (SARDs) can lead to irreversible damage if left untreated, yet these patients often endure long diagnostic journeys before being diagnosed and treated. Machine learning may help overcome the challenges of diagnosing SARDs and inform clinical decision-making. Here, we developed and tested a machine learning model to identify patients who should receive rheumatological evaluation for SARDs using longitudinal electronic health records of 161,584 individuals from two institutions.

Background: Causality between plasma triglyceride (TG) levels and atherosclerotic cardiovascular disease (ASCVD) risk remains controversial despite more than four decades of study and two recent landmark trials, STRENGTH and REDUCE-IT. The association between TG levels and non-atherosclerotic diseases across organ systems is also unclear.

Methods: Here, we conducted a phenome-wide, two-sample Mendelian randomization (MR) analysis using inverse-variance weighted (IVW) regression to systematically infer the causal effects of plasma TG levels on 2600 disease traits in the European ancestry population of UK Biobank.
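For readers unfamiliar with the IVW estimator mentioned above, the following is a minimal sketch of the standard fixed-effect form, computed from per-variant summary statistics. It is not the authors' pipeline; the function name and the toy numbers are assumptions made for illustration only.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted (IVW) causal estimate.

    Each argument is a list with one entry per genetic variant:
    the variant's effect on the exposure, its effect on the outcome,
    and the standard error of the outcome effect.
    Returns (estimate, standard_error).
    """
    n = len(beta_exposure)
    if n == 0 or n != len(beta_outcome) or n != len(se_outcome):
        raise ValueError("inputs must be non-empty and of equal length")
    # Weight each variant by the inverse variance of its outcome effect.
    weights = [1.0 / (se ** 2) for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    return numerator / denominator, math.sqrt(1.0 / denominator)


if __name__ == "__main__":
    # Toy summary statistics (made up): outcome effects are exactly
    # twice the exposure effects, so the estimate should be close to 2.
    est, se = ivw_estimate([0.1, 0.2, 0.3], [0.2, 0.4, 0.6], [0.05, 0.05, 0.05])
    print(est, se)
```

This is equivalent to a weighted regression of outcome effects on exposure effects with the intercept fixed at zero; it assumes all variants are valid, independent instruments.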

Background: Binary diagnosis of coronary artery disease does not preserve the complexity of disease or quantify its severity or its associated risk of death; hence, a quantitative marker of coronary artery disease is warranted. We evaluated a quantitative marker of coronary artery disease derived from probabilities of a machine learning model.

Methods: In this cohort study, we developed and validated a coronary artery disease-predictive machine learning model using 95 935 electronic health records and assessed its probabilities as in-silico scores for coronary artery disease (ISCAD; range 0 [lowest probability] to 1 [highest probability]) in participants in two longitudinal biobank cohorts.

Background: Clinical features from electronic health records (EHRs) can be used to build a complementary tool to predict coronary artery disease (CAD) susceptibility.

Objectives: The purpose of this study was to determine whether an EHR score can improve CAD prediction and reclassification 1 year before diagnosis, beyond conventional clinical guidelines as determined by the pooled cohort equations (PCE) and a polygenic risk score for CAD.

Methods: We applied a machine learning framework using clinical features from the EHR in a multiethnic, clinical care cohort (BioMe) comprising 555 CAD cases and 6,349 control subjects and in a population-based cohort (UK Biobank) comprising 3,130 CAD cases and 378,344 control subjects for external validation.
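The abstract excerpt does not state which reclassification metric was used. One common choice is the two-category net reclassification improvement (NRI), sketched below under that assumption. The function name, the single risk threshold, and the example values are hypothetical.

```python
def net_reclassification_improvement(old_risk, new_risk, is_case, threshold):
    """Two-category NRI at a single risk threshold.

    old_risk / new_risk: predicted risks from the baseline and the new model.
    is_case: True for individuals who developed the disease.
    threshold: risk cut-off separating "low" from "high" (hypothetical).
    """
    up_case = down_case = n_case = 0
    up_ctrl = down_ctrl = n_ctrl = 0
    for old, new, case in zip(old_risk, new_risk, is_case):
        moved_up = new >= threshold and old < threshold
        moved_down = new < threshold and old >= threshold
        if case:
            n_case += 1
            up_case += moved_up
            down_case += moved_down
        else:
            n_ctrl += 1
            up_ctrl += moved_up
            down_ctrl += moved_down
    if n_case == 0 or n_ctrl == 0:
        raise ValueError("need at least one case and one control")
    # Cases should move up, controls should move down.
    return (up_case - down_case) / n_case + (down_ctrl - up_ctrl) / n_ctrl


if __name__ == "__main__":
    nri = net_reclassification_improvement(
        old_risk=[0.05, 0.05, 0.20, 0.20],
        new_risk=[0.20, 0.05, 0.05, 0.20],
        is_case=[True, True, False, False],
        threshold=0.10,
    )
    print(nri)
```

A positive value means the new model moves cases toward the high-risk category and controls toward the low-risk category more often than the reverse.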

Importance: Population-based assessment of disease risk associated with gene variants informs clinical decisions and risk stratification approaches.

Objective: To evaluate the population-based disease risk of clinical variants in known disease predisposition genes.

Design, Setting, and Participants: This cohort study included 72 434 individuals with 37 780 clinical variants who were enrolled in the BioMe Biobank from 2007 onwards with follow-up until December 2020 and the UK Biobank from 2006 to 2010 with follow-up until June 2020.

Background: Missing data are a common issue in fields such as electronics, image processing, medical records, and genomics. They can limit or even bias subsequent analysis. The data collection process can lead to different distributions, frequencies, and structures of missing data points.
