Publications by authors named "Isaac Kohane"

Continuous glucose monitors (CGM) provide valuable insights about glycemic control that aid in diabetes management. However, interpreting metrics and charts and synthesizing them into linguistic summaries is often non-trivial for patients and providers. The advent of large language models (LLMs) has enabled real-time text generation and summarization of medical data.


Background: Historically, the readability of consent forms in medicine has been above the average reading level of patients. This can create challenges in obtaining truly informed consent, but the implications for clinical trial participant retention are not fully explored. To address this gap, we seek to analyze clinical trial consent forms by determining their readability and its relationship with the associated trial's participant dropout rate.


Background: Patient notes contain substantial information but are difficult for computers to analyse due to their unstructured format. Large language models (LLMs), such as Generative Pre-trained Transformer 4 (GPT-4), have changed our ability to process text, but we do not know how effectively they handle medical notes. We aimed to assess the ability of GPT-4 to answer predefined questions after reading medical notes in three different languages.


Over the past decade, wearable technology has dramatically changed how patients manage chronic diseases. The widespread availability of on-body sensors, such as heart rate monitors and continuous glucose monitoring (CGM) sensors, has allowed patients to have real-time data about their health. Most of these data are readily available on patients' smartphone applications, where patients can view their current and retrospective data.

Article Synopsis
  • The integration of electronic health record (EHR) data with other resources is crucial for rare disease research, relying on the compatibility of ontologies like ICD (for clinical diagnoses) and HPO (for phenotypes).
  • An analysis showed that only 2.2% of ICD codes have direct mappings to HPO in UMLS, and fewer than 50% of ICD codes in a real-world EHR dataset are mapped to HPO, indicating gaps, especially for rarer conditions.
  • The study concludes that interoperability between ICD and HPO is limited, with a need for more established mapping conventions beyond UMLS to enhance data integration.

Author affiliations are essential in bibliometric studies, requiring relevant information extraction from free-text affiliations. Precisely determining an author's location from their affiliation is crucial for understanding research networks, collaborations, and geographic distribution. Existing geoparsing tools using regular expressions have limitations due to unstructured and ambiguous affiliations, resulting in erroneous location identification, especially for unconventional variations or misspellings.


A small number of cancer patients respond exceptionally well to therapies and survive significantly longer than patients with similar diagnoses. Profiling the germline genetic backgrounds of exceptional responder (ER) patients, who have extreme survival times, can yield insights into the germline polymorphisms that influence response to therapy. As ERs showed a high incidence of autoimmune diseases, we hypothesized that differences in autoimmune disease risk could reflect the immune background of ERs and contribute to better cancer treatment responses.


Continuous glucose monitors (CGM) provide patients and clinicians with valuable insights about glycemic control that aid in diabetes management. The advent of large language models (LLMs), such as GPT-4, has enabled real-time text generation and summarization of medical data. Further, recent advancements have enabled the integration of data analysis features in chatbots, such that raw data can be uploaded and analyzed when prompted.

Article Synopsis
  • Causal ML enables the estimation of personalized treatment effects, allowing healthcare providers to make tailored clinical decisions based on individual patient characteristics.
  • The article discusses the advantages of causal ML over traditional methods, outlines essential components for its application, and offers guidelines for its reliable use in clinical settings.

The COVID-19 pandemic generated tremendous interest in using real world data (RWD). Many consortia across the public and private sectors formed in 2020 with the goal of rapidly producing high-quality evidence from RWD to guide medical decision-making, public health priorities, and more. Experiences were gathered from five large consortia on rapid multi-institutional evidence generation during the COVID-19 pandemic.


Characterization of Parkinson's disease (PD) progression using real-world evidence could guide clinical trial design and identify subpopulations. Efforts to curate research populations, the increasing availability of real-world data, and advances in natural language processing, particularly large language models, allow for a more granular comparison of populations than previously possible. This study includes two research populations and two real-world data-derived (RWD) populations.

Article Synopsis
  • Recent advancements in genomics for diagnosing rare diseases focus on "N-of-1" analyses, allowing for tailored studies on individual patients with ultra-rare conditions.
  • The Undiagnosed Diseases Network (UDN) enables collaborative research across various U.S. clinical and research centers, which enhances the ability to analyze whole genome sequencing data from multiple patients simultaneously.
  • The team introduces a new software package, RaMeDiES, which provides tools for automated comparisons of genomic data, leading to novel disease associations and improving overall understanding of genetic links to these rare diseases.

Characterization of Parkinson's disease (PD) progression using real-world evidence could guide clinical trial design and identify subpopulations. Efforts to curate research populations, the increasing availability of real-world data, and recent advances in natural language processing, particularly large language models, allow for a more granular comparison of populations, and of the data collection methods describing them, than previously possible. This study includes two research populations and two real-world data-derived (RWD) populations.


Background: The evaluation and management of first-time seizure-like events in children can be difficult because these episodes are not always directly observed and might be epileptic seizures or other conditions (seizure mimics). We aimed to evaluate whether machine learning models using real-world data could predict seizure recurrence after an initial seizure-like event.

Methods: This retrospective cohort study compared models trained and evaluated on two separate datasets between Jan 1, 2010, and Jan 1, 2020: electronic medical records (EMRs) at Boston Children's Hospital and de-identified, patient-level, administrative claims data from the IBM MarketScan research database.

Article Synopsis
  • Rare Mendelian disorders affect 300-400 million people globally and present significant diagnostic challenges due to their complexity and a lack of data.
  • Existing automated tools struggle to identify causal genes for these disorders as there are limited datasets with unpublished cases for evaluation.
  • The authors developed a new computational pipeline that simulates clinical datasets to create realistic patient profiles, allowing researchers to test and improve gene prioritization methods for diagnosing novel genetic conditions.

Diabetes is associated with heterogeneous behaviors affecting patients' clinical characteristics and trajectories. This study includes 21,288 patients with type 2 diabetes (women, ages 30 to 65). The cohort was filtered through a set of preprocessing heuristics to ensure that its members exhibited a similar clinical trajectory.


Purpose: In young adults (18 to 49 years old), investigation of acute respiratory distress syndrome (ARDS) after severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection has been limited. We evaluated the risk factors and outcomes of ARDS following infection with SARS-CoV-2 in a young adult population.

Methods: A retrospective cohort study was conducted between January 1, 2020, and February 28, 2021, using patient-level electronic health records (EHR) across 241 United States hospitals and 43 European hospitals participating in the Consortium for Clinical Characterization of COVID-19 by EHR (4CE).
