Publications by authors named "Lucila Ohno Machado"

Recent advancements in large language models (LLMs) like ChatGPT and LLaMA have shown significant potential in medical applications, but their effectiveness is limited by a lack of specialized medical knowledge due to general-domain training. In this study, we developed Me-LLaMA, a new family of open-source medical LLMs that uniquely integrate extensive domain-specific knowledge with robust instruction-following capabilities. Me-LLaMA comprises foundation models (Me-LLaMA 13B and 70B) and their chat-enhanced versions, developed through comprehensive continual pretraining and instruction tuning of LLaMA2 models using both biomedical literature and clinical notes.


Objectives: Traditional methods for medical device post-market surveillance often fail to accurately account for operator learning effects, leading to biased assessments of device safety. These methods struggle with non-linearity, complex learning curves, and time-varying covariates, such as physician experience. To address these limitations, we sought to develop a machine learning (ML) framework to detect and adjust for operator learning effects.


Objectives: To introduce quantum computing technologies as a tool for biomedical research and highlight future applications within healthcare, focusing on its capabilities, benefits, and limitations.

Target Audience: Investigators seeking to explore quantum computing and create quantum-based applications for healthcare and biomedical research.

Scope: Quantum computing requires specialized hardware, known as quantum processing units, that use quantum bits (qubits) instead of classical bits to perform computations.


Recent advancements in large language models (LLMs) such as ChatGPT and LLaMA have hinted at their potential to revolutionize medical applications, yet their application in clinical settings often reveals limitations due to a lack of specialized training on medical-specific data. In response to this challenge, this study introduces Me-LLaMA, a novel medical LLM family that includes foundation models (Me-LLaMA 13B and 70B) and their chat-enhanced versions (Me-LLaMA 13B-chat and 70B-chat), developed through continual pre-training and instruction tuning of LLaMA2 using large medical datasets. Our methodology leverages a comprehensive domain-specific data suite, including a large-scale continual pre-training dataset with 129B tokens, an instruction tuning dataset with 214k samples, and a new medical evaluation benchmark (MIBE) spanning six critical medical tasks and 12 datasets.


Objective: Blockchain has emerged as a potential data-sharing structure in healthcare because of its decentralization, immutability, and traceability. However, its use in the biomedical domain is yet to be investigated comprehensively, especially from the aspects of implementation and evaluation, by existing blockchain literature reviews. To address this, our review assesses blockchain applications implemented in practice and evaluated with quantitative metrics.


Importance: Health care algorithms are used for diagnosis, treatment, prognosis, risk stratification, and allocation of resources. Bias in the development and use of algorithms can lead to worse outcomes for racial and ethnic minoritized groups and other historically marginalized populations such as individuals with lower income.

Objective: To provide a conceptual framework and guiding principles for mitigating and preventing bias in health care algorithms to promote health and health care equity.

Article Synopsis
  • Researchers created a smart computer program that can help doctors find a serious heart condition called aortic stenosis (AS) just by looking at ultrasound videos of the heart.
  • They trained this program using a lot of heart videos and tested it with different groups of patients to make sure it works well.
  • The program correctly identified severe AS almost 98% of the time and can be used easily in clinics to check patients without needing extra complicated tools.
Article Synopsis
  • The All of Us Research Program aims to understand opioid use by analyzing data from a diverse group of over 214,000 participants, linking survey responses and electronic health records.
  • The study found that lifetime opioid use prevalence was 4% for street opioids and 9% for nonmedical prescription opioids, with men more likely to have used opioids, while foreign-born individuals showed reduced risks.
  • The findings will help track national trends in opioid prescription, usage, and opioid use disorders (OUD) over time, particularly among underrepresented groups, contributing valuable insights to the ongoing opioid crisis.

The All of Us (AoU) Research Program is making available one of the largest and most diverse collections of health data in the US to researchers. Using the All of Us database, we evaluated family and personal histories of five common types of cancer in 89,453 individuals, comparing these data to 24,305 participants from the 2015 National Health Interview Survey (NHIS). Comparing datasets, we found similar family cancer history (33%) rates, but higher personal cancer history in the AoU dataset (9.


Objectives: Despite growing enthusiasm surrounding the utility of clinical informatics to improve cancer outcomes, data availability remains a persistent bottleneck to progress. Difficulty combining data with protected health information often limits our ability to aggregate larger, more representative datasets for analysis. With the rise of machine learning techniques that require increasing amounts of clinical data, these barriers have grown.


Aims: Various cardiovascular risk prediction models have been developed for patients with type 2 diabetes mellitus. Yet few models have been validated externally. We perform a comprehensive validation of existing risk models on a heterogeneous population of patients with type 2 diabetes using secondary analysis of electronic health record data.


Objective: This qualitative study aimed to understand patient and researcher perspectives regarding consent and data-sharing preferences for research and a patient-centered system to manage consent and data-sharing preferences.

Materials and Methods: We conducted focus groups with patient and researcher participants recruited from three academic health centers via snowball sampling. Discussions focused on perspectives on the use of electronic health record (EHR) data for research.


All of Us is a biorepository aiming to advance biomedical research by providing various types of data in diverse human populations. Here we present a demonstration project validating the program's genomic data in 98,622 participants. We sought to replicate known genetic associations for three diseases (atrial fibrillation [AF], coronary artery disease, type 2 diabetes [T2D]) and two quantitative traits (height and low-density lipoprotein [LDL]) by conducting common and rare variant analyses.


Background: Validating new algorithms, such as methods to disentangle intrinsic treatment risk from risk associated with experiential learning of novel treatments, often requires knowing the ground truth for data characteristics under investigation. Since the ground truth is inaccessible in real world data, simulation studies using synthetic datasets that mimic complex clinical environments are essential. We describe and evaluate a generalizable framework for injecting hierarchical learning effects within a robust data generation process that incorporates the magnitude of intrinsic risk and accounts for known critical elements in clinical data relationships.


Objective: We aimed to develop a distributed, immutable, and highly available cross-cloud blockchain system to facilitate federated data analysis activities among multiple institutions.

Materials And Methods: We preprocessed 9166 COVID-19 Structured Query Language (SQL) code, summary statistics, and user activity logs, from the GitHub repository of the Reliable Response Data Discovery for COVID-19 (R2D2) Consortium. The repository collected local summary statistics from participating institutions and aggregated the global result to a COVID-19-related clinical query, previously posted by clinicians on a website.


Linking data across studies offers an opportunity to enrich data sets and provide a stronger basis for data-driven models for biomedical discovery and/or prognostication. Several techniques to link records have been proposed, and some have been implemented across data repositories holding molecular and clinical data. Not all these techniques guarantee appropriate privacy protection; there are trade-offs between (a) simple strategies that can be associated with data that will be linked and shared with any party and (b) more complex strategies that preserve the privacy of individuals across parties.


Background: Underrepresented persons are often not included in biomedical research. It is unknown if the general Asian American population is being represented in All of Us. The purpose of this study was to compare the Asian demographic data in the All of Us cohort with the Asian nationally representative data from the American Community Survey.


The World Health Organization recently defined hypertension and type 2 diabetes (T2D) as modifiable comorbidities leading to dementia and Alzheimer's disease. In the United States (US), hypertension and T2D are health disparities, with higher prevalence seen for Black and Hispanic minority groups compared to the majority White population. We hypothesized that elevated prevalence of hypertension and T2D risk factors in Black and Hispanic groups may be associated with dementia disparities.


Background & Aims: We compared the safety and effectiveness of tumor necrosis factor α (TNF-α) antagonists vs vedolizumab vs ustekinumab in patients with Crohn's disease (CD) in a multicenter cohort (CA-IBD).

Methods: We created an electronic health record-based cohort of adult patients with CD who were initiating a new biologic agent (TNF-α antagonists, ustekinumab, vedolizumab) from 5 health systems in California between 2010 and 2017. We compared the risk of serious infections (safety) and all-cause hospitalization and inflammatory bowel disease-related surgery (effectiveness) between different biologic classes using propensity score (PS) matching.


Accurate, highly specific immunoassays for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are needed to evaluate seroprevalence. This study investigated the concordance of results across four immunoassays targeting different antigens for sera collected at the beginning of the SARS-CoV-2 pandemic in the United States. Specimens from All of Us participants contributed between January and March 2020 were tested using the Abbott Architect SARS-CoV-2 IgG (immunoglobulin G) assay (Abbott) and the EuroImmun SARS-CoV-2 enzyme-linked immunosorbent assay (ELISA) (EI).


Concerns regarding inappropriate leakage of sensitive personal information as well as unauthorized data use are increasing with the growth of genomic data repositories. Therefore, privacy and security of genomic data have become increasingly important and need to be studied. With many proposed protection techniques, their applicability in support of biomedical research should be well understood.


The Cox proportional hazards model is a popular semi-parametric model for survival analysis. In this paper, we aim to develop a federated algorithm for the Cox proportional hazards model over vertically partitioned data (i.e.


Introduction: The NIH All of Us Research Program will have the scale and scope to enable research for a wide range of diseases, including cancer. The program's focus on diversity and inclusion promises a better understanding of the unequal burden of cancer. Preliminary cancer ascertainment in the All of Us cohort from two data sources (self-reported versus electronic health records (EHR)) is considered.


The All of Us Research Program seeks to engage at least one million diverse participants to advance precision medicine and improve human health. We describe here the cloud-based Researcher Workbench that uses a data passport model to democratize access to analytical tools and participant information including survey, physical measurement, and electronic health record (EHR) data. We also present validation study findings for several common complex diseases to demonstrate use of this novel platform in 315,000 participants, 78% of whom are from groups historically underrepresented in biomedical research, including 49% self-reporting non-White races.
