A major problem in healthcare is the shortage of available experts. Machine learning (ML) models may help address this by aiding in screening and diagnosing patients. However, creating large, representative datasets to train such models is expensive. We evaluated large language models (LLMs) for data creation. Using Autism Spectrum Disorders (ASD) as the target condition, we prompted GPT-3.5 and GPT-4 to generate 4,200 synthetic examples of behaviors to augment existing medical observations. Our goal was to label behaviors corresponding to autism criteria and to improve model accuracy with synthetic training data. We used a BERT classifier pretrained on biomedical literature to assess differences in performance between models. A clinician also evaluated a random sample (N=140) of the LLM-generated data and found that 83% of the behavioral example-label pairs were correct. Augmenting the dataset increased recall by 13% but decreased precision by 16%. Future work will investigate how different synthetic data characteristics affect ML outcomes.
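The recall and precision trade-off reported in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the labels are hypothetical toy values rather than data from the study, the task is simplified to a single binary label, and the function name `precision_recall` is not taken from the paper.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy labels (NOT from the study): 1 = behavior matches a criterion.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]

# A conservative classifier flags few positives: no false alarms, but misses cases.
baseline_pred = [1, 1, 0, 0, 0, 0, 0, 0]

# A classifier that flags more positives finds every true case,
# at the cost of some false alarms.
augmented_pred = [1, 1, 1, 1, 1, 1, 0, 0]

baseline = precision_recall(y_true, baseline_pred)
augmented = precision_recall(y_true, augmented_pred)

print("baseline  precision=%.2f recall=%.2f" % baseline)
print("augmented precision=%.2f recall=%.2f" % augmented)
```

In this toy case the second classifier gains recall and loses precision, which is the same direction of change the abstract reports after augmentation. The magnitudes here are arbitrary and say nothing about the study's actual numbers.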


Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141799


