The meaningful use of electronic health records (EHR) continues to progress in the digital era with clinical decision support systems augmented by artificial intelligence. A priority in improving provider experience is to overcome information overload and reduce the cognitive burden so that fewer medical errors and cognitive biases are introduced during patient care. One major type of medical error is diagnostic error caused by systematic or predictable errors in judgment that rely on heuristics. The potential for clinical natural language processing (cNLP) to model human diagnostic reasoning, with forward reasoning from data to diagnosis, and thereby potentially reduce cognitive burden and medical error has not been investigated. Existing tasks to advance the science in cNLP have largely focused on information extraction and named entity recognition through classification tasks. We introduce a novel suite of tasks, the Diagnostic Reasoning Benchmarks (DR.BENCH), as a new benchmark for developing and evaluating cNLP models with clinical diagnostic reasoning ability. The suite includes six tasks from ten publicly available datasets addressing clinical text understanding, medical knowledge reasoning, and diagnosis generation. DR.BENCH is the first clinical suite of tasks designed as a natural language generation framework to evaluate pre-trained language models for diagnostic reasoning. The goal of DR.BENCH is to advance the science in cNLP to support downstream applications in computerized diagnostic decision support and improve the efficiency and accuracy of healthcare providers during patient care. We fine-tune and evaluate state-of-the-art generative models on DR.BENCH. Experiments show that, with domain adaptation pre-training on medical knowledge, the model demonstrated opportunities for improvement when evaluated on DR.BENCH. We share DR.BENCH as a publicly available GitLab repository with a systematic approach to load and evaluate models for the cNLP community. We also discuss the carbon footprint produced during the experiments and encourage future work on DR.BENCH to report the carbon footprint.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9993808 | PMC |
http://dx.doi.org/10.1016/j.jbi.2023.104286 | DOI Listing |
J Neurosurg Spine
January 2025
Cleveland Clinic Center for Spine Health, Cleveland Clinic, Cleveland.
Objective: Spinal fusion is a commonly performed surgical procedure used to relieve pain, deformity, and instability of various spinal pathologies. Although there have been attempts to standardize spinal fusion assessment radiologically, there is currently no unified definition that also considers clinical symptomology. This review attempts to create a more holistic and standardized definition of spinal fusion.
Eur J Neurosci
January 2025
Department of Psychology, National Chengchi University, Taipei, Taiwan.
Word problems are essential for math learning and education, bridging numerical knowledge with real-world applications. Despite their importance, the neural mechanisms underlying word problem solving, especially in children, remain poorly understood. Here, we examine children's cognitive and brain response profiles for arithmetic word problems (AWPs), which involve one-step mathematical operations, and compare them with nonarithmetic word problems (NWPs), structured as parallel narratives without numerical operations.
Introduction: Assessment of fitness for flight constitutes one of the core tasks of aeromedical professionals. The value of such evaluations depends on the decision to be based on complete medical information, valid risk methodology, and genuine flight safety indicators. To achieve these goals, the aeromedical practitioner should ensure an evidence-based approach.
Cells
January 2025
Department of Functional and Evolutionary Ecology, University of Vienna, Djerassiplatz 1, A-1030 Vienna, Austria.
Contaminations are challenging for monocultures, as they impact the culture conditions and thus influence the growth of the target organism and the overall biomass composition. In phycology, axenic cultures comprising a single living species are commonly sought for both basic research and industrial applications, because contaminants reduce significance for analytic purposes and interfere with the safety and quality of commercial products. We aimed to establish axenic cultures of , known as the food additive "Spirulina".
Bioengineering (Basel)
December 2024
College of Liberal Arts Faculty of Basic Liberal Art, Hansung University, Seoul 02876, Republic of Korea.
The large language model (LLM) has the potential to be applied to clinical practice. However, there has been scarce study on this in the field of gastroenterology. Aim: This study explores the potential clinical utility of two LLMs in the field of gastroenterology: a customized GPT model and a conventional GPT-4o, an advanced LLM capable of retrieval-augmented generation (RAG).