Natural language processing (NLP) holds the promise of effectively analyzing patient record data to reduce the cognitive load on physicians and clinicians in patient care, clinical research, and hospital operations management. A critical need in developing such methods is a "ground truth" dataset for training and testing the algorithms. Beyond localizable, relatively simple tasks, ground truth creation is a significant challenge because medical experts, like physicians in patient care, have to assimilate vast amounts of data in electronic health record (EHR) systems. To mitigate the inaccuracies that these cognitive challenges can introduce, we present an iterative vetting approach for creating the ground truth for complex NLP tasks. In this paper, we present the methodology and report on its use for an automated problem list generation task, its effect on ground truth quality and system accuracy, and lessons learned from the effort.
Source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5543376 (PMC)
Sci Data
January 2025
Faculty of Computing, Engineering and Built Environment, Birmingham City University, Birmingham, B4 7XG, UK.
Automatic Compliance Checking (ACC) within the Architecture, Engineering, and Construction (AEC) sector necessitates automating the interpretation of building regulations to achieve its full potential. Converting textual rules into machine-readable formats is challenging due to the complexities of natural language and the scarcity of resources for advanced Machine Learning (ML). Addressing these challenges, we introduce CODE-ACCORD, a dataset of 862 sentences from the building regulations of England and Finland.
Int J Oral Maxillofac Surg
January 2025
Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, Beijing, China; National Center for Stomatology, Beijing, China; National Clinical Research Center for Oral Diseases, Beijing, China; National Engineering Research Center of Oral Biomaterials and Digital Medical Devices, Beijing, China.
With developments in computer science and technology, great progress has been made in three-dimensional (3D) ultrasound. Recently, ultrasound-based 3D bone modelling has attracted much attention, and its accuracy has been studied for the femur, tibia, and spine. The use of ultrasound allows data for bone surface to be acquired non-invasively and without radiation.
J Arthroplasty
January 2025
Orthopedic Surgery Artificial Intelligence Laboratory, Department of Orthopedic Surgery, Mayo Clinic, Rochester, MN, USA; Mayo Clinic Department of Orthopedic Surgery, Mayo Clinic, Rochester, MN, USA.
Background: Minimum joint space width (mJSW) is an important continuous quantitative metric of osteoarthritis progression in the knee. The purpose of this study was to develop an automated measurement algorithm for mJSW in the medial and lateral compartments of the knee that can flexibly handle native knees as well as knees after arthroplasty.
Methods: We developed an end-to-end algorithm consisting of a deep learning segmentation model plus a computer vision algorithm to measure mJSW in the medial and lateral compartments of the knee.
J Hazard Mater
January 2025
Faculty of Data Science, Musashino University, 3-3-3 Ariake Koto-ku, Tokyo 135-8181, Japan.
This paper outlines key machine learning principles, focusing on the use of XGBoost and SHAP values to assist researchers in avoiding analytical pitfalls. XGBoost builds models by incrementally adding decision trees, each addressing the errors of the previous one, which can result in inflated feature importance scores due to the method's emphasis on misclassified examples. While SHAP values provide a theoretically robust way to interpret predictions, their dependence on model structure and feature interactions can introduce biases.
Brief Bioinform
November 2024
Department of Biology, University of Padova, Via U. Bassi 58/B, 35131, Italy.
Shallow whole-genome sequencing (sWGS) offers a cost-effective approach to detecting copy number alterations (CNAs). However, a standardized workflow designed specifically for sWGS analysis is still lacking. To address this need, we present SAMURAI, a bioinformatics pipeline for analyzing CNAs from sWGS data in a standardized and reproducible manner.