A fundamental goal of evaluating the performance of a clinical model is to ensure it performs well across a diverse intended patient population. A primary challenge is that the data used in model development and testing often consist of many overlapping, heterogeneous patient subgroups that may not be explicitly defined or labeled. While a model's average performance on a dataset may be high, the model can have significantly lower performance for certain subgroups, which may be hard to detect.
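As a rough illustration of how a high average can hide a weak subgroup, the sketch below computes accuracy per subgroup. It assumes subgroup labels are available, which the abstract notes is often not the case. The function name `accuracy_by_subgroup` and the toy data are hypothetical and not taken from the article.

```python
from collections import defaultdict


def accuracy_by_subgroup(y_true, y_pred, groups):
    """Return accuracy per subgroup label (assumes labels are known)."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        counts[group] += 1
        hits[group] += int(truth == pred)
    return {group: hits[group] / counts[group] for group in counts}


# Toy data (hypothetical): 8 cases in subgroup "A", 2 cases in subgroup "B".
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
groups = ["A"] * 8 + ["B"] * 2

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
per_group = accuracy_by_subgroup(y_true, y_pred, groups)
print(overall, per_group)
```

In this toy case the overall accuracy is 0.9 while subgroup "B" reaches only 0.5, a gap that the average alone would not reveal.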
IEEE J Biomed Health Inform
November 2024
The future of artificial intelligence (AI) safety is expected to include bias mitigation methods from development to application. The complexity and integration of these methods could grow in conjunction with advances in AI and human-AI interactions. Numerous methods are being proposed to mitigate bias, but without a structured way to compare their strengths and weaknesses.
In the past decade, artificial intelligence (AI) algorithms have made promising impacts in many areas of healthcare. One application is AI-enabled prioritization software known as computer-aided triage and notification (CADt). This type of software as a medical device is intended to prioritize reviews of radiological images with time-sensitive findings, thus shortening the waiting time for patients with these findings.
Purpose: Synthetic datasets hold the potential to offer cost-effective alternatives to clinical data, ensuring privacy protections and potentially addressing biases in clinical data. We present a method leveraging such datasets to train a machine learning algorithm applied as part of a computer-aided detection (CADe) system.
Approach: Our proposed approach utilizes clinically acquired computed tomography (CT) scans of a physical anthropomorphic phantom into which manufactured lesions were inserted to train a machine learning algorithm.
Background: Positive and negative likelihood ratios (PLR and NLR) are important metrics of accuracy for diagnostic devices with a binary output. However, the properties of Bayesian and frequentist interval estimators of PLR/NLR have not been extensively studied and compared. In this study, we explore the potential use of the Bayesian method for interval estimation of PLR/NLR, and, more broadly, for interval estimation of the ratio of two independent proportions.
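For readers unfamiliar with the quantities involved, PLR is sensitivity divided by (1 - specificity), and NLR is (1 - sensitivity) divided by specificity, so each is a ratio of two independent proportions. The following is a minimal sketch of one possible Bayesian interval, not the study's method: it assumes independent Jeffreys Beta(0.5, 0.5) priors, Monte Carlo sampling, and an equal-tailed interval. The function name `bayes_ratio_interval` and the counts are hypothetical.

```python
import random


def bayes_ratio_interval(x1, n1, x2, n2, level=0.95,
                         prior=(0.5, 0.5), draws=20000, seed=0):
    """Equal-tailed credible interval for p1 / p2.

    Assumes independent Beta posteriors for the two proportions,
    with x1 successes out of n1 and x2 successes out of n2.
    """
    rng = random.Random(seed)
    a, b = prior
    # Sample each proportion from its Beta posterior and form the ratio.
    ratios = sorted(
        rng.betavariate(x1 + a, n1 - x1 + b)
        / rng.betavariate(x2 + a, n2 - x2 + b)
        for _ in range(draws)
    )
    tail = (1.0 - level) / 2.0
    lower = ratios[int(tail * draws)]
    upper = ratios[int((1.0 - tail) * draws) - 1]
    return lower, upper


# Hypothetical counts: 90 true positives among 100 diseased subjects,
# 20 false positives among 200 non-diseased subjects.
# The point estimate of PLR is (90 / 100) / (20 / 200) = 9.
plr_lower, plr_upper = bayes_ratio_interval(90, 100, 20, 200)
print(plr_lower, plr_upper)
```

The same function could be used for NLR by passing false negatives over diseased subjects as the first proportion and true negatives over non-diseased subjects as the second.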
Background and Aims: The American Society for Gastrointestinal Endoscopy (ASGE) AI Task Force, along with experts in endoscopy, technology, regulatory authorities, and other medical subspecialties, initiated a consensus process that analyzed the current literature, highlighted potential areas, and outlined the necessary research in artificial intelligence (AI) to allow a clearer understanding of AI as it currently pertains to endoscopy.
Methods: A modified Delphi process was used to develop these consensus statements.
Results: Statement 1: Current advances in AI allow for the development of AI-based algorithms that can be applied to endoscopy to augment endoscopist performance in detection and characterization of endoscopic lesions.
Radiomics, the science of extracting quantifiable data from routine medical images, is a powerful tool that has many potential applications in oncology. The Response Evaluation Criteria in Solid Tumors Working Group (RWG) held a workshop in May 2022, which brought together various stakeholders to discuss the potential role of radiomics in oncology drug development and clinical trials, particularly with respect to response assessment. This article summarizes the results of that workshop, reviewing radiomics for the practicing oncologist and highlighting the work that needs to be done to move forward the incorporation of radiomics into clinical trials.
The adoption of artificial intelligence (AI) tools in medicine poses challenges to existing clinical workflows. This commentary discusses the necessity of context-specific quality assurance (QA), emphasizing the need for robust QA measures with quality control (QC) procedures that encompass (1) acceptance testing (AT) before clinical use, (2) continuous QC monitoring, and (3) adequate user training. The discussion also covers essential components of AT and QA, illustrated with real-world examples.
Purpose: Endometrial cancer (EC) is the most common gynecologic malignancy in the United States, and atypical endometrial hyperplasia (AEH) is considered a high-risk precursor to EC. Hormone therapies and hysterectomy are practical treatment options for AEH and early-stage EC. Some patients prefer hormone therapies for reasons such as fertility preservation or being poor surgical candidates.
Purpose: Understanding an artificial intelligence (AI) model's ability to generalize to its target population is critical to ensuring the safe and effective usage of AI in medical devices. A traditional generalizability assessment relies on the availability of large, diverse datasets, which are difficult to obtain in many medical imaging applications. We present an approach for enhanced generalizability assessment by examining the decision space beyond the available testing data distribution.
Background: Bone health and fracture risk are known to be correlated with stiffness. Both micro-finite element analysis (μFEA) and mechanical testing of additive manufactured phantoms are useful approaches for estimating mechanical properties of trabecular bone-like structures. However, it is unclear if measurements from the two approaches are consistent.
J Med Imaging (Bellingham)
September 2023
Purpose: To introduce developers to medical device regulatory processes and data considerations in artificial intelligence and machine learning (AI/ML) device submissions and to discuss ongoing AI/ML-related regulatory challenges and activities.
Approach: AI/ML technologies are being used in an increasing number of medical imaging devices, and the fast evolution of these technologies presents novel regulatory challenges. We provide AI/ML developers with an introduction to U.
Endometrial cancer (EC) is the most common gynecologic malignancy in the US and complex atypical hyperplasia (CAH) is considered a high-risk precursor to EC. Treatment options for CAH and early-stage EC include hormone therapies and hysterectomy with the former preferred by certain patients, e.g.
Data drift refers to differences between the data used in training a machine learning (ML) model and that applied to the model in real-world operation. Medical ML systems can be exposed to various forms of data drift, including differences between the data sampled for training and used in clinical operation, differences between medical practices or context of use between training and clinical use, and time-related changes in patient populations, disease patterns, and data acquisition, to name a few. In this article, we first review the terminology used in ML literature related to data drift, define distinct types of drift, and discuss in detail potential causes within the context of medical applications with an emphasis on medical imaging.
Importance: An accurate and robust artificial intelligence (AI) algorithm for detecting cancer in digital breast tomosynthesis (DBT) could significantly improve detection accuracy and reduce health care costs worldwide.
Objectives: To make training and evaluation data for the development of AI algorithms for DBT analysis available, to develop well-defined benchmarks, and to create publicly available code for existing methods.
Design, Setting, and Participants: This diagnostic study is based on a multi-institutional international grand challenge in which research teams developed algorithms to detect lesions in DBT.
Purpose: Machine learning algorithms are best trained with large quantities of accurately annotated samples. While natural scene images can often be labeled relatively cheaply and at large scale, obtaining accurate annotations for medical images is both time consuming and expensive. In this study, we propose a cooperative labeling method that allows us to make use of weakly annotated medical imaging data for the training of a machine learning algorithm.
Rapid advances in artificial intelligence (AI) and machine learning, and specifically in deep learning (DL) techniques, have enabled broad application of these methods in health care. The promise of the DL approach has spurred further interest in computer-aided diagnosis (CAD) development and applications using both "traditional" machine learning methods and newer DL-based methods. We use the term CAD-AI to refer to this expanded clinical decision support environment that uses traditional and DL-based AI methods.
Background: The Basic Local Alignment Search Tool (BLAST) is a suite of commonly used algorithms for identifying matches between biological sequences. The user supplies a database file and a query file of sequences, and BLAST finds identical sequences between the two. Database and query files typically contain millions of sequences, which makes BLAST computationally challenging but also well suited for parallelization on high-performance computing clusters.
The breast pathology quantitative biomarkers (BreastPathQ) challenge was a grand challenge organized jointly by the International Society for Optics and Photonics (SPIE), the American Association of Physicists in Medicine (AAPM), the U.S. National Cancer Institute (NCI), and the U.
Purpose: Most state-of-the-art automated medical image analysis methods for volumetric data rely on adaptations of two-dimensional (2D) and three-dimensional (3D) convolutional neural networks (CNNs). In this paper, we develop a novel unified CNN-based model that combines the benefits of 2D and 3D networks for analyzing volumetric medical images.
Methods: In our proposed framework, multiscale contextual information is first extracted from 2D slices inside a volume of interest (VOI).