Publications by Berkman Sahiner | LitMetric

Publications by authors named "Berkman Sahiner"

A data-driven framework for identifying patient subgroups on which an AI/machine learning model may underperform.

Adarsh Subbaswamy, Berkman Sahiner, Nicholas Petrick, Vinay Pai, Roy Adams

NPJ Digit Med

November 2024

A fundamental goal of evaluating the performance of a clinical model is to ensure it performs well across a diverse intended patient population. A primary challenge is that the data used in model development and testing often consist of many overlapping, heterogeneous patient subgroups that may not be explicitly defined or labeled. While a model's average performance on a dataset may be high, the model can have significantly lower performance for certain subgroups, which may be hard to detect.

Bias amplification to facilitate the systematic evaluation of bias mitigation methods.

Alexis Burgon, Yuhang Zhang, Nicholas Petrick, Berkman Sahiner, Kenny H Cha

IEEE J Biomed Health Inform

November 2024

The future of artificial intelligence (AI) safety is expected to include bias mitigation methods from development to application. The complexity and integration of these methods could grow in conjunction with advances in AI and human-AI interactions. Numerous methods are being proposed to mitigate bias, but there is no structured way to compare their strengths and weaknesses.

Applying queueing theory to evaluate wait-time-savings of triage algorithms.

Yee Lam Elim Thompson, Gary M Levine, Weijie Chen, Berkman Sahiner, Qin Li

Queueing Syst

September 2024

In the past decade, artificial intelligence (AI) algorithms have made promising impacts in many areas of healthcare. One application is AI-enabled prioritization software known as computer-aided triage and notification (CADt). This type of software as a medical device is intended to prioritize reviews of radiological images with time-sensitive findings, thus shortening the waiting time for patients with these findings.

Out-of-Distribution Detection and Radiological Data Monitoring Using Statistical Process Control.

Ghada Zamzmi, Kesavan Venkatesh, Brandon Nelson, Smriti Prathapan, Paul Yi, Berkman Sahiner

J Imaging Inform Med

September 2024

Machine learning (ML) models often fail with data that deviate from their training distribution. This is a significant concern for ML-enabled devices, as data drift may lead to unexpected performance. This work introduces a new framework for out-of-distribution (OOD) detection and data drift monitoring that combines ML and geometric methods with statistical process control (SPC).
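
As a rough illustration of the SPC idea mentioned in this abstract, the sketch below applies a basic Shewhart-style control chart to a single scalar feature. This is an assumption made for illustration, not the framework from the paper: the function names, the 3-sigma rule, and the sample values are all hypothetical.

```python
import statistics

def control_limits(baseline, k=3.0):
    """Return (lower, upper) control limits from in-control baseline values.

    Assumption: a plain Shewhart chart with limits at mean +/- k * sigma.
    """
    center = statistics.fmean(baseline)
    sigma = statistics.stdev(baseline)  # sample standard deviation
    return center - k * sigma, center + k * sigma

def out_of_control(values, limits):
    """Return the indices of values that fall outside the control limits."""
    lower, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical feature values (for example, a per-image summary statistic).
baseline = [10, 11, 9, 10, 10, 11, 9, 10]
limits = control_limits(baseline)
flagged = out_of_control([10, 12, 15, 7], limits)
print(limits, flagged)
```

In practice the monitored quantity would come from the deployed model's inputs or outputs, and a chart that accumulates small shifts over time may suit slow drift better than this single-point rule.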

Exploring synthetic datasets for computer-aided detection: a case study using phantom scan data for enhanced lung nodule false positive reduction.

Mohammad Mehdi Farhangi, Michael Maynord, Cornelia Fermüller, Yiannis Aloimonos, Berkman Sahiner

J Med Imaging (Bellingham)

July 2024

Purpose: Synthetic datasets hold the potential to offer cost-effective alternatives to clinical data, ensuring privacy protections and potentially addressing biases in clinical data. We present a method leveraging such datasets to train a machine learning algorithm applied as part of a computer-aided detection (CADe) system.

Approach: Our proposed approach utilizes clinically acquired computed tomography (CT) scans of a physical anthropomorphic phantom into which manufactured lesions were inserted to train a machine learning algorithm.

A comparison of Bayesian and score methods for interval estimates of positive/negative likelihood ratios in support of diagnostic device performance evaluation.

Tingting Hu, Berkman Sahiner, Nicholas Petrick, Kenny Cha, Si Wen

J Biopharm Stat

June 2024

Background: Positive and negative likelihood ratios (PLR and NLR) are important metrics of accuracy for diagnostic devices with a binary output. However, the properties of Bayesian and frequentist interval estimators of PLR/NLR have not been extensively studied and compared. In this study, we explore the potential use of the Bayesian method for interval estimation of PLR/NLR, and, more broadly, for interval estimation of the ratio of two independent proportions.
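
For readers unfamiliar with these metrics, the sketch below computes PLR = sensitivity / (1 - specificity) and NLR = (1 - sensitivity) / specificity from a 2x2 table, then approximates a Bayesian credible interval for PLR as a ratio of two independent proportions. The Jeffreys Beta(0.5, 0.5) priors, the Monte Carlo approach, the function names, and the counts are assumptions chosen for illustration; they are not necessarily the estimators studied in the paper.

```python
import random

def likelihood_ratios(tp, fn, fp, tn):
    """Point estimates of the positive and negative likelihood ratios."""
    sens = tp / (tp + fn)
    spec = tn / (fp + tn)
    return sens / (1 - spec), (1 - sens) / spec

def plr_credible_interval(tp, fn, fp, tn, level=0.95, draws=20000, seed=0):
    """Equal-tailed Monte Carlo credible interval for PLR.

    Assumption: independent Jeffreys Beta(0.5, 0.5) priors on sensitivity
    and on the false positive rate (1 - specificity).
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(draws):
        sens = rng.betavariate(tp + 0.5, fn + 0.5)
        fpr = rng.betavariate(fp + 0.5, tn + 0.5)
        ratios.append(sens / fpr)
    ratios.sort()
    lo_idx = int((1 - level) / 2 * draws)
    hi_idx = int((1 + level) / 2 * draws) - 1
    return ratios[lo_idx], ratios[hi_idx]

# Hypothetical study: 100 diseased and 200 non-diseased subjects.
plr, nlr = likelihood_ratios(tp=90, fn=10, fp=20, tn=180)
low, high = plr_credible_interval(tp=90, fn=10, fp=20, tn=180)
print(plr, nlr, low, high)
```

The same sampling scheme gives an interval for NLR by replacing the sampled ratio with (1 - sens) / (1 - fpr).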

Resource requirements to accelerate clinical applications of next-generation sequencing and radiomics: workshop commentary and review.

Lyndsay Harris, Lalitha K Shankar, Claire Hildebrandt, Wendy S Rubinstein, Kristofor Langlais, Berkman Sahiner

J Natl Cancer Inst

October 2024

The National Institutes of Health-US Food and Drug Administration Joint Leadership Council formed the Next-Generation Sequencing and Radiomics Working Group to promote the development and validation of innovative next-generation sequencing tests, radiomic tools, and associated data analysis and interpretation enhanced by artificial intelligence and machine learning technologies. A 2-day workshop was held on September 29-30, 2021, to convene members of the scientific community to discuss how to overcome the "ground truth" gap that has frequently been acknowledged as one of the limiting factors impeding high-quality research, development, validation, and regulatory science in these fields. This report provides a summary of the resource gaps identified by the working group and attendees, highlights existing resources and the ways they can potentially be employed to accelerate growth in these fields, and presents opportunities to support next-generation sequencing and radiomic tool development and validation using technologies such as artificial intelligence and machine learning.

AI and machine learning in medical imaging: key points from development to translation.

Ravi K Samala, Karen Drukker, Amita Shukla-Dave, Heang-Ping Chan, Berkman Sahiner

BJR Artif Intell

January 2024

Article Synopsis

Innovation in medical imaging using AI and machine learning requires thorough data collection and algorithm improvements, along with careful evaluation of factors like bias and trustworthiness.
Successfully integrating AI/ML into clinical settings is challenging and hinges on addressing issues in model design, development, regulatory compliance, and stakeholder collaboration.
Tackling these complexities is essential not only for overcoming current obstacles but also for unlocking new opportunities in the field of radiology.

MIDRC-MetricTree: a decision tree-based tool for recommending performance metrics in artificial intelligence-assisted medical image analysis.

Karen Drukker, Berkman Sahiner, Tingting Hu, Grace Hyun Kim, Heather M Whitney

J Med Imaging (Bellingham)

March 2024

Purpose: The Medical Imaging and Data Resource Center (MIDRC) was created to facilitate medical imaging machine learning (ML) research for tasks including early detection, diagnosis, prognosis, and assessment of treatment response related to the coronavirus disease 2019 pandemic and beyond. The purpose of this work was to create a publicly available metrology resource to assist researchers in evaluating the performance of their medical image analysis ML algorithms.

Approach: An interactive decision tree, called MIDRC-MetricTree, has been developed, organized by the type of task that the ML algorithm was trained to perform.

Artificial intelligence in medicine: mitigating risks and maximizing benefits via quality assurance, quality control, and acceptance testing.

Usman Mahmood, Amita Shukla-Dave, Heang-Ping Chan, Karen Drukker, Ravi K Samala, Berkman Sahiner

BJR Artif Intell

January 2024

The adoption of artificial intelligence (AI) tools in medicine poses challenges to existing clinical workflows. This commentary discusses the necessity of context-specific quality assurance (QA), emphasizing the need for robust QA measures with quality control (QC) procedures that encompass (1) acceptance testing (AT) before clinical use, (2) continuous QC monitoring, and (3) adequate user training. The discussion also covers essential components of AT and QA, illustrated with real-world examples.

Decision region analysis for generalizability of artificial intelligence models: estimating model generalizability in the case of cross-reactivity and population shift.

Alexis Burgon, Berkman Sahiner, Nicholas Petrick, Gene Pennello, Kenny H Cha

J Med Imaging (Bellingham)

January 2024

Purpose: Understanding an artificial intelligence (AI) model's ability to generalize to its target population is critical to ensuring the safe and effective usage of AI in medical devices. A traditional generalizability assessment relies on the availability of large, diverse datasets, which are difficult to obtain in many medical imaging applications. We present an approach for enhanced generalizability assessment by examining the decision space beyond the available testing data distribution.

AFE-GAN: Synthesizing Electrocardiograms with Atrial Fibrillation Characteristics Using Generative Adversarial Networks.

Xianglong Wang, Berkman Sahiner, Christopher G Scully, Kenny H Cha

Annu Int Conf IEEE Eng Med Biol Soc

July 2023

Labeled ECG data in diseased states are relatively scarce due to various concerns, including patient privacy and low prevalence. We present the first study of its kind to synthesize atrial fibrillation (AF)-like ECG signals from normal ECG signals using the AFE-GAN, a generative adversarial network. Our AFE-GAN adjusts both beat morphology and rhythm variability when generating the AF-like ECG signals.

Sequestration of imaging studies in MIDRC: stratified sampling to balance demographic characteristics of patients in a multi-institutional data commons.

Natalie Baughan, Heather M Whitney, Karen Drukker, Berkman Sahiner, Tingting Hu

J Med Imaging (Bellingham)

November 2023

Purpose: The Medical Imaging and Data Resource Center (MIDRC) is a multi-institutional effort to accelerate medical imaging machine intelligence research and create a publicly available image repository/commons as well as a sequestered commons for performance evaluation and benchmarking of algorithms. After de-identification, approximately 80% of the medical images and associated metadata become part of the open commons and 20% are sequestered from the open commons. To ensure that both commons are representative of the population available, we introduced a stratified sampling method to balance the demographic characteristics across the two datasets.
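
The 80/20 split described here can be sketched as a stratified random split, in which the sequestered fraction is drawn separately within each demographic stratum so that both sets keep similar proportions. This is a simplified illustration under assumptions: the function name, the record layout, and the single-variable strata are hypothetical, and the MIDRC procedure itself may differ (for example, by stratifying on several variables jointly).

```python
import random
from collections import defaultdict

def stratified_split(records, key, sequestered_fraction=0.2, seed=0):
    """Split records into (open, sequestered) sets, stratum by stratum.

    `key` maps a record to its stratum label (hypothetical, e.g. an age band).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)

    open_set, sequestered = [], []
    for members in strata.values():
        rng.shuffle(members)  # random assignment within the stratum
        n_seq = round(len(members) * sequestered_fraction)
        sequestered.extend(members[:n_seq])
        open_set.extend(members[n_seq:])
    return open_set, sequestered

# Hypothetical patients: 60 in stratum "A" and 40 in stratum "B".
patients = [{"id": i, "group": "A" if i < 60 else "B"} for i in range(100)]
open_set, sequestered = stratified_split(patients, key=lambda p: p["group"])
print(len(open_set), len(sequestered))
```

Splitting at the patient level, as in this sketch, keeps all images from one patient on the same side of the split.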

Methodology for Good Machine Learning with Multi-Omics Data.

Thibaud Coroller, Berkman Sahiner, Anup Amatya, Alexej Gossmann, Konstantinos Karagiannis

Clin Pharmacol Ther

April 2024

Article Synopsis

In 2020, Novartis and the FDA began a 4-year collaboration to explore radio-genomics for predicting factors in HR+/HER2- metastatic breast cancer.
The partnership focuses on harnessing advanced analytics and AI to improve future scientific projects.
The tutorial offers guidelines for conducting multi-omics research, emphasizing communication, data practices, and outlining a four-step process: plan, design, develop, and disseminate.

Longitudinal assessment of demographic representativeness in the Medical Imaging and Data Resource Center open data commons.

Heather M Whitney, Natalie Baughan, Kyle J Myers, Karen Drukker, Judy Gichoya, Berkman Sahiner

J Med Imaging (Bellingham)

November 2023

Purpose: The Medical Imaging and Data Resource Center (MIDRC) open data commons was launched to accelerate the development of artificial intelligence (AI) algorithms to help address the COVID-19 pandemic. The purpose of this study was to quantify longitudinal representativeness of the demographic characteristics of the primary MIDRC dataset compared to the United States general population (US Census) and COVID-19 positive case counts from the Centers for Disease Control and Prevention (CDC).

Approach: The Jensen-Shannon distance (JSD), a measure of similarity of two distributions, was used to longitudinally measure the representativeness of the distribution of (1) all unique patients in the MIDRC data to the 2020 US Census and (2) all unique COVID-19 positive patients in the MIDRC data to the case counts reported by the CDC.
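
For reference, the Jensen-Shannon distance is the square root of the Jensen-Shannon divergence; with base-2 logarithms it ranges from 0 (identical distributions) to 1 (no overlap). The sketch below is a minimal standard-library implementation; the function name and the category counts are hypothetical and are not taken from the study.

```python
import math

def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance (base-2 logs) between two discrete distributions.

    `p` and `q` are counts or probabilities over the same ordered categories.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of categories")
    p_total, q_total = sum(p), sum(q)
    p = [x / p_total for x in p]
    q = [x / q_total for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]  # mixture distribution

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a == 0 contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negative rounding

# Hypothetical counts per demographic category in a dataset and a reference.
dataset_counts = [520, 310, 170]
reference_counts = [600, 250, 150]
print(jensen_shannon_distance(dataset_counts, reference_counts))
```

Computing this value at successive time points, as the abstract describes, gives a curve showing whether the dataset is moving toward or away from the reference population.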

Regulatory considerations for medical imaging AI/ML devices in the United States: concepts and challenges.

Nicholas Petrick, Weijie Chen, Jana G Delfino, Brandon D Gallas, Yanna Kang, Berkman Sahiner

J Med Imaging (Bellingham)

September 2023

Purpose: To introduce developers to medical device regulatory processes and data considerations in artificial intelligence and machine learning (AI/ML) device submissions and to discuss ongoing AI/ML-related regulatory challenges and activities.

Approach: AI/ML technologies are being used in an increasing number of medical imaging devices, and the fast evolution of these technologies presents novel regulatory challenges. We provide AI/ML developers with an introduction to U.

Toward fairness in artificial intelligence for medical image analysis: identification and mitigation of potential biases in the roadmap from data collection to model deployment.

Karen Drukker, Weijie Chen, Judy Gichoya, Nicholas Gruszauskas, Jayashree Kalpathy-Cramer, Berkman Sahiner

J Med Imaging (Bellingham)

November 2023

Purpose: There is increasing interest in developing medical imaging-based machine learning methods, also known as medical imaging artificial intelligence (AI), for the detection, diagnosis, prognosis, and risk assessment of disease with the goal of clinical implementation. These tools are intended to help improve traditional human decision-making in medical imaging. However, biases introduced in the steps toward clinical deployment may impede their intended function, potentially exacerbating inequities. Recognizing and addressing the various sources of bias is essential for algorithmic fairness and trustworthiness and contributes to a just and equitable deployment of AI in medical imaging.

Data drift in medical machine learning: implications and potential remedies.

Berkman Sahiner, Weijie Chen, Ravi K Samala, Nicholas Petrick

Br J Radiol

October 2023

Data drift refers to differences between the data used in training a machine learning (ML) model and that applied to the model in real-world operation. Medical ML systems can be exposed to various forms of data drift, including differences between the data sampled for training and used in clinical operation, differences between medical practices or context of use between training and clinical use, and time-related changes in patient populations, disease patterns, and data acquisition, to name a few. In this article, we first review the terminology used in ML literature related to data drift, define distinct types of drift, and discuss in detail potential causes within the context of medical applications with an emphasis on medical imaging.

Semi-supervised training using cooperative labeling of weakly annotated data for nodule detection in chest CT.

Michael Maynord, M Mehdi Farhangi, Cornelia Fermüller, Yiannis Aloimonos, Gary Levine, Berkman Sahiner

Med Phys

July 2023

Purpose: Machine learning algorithms are best trained with large quantities of accurately annotated samples. While natural scene images can often be labeled relatively cheaply and at large scale, obtaining accurate annotations for medical images is both time consuming and expensive. In this study, we propose a cooperative labeling method that allows us to make use of weakly annotated medical imaging data for the training of a machine learning algorithm.

AAPM task group report 273: Recommendations on best practices for AI and machine learning for computer-aided diagnosis in medical imaging.

Lubomir Hadjiiski, Kenny Cha, Heang-Ping Chan, Karen Drukker, Lia Morra, Berkman Sahiner

Med Phys

February 2023

Rapid advances in artificial intelligence (AI) and machine learning, and specifically in deep learning (DL) techniques, have enabled broad application of these methods in health care. The promise of the DL approach has spurred further interest in computer-aided diagnosis (CAD) development and applications using both "traditional" machine learning methods and newer DL-based methods. We use the term CAD-AI to refer to this expanded clinical decision support environment that uses traditional and DL-based AI methods.

Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees.

Jean Feng, Alexej Gossmann, Berkman Sahiner, Romain Pirracchio

J Am Med Inform Assoc

April 2022

Objective: After deploying a clinical prediction model, subsequently collected data can be used to fine-tune its predictions and adapt to temporal shifts. Because model updating carries risks of over-updating/fitting, we study online methods with performance guarantees.

Materials and Methods: We introduce two procedures for continual recalibration or revision of an underlying prediction model: Bayesian logistic regression (BLR) and a Markov variant that explicitly models distribution shifts (MarBLR).

AI in medical physics: guidelines for publication.

Issam El Naqa, John M Boone, Stanley H Benedict, Mitchell M Goodsitt, Heang-Ping Chan, Berkman Sahiner

Med Phys

September 2021

The Abstract is intended to provide a concise summary of the study and its scientific findings. For AI/ML applications in medical physics, a problem statement and rationale for utilizing these algorithms are necessary while highlighting the novelty of the approach. A brief numerical description of how the data are partitioned into subsets for training of the AI/ML algorithm, validation (including tuning of parameters), and independent testing of algorithm performance is required.

SPIE-AAPM-NCI BreastPathQ challenge: an image analysis challenge for quantitative tumor cellularity assessment in breast cancer histology images following neoadjuvant treatment.

Nicholas Petrick, Shazia Akbar, Kenny H Cha, Sharon Nofech-Mozes, Berkman Sahiner

J Med Imaging (Bellingham)

May 2021

The breast pathology quantitative biomarkers (BreastPathQ) challenge was a grand challenge organized jointly by the International Society for Optics and Photonics (SPIE), the American Association of Physicists in Medicine (AAPM), the U.S. National Cancer Institute (NCI), and the U.

Automatic lung nodule detection in thoracic CT scans using dilated slice-wise convolutions.

M Mehdi Farhangi, Berkman Sahiner, Nicholas Petrick, Aria Pezeshk

Med Phys

July 2021

Purpose: Most state-of-the-art automated medical image analysis methods for volumetric data rely on adaptations of two-dimensional (2D) and three-dimensional (3D) convolutional neural networks (CNNs). In this paper, we develop a novel unified CNN-based model that combines the benefits of 2D and 3D networks for analyzing volumetric medical images.

Methods: In our proposed framework, multiscale contextual information is first extracted from 2D slices inside a volume of interest (VOI).
