SurvdigitizeR: an algorithm for automated survival curve digitization.

BMC Med Res Methodol

Child Health Evaluative Sciences, Peter Gilgan Centre for Research and Learning, The Hospital for Sick Children, Toronto, ON, Canada.

Published: July 2024

Background: Decision analytic models and meta-analyses often rely on survival probabilities that are digitized from published Kaplan-Meier (KM) curves. However, manually extracting these probabilities from KM curves is time-consuming, expensive, and error-prone. We developed an efficient and accurate algorithm that automates extraction of survival probabilities from KM curves.

Methods: The automated digitization algorithm processes images in JPG or PNG format, converts them to the hue, saturation, and lightness (HSL) color space, and uses optical character recognition to detect axis locations and labels. It also uses a k-medoids clustering algorithm to separate multiple overlapping curves on the same figure. To validate performance, we generated survival plots from random time-to-event data with sample sizes of 25, 50, 150, 250, and 1000 individuals split into 1, 2, or 3 treatment arms. We assumed an exponential distribution and applied random censoring. We compared automated digitization with manual digitization performed by well-trained researchers. We calculated the root mean squared error (RMSE) at 100 time points for both methods. The algorithm's performance was also evaluated by Bland-Altman analysis of the agreement between automated and manual digitization on a real-world set of published KM curves.
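The validation design described above can be illustrated with a minimal Python sketch. It is not the SurvdigitizeR code (the package itself is written in R). The function names, the hazard and censoring rates, the exponential censoring mechanism, and the Gaussian noise that stands in for a digitizer's output are all assumptions made for illustration.

```python
import math
import random


def simulate_arm(n, hazard, censor_rate, rng):
    """Return (time, event) pairs: exponential event times with
    independent exponential censoring (an assumed censoring mechanism)."""
    data = []
    for _ in range(n):
        t_event = rng.expovariate(hazard)
        t_censor = rng.expovariate(censor_rate)
        data.append((min(t_event, t_censor), t_event <= t_censor))
    return data


def kaplan_meier(data):
    """Return the KM curve as a list of (time, survival) steps."""
    by_time = sorted(data)
    at_risk = len(by_time)
    surv = 1.0
    steps = [(0.0, 1.0)]
    i = 0
    while i < len(by_time):
        t = by_time[i][0]
        deaths = 0
        removed = 0
        # Group all observations that share the same time.
        while i < len(by_time) and by_time[i][0] == t:
            deaths += by_time[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            steps.append((t, surv))
        at_risk -= removed
    return steps


def survival_at(steps, t):
    """Evaluate the right-continuous KM step function at time t."""
    s = 1.0
    for time, surv in steps:
        if time > t:
            break
        s = surv
    return s


def rmse(truth, estimate):
    """Root mean squared error between two equal-length sequences."""
    squared = sum((a - b) ** 2 for a, b in zip(truth, estimate))
    return math.sqrt(squared / len(truth))


rng = random.Random(0)
steps = kaplan_meier(simulate_arm(150, hazard=0.1, censor_rate=0.03, rng=rng))

# Compare the curves at 100 evenly spaced time points.
t_max = steps[-1][0]
grid = [t_max * k / 99 for k in range(100)]
truth = [survival_at(steps, t) for t in grid]

# Stand-in for digitized values: the true curve plus small noise.
digitized = [min(1.0, max(0.0, s + rng.gauss(0, 0.01))) for s in truth]
error = rmse(truth, digitized)
```

In an actual validation, `digitized` would hold the probabilities read back from the plotted image by the automated or manual method, not simulated noise.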

Results: The automated digitizer accurately identified survival probabilities over time in the simulated KM curves. The average RMSE for automated digitization was 0.012, while manual digitization had an average RMSE of 0.014. The automated digitizer's performance was negatively correlated with the number of curves in a figure and the presence of censoring markers. In real-world scenarios, automated digitization and manual digitization showed very close agreement.
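Agreement between two digitization methods is commonly summarized in a Bland-Altman analysis by the mean difference (bias) and the 95% limits of agreement. The sketch below shows that calculation in Python. The function name is hypothetical and the survival probabilities are made-up illustrative values, not data from the study.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    The limits of agreement are bias +/- 1.96 sample standard deviations
    of the paired differences.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Illustrative values only: survival probabilities read from the same
# curve at the same time points by two methods.
automated = [0.95, 0.90, 0.82, 0.71, 0.60]
manual = [0.94, 0.91, 0.80, 0.72, 0.61]

bias, lower, upper = bland_altman(automated, manual)
```

A bias near zero with narrow limits of agreement indicates that the two methods can be used interchangeably for practical purposes.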

Conclusions: The algorithm streamlines the digitization process and requires minimal user input. It effectively digitized KM curves in simulated and real-world scenarios, demonstrating accuracy comparable to conventional manual digitization. The algorithm has been developed as an open-source R package and as a Shiny application and is available on GitHub: https://github.com/Pechli-Lab/SurvdigitizeR and https://pechlilab.shinyapps.io/SurvdigitizeR/ .


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11245803
DOI: http://dx.doi.org/10.1186/s12874-024-02273-8

Publication Analysis

Top Keywords

manual digitization: 20
automated digitization: 12
digitization: 10
survival probabilities: 8
digitization algorithm: 8
curves figure: 8
average rmse: 8
real-world scenarios: 8
automated: 7
curves: 6

Similar Publications

Investigating muscle architecture in static and dynamic conditions is essential to understand muscle function and muscle adaptations. Muscle architecture analysis, primarily through extended field-of-view ultrasound imaging, offers high reliability at rest but faces limitations during dynamic conditions. Traditional methods often involve "best fitting" straight lines to track muscle fascicles, leading to possible errors, especially with longer fascicles or those with nonlinear paths.


The marginal wells in low-permeability oil fields are characterized by small storage size, scattered distribution, and intermittent production. Constructing large-scale gathering pipelines requires a large investment, so the current production mode relies on single-well tank oil storage, oil tank truck transportation, and manual tank truck scheduling.


Conventional scanned optical coherence tomography (OCT) suffers from the frame rate/resolution tradeoff, whereby increasing image resolution leads to decreases in the maximum achievable frame rate. To overcome this limitation, we propose two variants of machine learning (ML)-based adaptive scanning approaches: one using a ConvLSTM-based sequential prediction model and another leveraging a temporal attention unit (TAU)-based parallel prediction model for scene dynamics prediction. These models are integrated with a kinodynamic path planner based on the clustered traveling salesperson problem to create two versions of ML-based adaptive scanning pipelines.


Automated polynomial formal verification using generalized binary decision diagram patterns.

Philos Trans A Math Phys Eng Sci

January 2025

Institute of Computer Science, University of Bremen, Bremen, Germany.

With ongoing digitization, digital circuits have become increasingly present in everyday life. However, because circuits can be faulty, their verification is a challenging but essential task. In contrast to formal verification techniques, simulation techniques fail to fully guarantee the correctness of a circuit.


Purpose: The spot size of scanned particle beams is of crucial importance for the correct dose delivery and, therefore, plays a significant role in the quality assurance (QA) of pencil beam scanning ion beam therapy.

Materials and Methods: This study compares 5 detector types for measuring the spot size of proton and carbon ion beams: radiochromic film, ionization chamber (IC) array, flat panel detector, multiwire chamber, and IC.

Results: Variations of up to 30% were found between detectors, underscoring the impact of detector choice on QA outcomes.

