Background: Two common issues may arise in population-based breast cancer (BC) survival studies: I) missing values in a variable that predicts survival, such as "Stage" at diagnosis, and II) small sample sizes in certain patient subsets caused by the "class imbalance problem". Both issues call for data modeling/simulation methods.

Methods: We present a procedure, ModGraProDep, based on graphical modeling (GM) of a dataset to overcome these two issues. The performance of the models derived from ModGraProDep is compared with a set of frequently used classification and machine learning algorithms (Missing Data Problem) and with oversampling algorithms (Synthetic Data Simulation). For the Missing Data Problem we assessed two scenarios: missing completely at random (MCAR) and missing not at random (MNAR). Two validated BC datasets provided by the cancer registries of Girona and Tarragona (northeastern Spain) were used.
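The two missingness scenarios can be illustrated with a minimal sketch. This is not the study's code: the function names, the toy records, and all masking rates are assumptions made for illustration. Under MCAR every record has the same chance of losing its "Stage" value; under MNAR the chance depends on the unobserved value itself.

```python
import random

def mask_mcar(records, field, rate, seed=0):
    """Blank `field` with the same probability for every record (MCAR)."""
    rng = random.Random(seed)
    out = []
    for rec in records:
        rec = dict(rec)  # copy so the input records stay untouched
        if rng.random() < rate:
            rec[field] = None
        out.append(rec)
    return out

def mask_mnar(records, field, rate_by_value, seed=0):
    """Blank `field` with a probability that depends on its own value (MNAR)."""
    rng = random.Random(seed)
    out = []
    for rec in records:
        rec = dict(rec)
        # Values absent from rate_by_value are never masked.
        if rng.random() < rate_by_value.get(rec[field], 0.0):
            rec[field] = None
        out.append(rec)
    return out

# Toy records; the variables and values are invented for illustration.
records = [{"age_group": "50-69", "stage": s} for s in ["I", "II", "III", "IV"] * 250]

mcar = mask_mcar(records, "stage", rate=0.2)
# Hypothetical rates: advanced stages are assumed to go unrecorded more often.
mnar = mask_mnar(records, "stage", {"I": 0.05, "II": 0.05, "III": 0.3, "IV": 0.5})
```

Masking known values in this way leaves the true "Stage" available for scoring, which is what allows competing imputation methods to be compared on the same records.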

Results: In both the MCAR and MNAR scenarios, all comparison models showed poorer prediction performance than three GM models: the saturated one (GM.SAT) and two with penalty factors on the partial likelihood (GM.K1 and GM.TEST). However, GM.SAT predictions could lead to unreliable conclusions in BC survival analysis. Simulating a "synthetic" dataset from GM.SAT could be the worst strategy, whereas using the remaining GM models could be better than oversampling.
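For categorical variables, a saturated model imposes no conditional-independence constraints, so predicting one variable from the others amounts to reading the full cross-tabulation. The sketch below shows that idea only; it is not the ModGraProDep implementation, and the function names, variable names ("age", "grade"), and toy records are all hypothetical.

```python
from collections import Counter, defaultdict

def fit_saturated(records, target, predictors):
    """Count target values within every observed combination of predictors."""
    table = defaultdict(Counter)
    for rec in records:
        if rec[target] is not None:  # only complete records inform the table
            key = tuple(rec[p] for p in predictors)
            table[key][rec[target]] += 1
    return table

def predict(table, rec, predictors):
    """Return the most frequent target value in the record's cell, if any."""
    key = tuple(rec[p] for p in predictors)
    counts = table.get(key)
    if not counts:
        return None  # predictor combination never observed in training
    return counts.most_common(1)[0][0]

# Toy training data; values are invented for illustration.
train = [
    {"age": "<50", "grade": "1", "stage": "I"},
    {"age": "<50", "grade": "1", "stage": "I"},
    {"age": "<50", "grade": "1", "stage": "II"},
    {"age": "70+", "grade": "3", "stage": "IV"},
    {"age": "70+", "grade": "3", "stage": None},
]
predictors = ["age", "grade"]
table = fit_saturated(train, "stage", predictors)
guess = predict(table, {"age": "<50", "grade": "1"}, predictors)
```

In this sketch a predictor combination that never occurs in the training data yields no prediction at all, and a cell holding a single record is predicted from that one record.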

Conclusion: Our results suggest using the GM procedure presented here for one-variable imputation/prediction of missing data and for simulating "synthetic" BC survival datasets. The "synthetic" datasets derived from GMs could also be used in clinical applications of cancer survival data, such as predictive risk analysis.

Source
http://dx.doi.org/10.1016/j.artmed.2020.101875
