Background: Implementing phenotype algorithms requires phenotype engineers to interpret human-readable algorithms and translate their descriptions (text and flowcharts) into computable phenotypes, a process that can be labor-intensive and error-prone. Reducing this implementation effort requires algorithms that are portable across sites.
Methods: We conducted a retrospective analysis of phenotype algorithms developed in the Electronic Medical Records and Genomics (eMERGE) network and identified the common customization tasks required for implementation. A novel scoring system was developed to quantify portability from three aspects: Knowledge conversion, clause Interpretation, and Programming (KIP). Tasks were grouped into twenty representative categories. For each category, experienced phenotype engineers were asked to estimate the average time spent and to evaluate the time savings enabled by a common data model (CDM), specifically the Observational Medical Outcomes Partnership (OMOP) model.
Results: A total of 485 distinct clauses (phenotype criteria) were identified from 55 phenotype algorithms, corresponding to 1153 customization tasks. Of these, 25 tasks are not phenotype-specific, 46 relate to interpretation, 613 to knowledge conversion, and 469 to programming. Each aspect is assigned a score between 0 and 2 (0 for easy, 1 for moderate, and 2 for difficult portability), yielding a total KIP score between 0 and 6. The average clause-wise KIP score is 1.37 ± 1.38. Specifically, the average knowledge (K) score is 0.64 ± 0.66, the interpretation (I) score is 0.33 ± 0.55, and the programming (P) score is 0.40 ± 0.64. Five percent of the categories can be completed within one hour (median estimate), whereas 70% take from days to months to complete. The OMOP model can assist with vocabulary mapping tasks.
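The scoring arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `ClauseScore`, the helper `mean_kip`, and the example clause scores are hypothetical and do not come from the study. The sketch encodes only what the abstract states, namely that each aspect is scored 0, 1, or 2 and that a clause's KIP score is the sum of its three aspect scores.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ClauseScore:
    """Per-clause portability scores (hypothetical container, not from the study)."""

    knowledge: int       # K: 0 easy, 1 moderate, 2 difficult
    interpretation: int  # I: same 0-2 scale
    programming: int     # P: same 0-2 scale

    def __post_init__(self):
        # Reject anything outside the 0-2 scale given in the abstract.
        for name in ("knowledge", "interpretation", "programming"):
            value = getattr(self, name)
            if value not in (0, 1, 2):
                raise ValueError(f"{name} score must be 0, 1, or 2, got {value!r}")

    @property
    def kip(self) -> int:
        """Total KIP score for the clause, in the range 0 to 6."""
        return self.knowledge + self.interpretation + self.programming


def mean_kip(clauses):
    """Average clause-wise KIP score over an iterable of ClauseScore objects."""
    return mean(clause.kip for clause in clauses)


# Made-up example clauses, for illustration only.
example_clauses = [
    ClauseScore(knowledge=1, interpretation=0, programming=0),
    ClauseScore(knowledge=2, interpretation=1, programming=1),
    ClauseScore(knowledge=0, interpretation=0, programming=0),
]
print(mean_kip(example_clauses))
```

Because the total is a plain sum, the mean KIP score equals the sum of the mean K, I, and P scores, which agrees with the reported figures (0.64 + 0.33 + 0.40 = 1.37).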
Conclusion: This study presents firsthand knowledge of the substantial implementation effort in phenotyping and introduces a novel metric (KIP) that measures the portability of phenotype algorithms, quantifying this effort across the eMERGE Network. Phenotype developers are encouraged to analyze and optimize portability with regard to knowledge, interpretation, and programming. CDMs can be used to improve portability for some 'knowledge-oriented' tasks.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6894517 | PMC |
http://dx.doi.org/10.1016/j.jbi.2019.103293 | DOI Listing |