Background: Phenotypes characterize the clinical manifestations of diseases and provide important information for diagnosis. Constructing phenotype knowledge graphs for diseases is therefore valuable to the development of artificial intelligence in medicine. However, the phenotype knowledge graphs in current knowledge bases such as Wikidata and DBpedia are coarse-grained: they consider only the core concepts of phenotypes and neglect the details (attributes) associated with them.
Objective: To characterize the details of disease phenotypes described in clinical guidelines, we proposed a fine-grained semantic information model named PhenoSSU (semantic structured unit of phenotypes).
Methods: PhenoSSU is an "entity-attribute-value" model that aims to capture the full semantic information underlying phenotype descriptions through a series of attributes and values. A total of 193 clinical guidelines for infectious diseases from Wikipedia were selected as the study corpus, and 12 attributes from SNOMED-CT were introduced into the PhenoSSU model based on the co-occurrences of phenotype concepts and attribute values. The expressive power of the model was evaluated by analyzing whether PhenoSSU instances could capture the full semantics of the corresponding phenotype descriptions. To construct fine-grained phenotype knowledge graphs automatically, we developed a hybrid strategy that first recognizes phenotype concepts with the MetaMap tool and then predicts the attribute values of those phenotypes with machine learning classifiers.
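As a rough illustration of the "entity-attribute-value" idea, the sketch below stores one phenotype concept together with its attribute-value pairs and flattens it into triples. The class layout, the `to_triples` helper, and the attribute names (`severity`, `duration`) are hypothetical choices made for this example; the 12 SNOMED-CT attributes used in the study are not listed in this abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PhenoSSU:
    """One phenotype concept plus its attribute-value pairs (hypothetical layout)."""
    concept: str
    attributes: Dict[str, str] = field(default_factory=dict)

    def to_triples(self) -> List[Tuple[str, str, str]]:
        """Flatten the unit into (entity, attribute, value) triples, sorted by attribute name."""
        return [
            (self.concept, name, value)
            for name, value in sorted(self.attributes.items())
        ]


# Example description: "severe headache lasting several days".
# The attribute names below are illustrative assumptions, not the paper's schema.
unit = PhenoSSU("headache", {"severity": "severe", "duration": "several days"})
triples = unit.to_triples()
```

A coarse-grained graph would keep only the concept `headache`; the fine-grained unit also retains the attribute-value pairs attached to it.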
Results: Fine-grained phenotype knowledge graphs of 193 infectious diseases were manually constructed with the BRAT annotation tool. A total of 4020 PhenoSSU instances were annotated in these knowledge graphs, and 3757 of them (89.5%) were found to capture the full semantics of the corresponding phenotype descriptions in the clinical guidelines. By comparison, other information models, such as the Clinical Element Model and the HL7 Fast Healthcare Interoperability Resources (FHIR) model, captured the full semantics of only 48.4% (2034/4020) and 21.8% (914/4020) of these descriptions, respectively. The hybrid strategy achieved an F1-score of 0.732 for the subtask of phenotype concept recognition and an average weighted accuracy of 0.776 for the subtask of attribute value prediction.
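For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. The F1-score follows the standard definition as the harmonic mean of precision and recall. The `weighted_accuracy` function is an assumption: it averages per-attribute accuracies weighted by the number of instances, which may differ from the exact weighting used in the study. All counts in the example are made up for illustration.

```python
from typing import List, Tuple


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_accuracy(per_attribute: List[Tuple[float, int]]) -> float:
    """Average of per-attribute accuracies, weighted by instance count.

    per_attribute holds (accuracy, n_instances) pairs; this weighting
    scheme is an assumption made for the example.
    """
    total = sum(n for _, n in per_attribute)
    return sum(acc * n for acc, n in per_attribute) / total


# Illustrative counts only, not taken from the study.
example_f1 = f1_score(tp=8, fp=2, fn=2)
example_acc = weighted_accuracy([(0.5, 10), (1.0, 30)])
```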
Conclusions: PhenoSSU is an effective information model for the precise representation of phenotype knowledge in clinical guidelines, and machine learning can be used to improve the efficiency of constructing PhenoSSU-based knowledge graphs. Our work may help shift the focus of medical knowledge engineering from a coarse-grained level to a more fine-grained one.
Source | Link |
---|---|
PMC | http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8277235 |
DOI | http://dx.doi.org/10.2196/26892 |