Publications by authors named "Rainer Winnenburg"

Biopharmaceutical industry R&D, and indeed other life sciences R&D such as biomedical, environmental, agricultural and food production, is becoming increasingly data-driven and can significantly improve its efficiency and effectiveness by implementing the FAIR (findable, accessible, interoperable, reusable) guiding principles for scientific data management and stewardship. By so doing, the plethora of new and powerful analytical tools such as artificial intelligence and machine learning will be able, automatically and at scale, to access the data from which they learn, and on which they thrive. FAIR is a fundamental enabler for digital transformation.

Gene Ontology (GO) enrichment analysis is ubiquitously used for interpreting high throughput molecular data and generating hypotheses about underlying biological phenomena of experiments. However, the two building blocks of this analysis - the ontology and the annotations - evolve rapidly. We used gene signatures derived from 104 disease analyses to systematically evaluate how enrichment analysis results were affected by evolution of the GO over a decade.
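To illustrate why enrichment results can shift when annotations change, here is a minimal sketch of an over-representation test based on the hypergeometric distribution. The abstract does not say which test the authors used, so the choice of test, the function name `enrichment_p_value`, and the gene identifiers are all assumptions made for illustration.

```python
from math import comb


def enrichment_p_value(overlap, term_size, signature_size, background_size):
    """One-sided hypergeometric p-value: the probability of drawing at least
    `overlap` annotated genes when `signature_size` genes are sampled from a
    background of `background_size` genes, `term_size` of which are annotated."""
    upper = min(term_size, signature_size)
    favourable = sum(
        comb(term_size, i) * comb(background_size - term_size, signature_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(background_size, signature_size)


def term_p_value(signature, annotated, background):
    """Convenience wrapper that works directly on gene sets."""
    annotated = annotated & background
    signature = signature & background
    return enrichment_p_value(
        len(signature & annotated), len(annotated), len(signature), len(background)
    )


# Hypothetical toy data: one GO term annotated in two different releases.
background = {f"g{i}" for i in range(1, 21)}
signature = {"g1", "g2", "g3", "g4"}
annotations_old = {"g1", "g2", "g9"}
annotations_new = {"g1", "g2", "g3", "g4", "g9"}

p_old = term_p_value(signature, annotations_old, background)
p_new = term_p_value(signature, annotations_new, background)
print(f"old release: p = {p_old:.4f}; new release: p = {p_new:.4f}")
```

In this toy case the same signature looks far more enriched under the newer annotations, which is the kind of drift a longitudinal evaluation would need to quantify.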

Measuring the usage of informatics resources such as software tools and databases is essential to quantifying their impact, value and return on investment. We have developed a publicly available dataset of informatics resource publications and their citation network, along with an associated metric (u-Index) to measure informatics resources' impact over time. Our dataset differentiates the context in which citations occur to distinguish between 'awareness' and 'usage', and uses a citing universe of open access publications to derive citation counts for quantifying impact.

Interoperability among medication classification systems is known to be limited. We investigated the mapping of the Established Pharmacologic Classes (EPCs) to SNOMED CT. We compared lexical and instance-based methods to an expert-reviewed reference standard to evaluate contributions of these methods.

Value sets (VSs) used in electronic clinical quality measures are lists of codes from standard terminologies ("extensional" VSs), whose purpose ("intension") is not always explicitly stated. We elicited the intension for the 09/01/2014 release of extensional medication value sets by comparison to drug classes from the October 2014 release of RxClass. Value sets matched drug classes if they shared common ingredients, as evidenced by Jaccard similarity score.
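The matching step described above can be sketched as follows. This is a minimal illustration, assuming each value set and each drug class has already been reduced to a set of ingredient names; the helper `best_matching_class` and the sample data are hypothetical and are not taken from RxClass or the published study.

```python
def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B|; taken as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matching_class(value_set, drug_classes):
    """Return the (class name, score) pair with the highest Jaccard similarity."""
    return max(
        ((name, jaccard(value_set, members)) for name, members in drug_classes.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical ingredient sets, for illustration only.
value_set = {"atorvastatin", "simvastatin", "pravastatin"}
drug_classes = {
    "HMG-CoA reductase inhibitors": {
        "atorvastatin", "simvastatin", "pravastatin", "rosuvastatin",
    },
    "Beta blockers": {"metoprolol", "atenolol"},
}

best_name, best_score = best_matching_class(value_set, drug_classes)
print(best_name, round(best_score, 2))
```

A score threshold for declaring a match would have to be chosen separately; the excerpt does not state one.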

Objective: Improving mechanisms to detect adverse drug reactions (ADRs) is key to strengthening post-marketing drug safety surveillance. Signal detection is presently unimodal, relying on a single information source. Multimodal signal detection is based on jointly analyzing multiple information sources.

Drug design is expensive, time-consuming and becoming increasingly complicated. Computational approaches for inferring potentially new purposes of existing drugs, referred to as drug repositioning, play an increasingly important part in current pharmaceutical studies. Here, we first summarize recent developments in computational drug repositioning and introduce the utilized data sources.

Background: Identification of associations between marketed drugs and adverse events from the biomedical literature assists drug safety monitoring efforts. Assessing the significance of such literature-derived associations and determining the granularity at which they should be captured remains a challenge. Here, we assess how defining a selection of adverse event terms from MeSH, based on information content, can improve the detection of adverse events for drugs and drug classes.

Background and Objective: Several studies have demonstrated the ability to detect adverse events potentially related to multiple drug exposure via data mining. However, the number of putative associations produced by such computational approaches is typically large, making experimental validation difficult. We theorized that those potential associations for which there is evidence from multiple complementary sources are more likely to be true, and explored this idea using a published database of drug-drug-adverse event associations derived from electronic health records (EHRs).

Phenotypes have gained increased prominence in the clinical and biological domains owing to their application in numerous areas such as the discovery of disease genes and drug targets, phylogenetics and pharmacogenomics. Phenotypes, defined as observable characteristics of organisms, can be seen as one of the bridges that lead to a translation of experimental findings into clinical applications and thereby support 'bench to bedside' efforts. However, to build this translational bridge, a common and universal understanding of phenotypes is required that goes beyond domain-specific definitions.

Background: Traditional approaches to pharmacovigilance center on the signal detection from spontaneous reports, e.g., the U.

Objectives: To extend the coverage of phenotypes in SNOMED CT through post-coordination.

Methods: We identify frequent modifiers in terms from the Human Phenotype Ontology (HPO), which we associate with templates for post-coordinated expressions in SNOMED CT.

Results: We identified 176 modifiers, created 12 templates, and generated 1,617 post-coordinated expressions.

Objective: Adverse drug events (ADEs) are undesired harmful effects resulting from use of a medication, and occur in 30% of hospitalized patients. The authors have developed a data-mining method for systematic, automated detection of ADEs from electronic medical records.

Materials and Methods: This method uses the text from 9.

Background: While the association between a drug and an adverse event (ADE) is generally detected at the level of individual drugs, ADEs are often discussed at the class level, i.e., at the level of pharmacologic classes (e.

Background: The NDF-RT (National Drug File Reference Terminology) is an ontology, which describes drugs and their properties and supports computerized physician order entry systems. NDF-RT's classes are mostly specified using only necessary conditions and lack sufficient conditions, making its use limited until recently, when asserted drug-class relations were added. The addition of these asserted drug-class relations presents an opportunity to compare them with drug-class relations that can be inferred using the properties of drugs and drug classes in NDF-RT.

Undetected adverse drug reactions (ADRs) pose a major burden on the health system. Data mining methodologies designed to identify signals of novel ADRs are of deep importance for drug safety surveillance. The development and evaluation of these methodologies requires proper reference benchmarks.

Background: The objective of this study is to develop a framework for assessing the consistency of drug classes across sources, such as MeSH and ATC. Our framework integrates and contrasts lexical and instance-based ontology alignment techniques. Moreover, we propose metrics for assessing not only equivalence relations, but also inclusion relations among drug classes.

Objective: To assess the quality of value sets in clinical quality measures, both individually and as a population of value sets.

Materials and Methods: The concepts from a given value set are expected to be rooted by one or a few ancestor concepts, and the value set is expected to contain all the descendants of its root concepts and only these descendants. (1) We assessed the completeness and correctness of individual value sets by comparison to the extension derived from their roots.
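The comparison in step (1) can be sketched as a set difference against the roots' descendants. This is a minimal illustration under several assumptions: the terminology is available as an acyclic parent-to-children map, the roots have already been chosen, and the roots themselves count as part of the extension. The function names and the toy hierarchy are hypothetical.

```python
from collections import deque


def descendants(children, root):
    """Return every concept reachable below `root` in an acyclic hierarchy."""
    seen = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def assess_value_set(value_set, roots, children):
    """Compare a value set with the extension derived from its roots.

    'missing' lists concepts in the extension but absent from the value set
    (a completeness problem); 'extra' lists value set members that fall
    outside the extension (a correctness problem)."""
    extension = set(roots)
    for root in roots:
        extension |= descendants(children, root)
    return {"missing": extension - value_set, "extra": value_set - extension}


# Hypothetical toy hierarchy (parent -> children), for illustration only.
children = {
    "diabetes": ["type 1 diabetes", "type 2 diabetes"],
    "type 2 diabetes": ["type 2 diabetes with nephropathy"],
}
value_set = {"diabetes", "type 1 diabetes", "type 2 diabetes", "hypertension"}

report = assess_value_set(value_set, roots={"diabetes"}, children=children)
print(report)
```

Here the value set omits one descendant of its root and includes one concept that no root covers, so both checks flag a problem.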

The Semantic Web provides a framework for the integration of resources on the web, which facilitates information integration and interoperability. RDF is the main representation format for Linked Open Data (LOD). However, datasets are not always made available in RDF by their producers and the Semantic Web community has had to convert some of these datasets to RDF in order for these datasets to participate in the LOD cloud.

The Value Set Authority Center (VSAC) at the National Library of Medicine (NLM) provides downloadable access to all official versions of vocabulary value sets contained in the Clinical Quality Measures (CQMs) used in the certification criteria for electronic health record systems ("Meaningful Use" incentive program). Each value set consists of the numerical values (codes) and human-readable names (descriptions), drawn from standard vocabularies such as LOINC, RxNorm and SNOMED CT®, that are used to define clinical data elements used in clinical quality measures (e.g.

Objectives: To investigate the extent to which pharmacoepidemiologic groupings are homogeneous in terms of clinical properties.

Methods: In our analysis, we classified drug subgroups from the pharmacoepidemiologic Anatomical Therapeutic Chemical (ATC) classification system based on clinical drug properties. We established mappings from ATC fifth level drug entities to drug property annotations in the National Drug File Reference Terminology (NDF-RT), including therapeutic categories, mechanisms of action, and physiologic effects.

Objective: To develop methods for assessing the validity, consistency and currency of value sets for clinical quality measures, in order to support the developers of quality measures in which such value sets are used.

Methods: We assessed the well-formedness of the codes (in a given code system), the existence and currency of the codes in the corresponding code system, using the UMLS and RxNorm terminology services. We also investigated the overlap among value sets using the Jaccard similarity measure.

Much of the information on the Cytochrome P450 enzymes (CYPs) is spread across literature and the internet. Aggregating knowledge about CYPs into one database makes the search more efficient. Text mining on 57 CYPs and drugs led to a mass of papers, which were screened manually for facts about metabolism, SNPs and their effects on drug degradation.

Membrane proteins are important for many processes in the cell and are major drug targets. The increasing number of high-resolution structures available makes it possible, for the first time, to characterize local structural and functional motifs in alpha-helical transmembrane proteins. MeMotif (http://projects.

Background: The automated retrieval and integration of information about protein point mutations in combination with structure, domain and interaction data from literature and databases promises to be a valuable approach to study structure-function relationships in biomedical data sets.

Results: We developed a rule- and regular expression-based protein point mutation retrieval pipeline for PubMed abstracts, which shows an F-measure of 87% for the mutation retrieval task on a benchmark dataset. In order to link mutations to their proteins, we utilize a named entity recognition algorithm for the identification of gene names co-occurring in the abstract, and establish links based on sequence checks.
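A regular-expression approach to mutation retrieval, followed by a sequence check, might look like the sketch below. It is not the authors' pipeline: the pattern covers only simple substitutions written as `A123T` or `Ala123Thr`, and the names `find_mutations` and `matches_sequence`, the sample sentence, and the toy sequence are assumptions made for illustration.

```python
import re
from typing import NamedTuple

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
AA1 = "".join(sorted(THREE_TO_ONE.values()))  # one-letter codes
AA3 = "|".join(THREE_TO_ONE)                  # three-letter codes

# Wild-type residue, position, mutant residue, in either notation.
MUTATION = re.compile(
    rf"\b(?:(?P<w1>[{AA1}])(?P<p1>\d+)(?P<m1>[{AA1}])"
    rf"|(?P<w3>{AA3})(?P<p3>\d+)(?P<m3>{AA3}))\b"
)


class Mutation(NamedTuple):
    wild_type: str
    position: int
    mutant: str


def find_mutations(text):
    """Extract point mutations from text, normalised to one-letter codes."""
    found = []
    for m in MUTATION.finditer(text):
        if m.group("w1"):
            found.append(Mutation(m.group("w1"), int(m.group("p1")), m.group("m1")))
        else:
            found.append(Mutation(
                THREE_TO_ONE[m.group("w3")], int(m.group("p3")), THREE_TO_ONE[m.group("m3")]
            ))
    return found


def matches_sequence(mutation, sequence):
    """Sequence check: is the wild-type residue at the stated 1-based position?"""
    index = mutation.position - 1
    return 0 <= index < len(sequence) and sequence[index] == mutation.wild_type


# Hypothetical abstract sentence and candidate protein sequence.
text = "The A5T and Gly2Asp substitutions reduced binding, unlike L3P."
sequence = "MGKLAV"

mutations = find_mutations(text)
linked = [m for m in mutations if matches_sequence(m, sequence)]
print(linked)
```

A mention whose wild-type residue does not match the candidate sequence, such as `L3P` here, is left unlinked rather than being assigned to that protein.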