Publications by authors named "Jan Gewehr"

Background: Clinical routine data derived from university hospitals hold immense value for health-related research on large cohorts. However, using secondary data for hypothesis testing requires adherence to scientific, legal (such as the General Data Protection Regulation and federal and state data protection legislation), technical, and administrative requirements. This process is intricate, time-consuming, and error-prone.

Introduction: Medical research studies that involve electronic data capture of sensitive data about human subjects need to manage medical and identifying participant data securely. To protect the identity of data subjects, an independent trusted third party should be responsible for pseudonymization and management of the identifying data.

Methods: We have developed a web-based integrated solution that combines REDCap as an electronic data capture system with the trusted third party software tools of the University Medicine Greifswald, which provides study personnel with a single user interface for both clinical data entry and management of identities, pseudonyms and informed consents.
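
The separation of duties described above can be illustrated with a minimal sketch. It is not the REDCap integration or the Greifswald tooling from the article: the class `PseudonymService`, its methods, and the `PSN-` prefix are hypothetical names chosen for illustration. The sketch only shows the general idea that a trusted third party keeps the identity-to-pseudonym mapping, so the clinical data store never sees identifying data.

```python
import secrets
from typing import Dict


class PseudonymService:
    """Hypothetical stand-in for a trusted third party (illustration only)."""

    def __init__(self) -> None:
        # Both mappings live only inside the trusted third party.
        self._by_identity: Dict[str, str] = {}
        self._by_pseudonym: Dict[str, str] = {}

    def pseudonym_for(self, identity: str) -> str:
        """Return a stable random pseudonym for an identifying record."""
        if identity not in self._by_identity:
            pseudonym = "PSN-" + secrets.token_hex(8)
            # Regenerate in the unlikely event of a collision.
            while pseudonym in self._by_pseudonym:
                pseudonym = "PSN-" + secrets.token_hex(8)
            self._by_identity[identity] = pseudonym
            self._by_pseudonym[pseudonym] = identity
        return self._by_identity[identity]

    def resolve(self, pseudonym: str) -> str:
        """Re-identify a participant; only the trusted third party may do this."""
        return self._by_pseudonym[pseudonym]


# The clinical data store is keyed by pseudonym and holds no identifying data.
service = PseudonymService()
clinical_data: Dict[str, dict] = {}
psn = service.pseudonym_for("Doe, Jane, 1980-01-01")
clinical_data[psn] = {"systolic_bp": 128}
```

The pseudonym is random rather than derived from the identity, so it cannot be recomputed by anyone who lacks access to the mapping.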

Introduction: Acute respiratory distress syndrome (ARDS) is a highly relevant entity in critical care, with mortality rates of 40%. Despite extensive scientific efforts, outcome-relevant therapeutic measures are still insufficiently practised at the bedside. Thus, there is a clear need for early diagnosis and adequate therapy in ARDS to lower mortality and the incidence of multiple organ failure.

The digitization of health records and cross-institutional data sharing are necessary preconditions for improving clinical research and patient care. The SMITH project unites several university hospitals and medical faculties in order to provide medical informatics solutions for health data integration and cross-institutional communication. In this paper, we focus on requirements elicitation and management for extracting clinical data from heterogeneous subsystems and data integration based on eHealth standards such as HL7 FHIR and IHE profiles.

Introduction: This article is part of the Focus Theme of Methods of Information in Medicine on the German Medical Informatics Initiative. "Smart Medical Information Technology for Healthcare (SMITH)" is one of four consortia funded by the German Medical Informatics Initiative (MI-I) to create an alliance of universities, university hospitals, research institutions and IT companies. SMITH's goals are to establish Data Integration Centers (DICs) at each SMITH partner hospital and to implement use cases which demonstrate the usefulness of the approach.

In protein research, structural classifications of protein domains provided by databases such as SCOP play an important role. However, because such databases have to be curated and prepared carefully, they are updated at most a few times per year, and in the meantime newly entered PDB structures cannot be used where a structural classification is required. The Automated Protein Structure Identification (AutoPSI) database delivers predicted SCOP classifications for several thousand as yet unclassified PDB entries as well as millions of UniProt sequences in an automated fashion.

Motivation: The sequence patterns contained in the available motif and hidden Markov model (HMM) databases are a valuable source of information for protein sequence annotation. For structure prediction and fold recognition purposes, we computed mappings from such pattern databases to the protein domain hierarchy given by the ASTRAL compendium and applied them to the prediction of SCOP classifications. Our aim is to make highly confident predictions also for non-trivial cases if possible and abstain from a prediction otherwise, and thus to provide a method that can be used as a first step in a pipeline of prediction methods.

Unlabelled: Vorolign, a fast and flexible structural alignment method for two or more protein structures, is introduced. The method aligns protein structures using double dynamic programming and measures the similarity of two residues based on the evolutionary conservation of their corresponding Voronoi contacts in the protein structure. This similarity function allows protein structures to be aligned even where structural flexibility exists.
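
As a rough illustration of alignment by dynamic programming with a pluggable residue similarity function, the following sketch implements plain single-level global alignment (Needleman-Wunsch). It is deliberately simpler than the double dynamic programming used by Vorolign and does not compute Voronoi contacts; the function `global_align`, its `sim` parameter, and the linear gap penalty are assumptions made for this example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pair = Tuple[Optional[int], Optional[int]]


def global_align(
    a: Sequence,
    b: Sequence,
    sim: Callable[[object, object], float],
    gap: float = -1.0,
) -> Tuple[float, List[Pair]]:
    """Single-level global alignment; `sim` scores a pair of residues."""
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Borders are filled cumulatively so the traceback comparisons are exact.
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sim(a[i - 1], b[j - 1]),  # align pair
                score[i - 1][j] + gap,                          # gap in b
                score[i][j - 1] + gap,                          # gap in a
            )
    # Trace back from the bottom-right corner to recover aligned index pairs.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sim(a[i - 1], b[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


def identity_sim(x: object, y: object) -> float:
    """Toy similarity: +1 for identical symbols, -1 otherwise."""
    return 1.0 if x == y else -1.0


total, alignment = global_align("ACGT", "AGT", identity_sim)
```

In a structure-based setting, `sim` would be replaced by a function that compares the structural environments of two residues, which is where a contact-based measure such as the one described above would plug in.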

Unlabelled: Given the growing amount of biological data, data mining methods have become an integral part of bioinformatics research. Unfortunately, standard data mining tools are often not sufficiently equipped for handling raw data such as e.g.

Motivation: The prediction of protein domains is a crucial task for functional classification, homology-based structure prediction and structural genomics. In this paper, we present the SSEP-Domain protein domain prediction approach, which is based on the application of secondary structure element alignment (SSEA) and profile-profile alignment (PPA) in combination with InterPro pattern searches. SSEA allows rapid screening for potential domain regions while PPA provides us with the necessary specificity for selecting significant hits.

Sequence-structure alignments are a common means of protein structure prediction in the fields of fold recognition and homology modeling, and there is a broad variety of programs that provide such alignments based on sequence similarity, secondary structure or contact potentials. Nevertheless, finding the best sequence-structure alignment in a pool of alignments remains a difficult problem. QUASAR (quality of sequence-structure alignments ranking) provides a unifying framework for scoring sequence-structure alignments that helps to find well-performing combinations of well-known and custom-made scoring schemes.

Recognition of protein-DNA binding sites in genomic sequences is a crucial step for discovering biological functions of genomic sequences. Explosive growth in availability of sequence information has resulted in a demand for binding site detection methods with high specificity. The motivation of the work presented here is to address this demand by a systematic approach based on Maximum Likelihood Estimation.
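
The abstract does not describe the estimator in detail, so the following is only a generic sketch of how maximum likelihood estimation is commonly applied to binding sites, not the method of the article. Given aligned example sites, the maximum likelihood estimate of each position's base distribution is the observed relative frequency; candidate windows are then scored by a log-likelihood ratio against a uniform background. The function names, the uniform background of 0.25, and the optional pseudocount are assumptions made for this example.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

ALPHABET = "ACGT"


def position_probabilities(sites: List[str], pseudocount: float = 0.0) -> List[Dict[str, float]]:
    """Per-position base probabilities from aligned sites.

    With pseudocount=0.0 these are the maximum likelihood estimates
    (relative frequencies); a positive pseudocount avoids zero probabilities.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    probs = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        probs.append({base: (counts[base] + pseudocount) / total for base in ALPHABET})
    return probs


def log_likelihood_ratio(window: str, probs: List[Dict[str, float]], background: float = 0.25) -> float:
    """Log ratio of the site model likelihood to a uniform background."""
    return sum(math.log(probs[i][base] / background) for i, base in enumerate(window))


def best_site(sequence: str, probs: List[Dict[str, float]]) -> Tuple[float, int]:
    """Return (score, start index) of the highest-scoring window."""
    width = len(probs)
    return max(
        (log_likelihood_ratio(sequence[i:i + width], probs), i)
        for i in range(len(sequence) - width + 1)
    )


sites = ["ACG", "ACG", "ACT", "TCG"]
ml_probs = position_probabilities(sites)                       # pure ML estimates
smoothed = position_probabilities(sites, pseudocount=1.0)      # safe for scanning
score, start = best_site("TTACGTT", smoothed)
```

Scanning uses the smoothed model because a pure maximum likelihood estimate assigns probability zero to any base not seen at a position, which makes the log ratio undefined.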
