Publications by authors named "Raisaro J"

Predictive modeling holds great potential for clinical decision-making, yet its effectiveness can be hindered by inherent data imbalances in clinical datasets. This study investigates the utility of synthetic data for improving the performance of predictive modeling on realistic small imbalanced clinical datasets. We compared various synthetic data generation methods, including Generative Adversarial Networks, Normalizing Flows, and Variational Autoencoders, to the standard baselines for correcting class underrepresentation on four clinical datasets.

Clinical notes contain valuable information for research and for monitoring quality of care. Named Entity Recognition (NER) is the process of identifying relevant pieces of information, such as diagnoses, treatments, and side effects, and bringing them into a more structured form.

Generative machine learning models such as Generative Adversarial Networks (GANs) have been shown to be especially successful in generating realistic synthetic data in image and tabular domains. However, it has been shown that such generative models, as well as the generated synthetic data, can reveal information contained in their privacy-sensitive training data, and therefore must be carefully evaluated before being used. The gold standard method through which such privacy leakage can be estimated is simulating membership inference attacks (MIAs), in which an attacker attempts to learn whether a given sample was part of the training data of a generative model.
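
As a rough illustration of what simulating a membership inference attack can look like, the sketch below implements a simple distance-based attack against a synthetic dataset. It is not the method evaluated in the article: the function names, the toy records, and the threshold of 0.5 are all assumptions chosen for illustration. The attacker guesses that a candidate record was in the training data when it lies unusually close to some synthetic record.

```python
import math


def nearest_distance(record, synthetic):
    """Euclidean distance from `record` to its closest synthetic record."""
    return min(math.dist(record, s) for s in synthetic)


def infer_membership(candidates, synthetic, threshold):
    """Guess 'member' when a candidate lies within `threshold` of the synthetic data."""
    return [nearest_distance(c, synthetic) < threshold for c in candidates]


def attack_accuracy(guesses, truth):
    """Fraction of candidates whose membership was guessed correctly."""
    return sum(g == t for g, t in zip(guesses, truth)) / len(truth)


# Toy, made-up data (assumption): three synthetic records and four candidates,
# of which the first two are taken to be true training members.
synthetic = [(1.0, 1.0), (2.0, 2.1), (5.0, 5.0)]
candidates = [(1.0, 1.1), (2.0, 2.0), (8.0, 8.0), (3.5, 3.5)]
truth = [True, True, False, False]

guesses = infer_membership(candidates, synthetic, threshold=0.5)
accuracy = attack_accuracy(guesses, truth)
print(guesses, accuracy)
```

An attack accuracy well above the share of members among the candidates would suggest that the synthetic data leak information about individual training records; in practice the threshold would have to be calibrated rather than fixed by hand.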

Article Synopsis
  • The article with DOI 10.2196/47254 contained inaccuracies that needed to be addressed.
  • The correction aims to clarify specific findings or data presented in the original article.
  • This adjustment ensures the integrity and accuracy of the research shared with the public.

Background: Reference intervals (RIs) for patient test results are in standard use across many medical disciplines, allowing physicians to identify measurements indicating potentially pathological states with relative ease. The process of inferring cohort-specific RIs is, however, often ignored because of the high costs and cumbersome efforts associated with it. Sophisticated analysis tools are required to automatically infer relevant and locally specific RIs directly from routine laboratory data.
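
For orientation, a reference interval is conventionally reported as the central 95% of values in a reference population, that is, the 2.5th to 97.5th percentiles. The sketch below computes that nonparametric interval; it assumes the input already comes from a suitable reference cohort, which is exactly the assumption that routine laboratory data (a mix of healthy and pathological results) do not satisfy, so it should not be read as the inference method the article describes. The function names and the toy measurements are hypothetical.

```python
def percentile(sorted_values, p):
    """Linearly interpolated percentile (p in [0, 100]) of pre-sorted data."""
    if not sorted_values:
        raise ValueError("percentile() needs at least one value")
    rank = (len(sorted_values) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def reference_interval(values, coverage=95.0):
    """Central `coverage` percent of the values, returned as (low, high)."""
    ordered = sorted(values)
    tail = (100.0 - coverage) / 2.0
    return percentile(ordered, tail), percentile(ordered, 100.0 - tail)


# Toy measurements (assumption): the integers 1..101 in arbitrary units.
measurements = list(range(1, 102))
low, high = reference_interval(measurements)
print(low, high)
```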

Hospital-acquired pressure injuries are a challenge for healthcare systems, and the nurse's role is essential in their prevention. The first step is risk assessment. The development of advanced data-driven methods based on machine learning techniques can improve risk assessment through the use of routinely collected data.

The Swiss Personalized Health Network (SPHN) is a government-funded initiative developing federated infrastructures for the responsible and efficient secondary use of health data for research purposes, in compliance with the FAIR principles (Findable, Accessible, Interoperable and Reusable). We built a common standard infrastructure with a fit-for-purpose strategy to bring together health-related data and to ease the work of both data providers, who can supply data in a standard manner, and researchers, who benefit from the enhanced quality of the collected data. As a result, the SPHN Resource Description Framework (RDF) schema was implemented together with a data ecosystem that encompasses data integration, validation tools, analysis helpers, training and documentation for representing health metadata and data in a consistent manner and reaching nationwide data interoperability goals.

Background: Medical coding is the process that converts clinical documentation into standard medical codes. Codes are used for several key purposes in a hospital (eg, insurance reimbursement and performance analysis); therefore, their optimization is crucial. With the rapid growth of natural language processing technologies, several solutions based on artificial intelligence have been proposed to aid in medical coding by automatically suggesting relevant codes for clinical documents.

In this study, we propose a unified evaluation framework for systematically assessing the utility-privacy trade-off of synthetic data generation (SDG) models. These SDG models are adapted to deal with longitudinal or tabular data stemming from electronic health records (EHR) containing both discrete and numeric features. Our evaluation framework considers different data sharing scenarios and attacker models.

Using real-world evidence in biomedical research, an indispensable complement to clinical trials, requires access to large quantities of patient data that are typically held separately by multiple healthcare institutions. We propose FAMHE, a novel federated analytics system that, based on multiparty homomorphic encryption (MHE), enables privacy-preserving analyses of distributed datasets by yielding highly accurate results without revealing any intermediate data. We demonstrate the applicability of FAMHE to essential biomedical analysis tasks, including Kaplan-Meier survival analysis in oncology and genome-wide association studies in medical genetics.
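
To make the Kaplan-Meier task concrete, here is a minimal plaintext sketch of the estimator on a single, centralized dataset. It does not involve multiparty homomorphic encryption and is not FAMHE's implementation; the function name and the toy follow-up data are assumptions for illustration only. At each distinct event time, the survival probability is multiplied by one minus the fraction of at-risk patients who experienced the event.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival)] at each distinct time with an observed event.

    durations[i] is the follow-up time of patient i; events[i] is True when the
    event was observed and False when the patient was censored at that time.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        observed = sum(1 for d, e in zip(durations, events) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        # Patients with an event or censoring at time t leave the risk set.
        at_risk -= leaving
    return curve


# Toy follow-up data (assumption): five patients, two of them censored.
durations = [1, 2, 2, 3, 4]
events = [True, True, False, True, False]
curve = kaplan_meier(durations, events)
print(curve)
```

In a federated setting, the per-time counts of events and at-risk patients are the quantities that each site would need to contribute, which is why they are computed explicitly here.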

Background: Interoperability is a well-known challenge in medical informatics. Current trends in interoperability have moved from a data model technocentric approach to sustainable semantics, formal descriptive languages, and processes. Despite many initiatives and investments for decades, the interoperability challenge remains crucial.

The growing number of health-data breaches, the use of genomic databases for law enforcement purposes and the lack of transparency of personal genomics companies are raising unprecedented privacy concerns. To enable a secure exploration of genomic datasets with controlled and transparent data access, we propose a citizen-centric approach that combines cryptographic privacy-preserving technologies, such as homomorphic encryption and secure multi-party computation, with the auditability of blockchains. Our open-source implementation supports queries on the encrypted genomic data of hundreds of thousands of individuals, with minimal overhead.

Multisite medical data sharing is critical in modern clinical practice and medical research. The challenge is to conduct data sharing that preserves individual privacy and data utility. The shortcomings of traditional privacy-enhancing technologies mean that institutions rely upon bespoke data sharing contracts.

Global pandemics call for large and diverse healthcare data to study various risk factors, treatment options, and disease progression patterns. Despite the enormous efforts of many large data consortium initiatives, the scientific community still lacks a secure and privacy-preserving infrastructure to support auditable data sharing and facilitate automated and legally compliant federated analysis on an international scale. Existing health informatics systems do not incorporate the latest progress in modern security and federated machine learning algorithms, which are poised to offer solutions.

Precision medicine aims to tailor prevention and treatment to individual data. Although different markers can be used (e.g.

Personalised medicine can improve both public and individual health by providing targeted preventative and therapeutic healthcare. However, patient health data must be shared between institutions and across jurisdictions for the benefits of personalised medicine to be realised. Whilst data protection, privacy, and research ethics laws protect patient confidentiality and safety, they may also impede multisite research, particularly across jurisdictions.

MedCo is the first operational system that makes sensitive medical data available for research in a simple, privacy-conscious and secure way. It enables a consortium of clinical sites to collectively protect their data and to securely share them with investigators, without single points of failure. In this short paper, we report on our ongoing effort for the operational deployment of MedCo within the context of the Swiss Personalized Health Network (SPHN) for the Swiss Molecular Tumor Board.

Medical studies are usually time-consuming, cumbersome and extremely costly to perform, and for exploratory research, their results are also difficult to predict a priori. This is particularly the case for rare diseases, for which finding enough patients is difficult and usually requires international-scale research. In this case, the heterogeneity of data-protection regulations makes the data sharing process particularly hard.

One major obstacle to developing precision medicine to its full potential is the privacy concerns related to genomic-data sharing. Even though the academic community has proposed many solutions to protect genomic privacy, these so far have not been adopted in practice, mainly due to their impact on the data utility. We introduce GenoShare, a framework that enables individual citizens to understand and quantify the risks of revealing genome-related privacy-sensitive attributes (e.

The increasing number of health-data breaches is creating a complicated environment for medical-data sharing and, consequently, for medical progress. Therefore, the development of new solutions that can reassure clinical sites by enabling privacy-preserving sharing of sensitive medical data in compliance with stringent regulations (e.g.

Re-use of patients' health records can provide tremendous benefits for clinical research. Yet, when researchers need to access sensitive/identifying data, such as genomic data, in order to compile cohorts of well-characterized patients for specific studies, privacy and security concerns represent major obstacles that make such a procedure extremely difficult if not impossible. In this paper, we address the challenge of designing and deploying in a real operational setting an efficient privacy-preserving explorer for genetic cohorts.

The biomedical community is lagging in the adoption of cloud computing for the management of medical data. The primary obstacles are concerns about privacy and security. In this paper, we explore the feasibility of using advanced privacy-enhancing technologies in order to enable the sharing of sensitive clinical data in a public cloud.

Purpose: Protecting patient privacy is a major obstacle to the implementation of genomic-based medicine. Emerging privacy-enhancing technologies can become key enablers for managing sensitive genetic data. We studied physicians' attitudes toward these technologies in order to derive insights that might foster their future adoption for clinical care.

Background: Cloud computing is becoming the preferred solution for efficiently dealing with the increasing amount of genomic data. Yet, outsourcing the storage and processing of sensitive information, such as genomic data, comes with important concerns related to privacy and security. This calls for new sophisticated techniques that ensure data protection from untrusted cloud providers and that still enable researchers to obtain useful information.
