PLoS Comput Biol
January 2025
Machine learning has become a powerful tool for computational analysis in the biomedical sciences, with its effectiveness significantly enhanced by integrating domain-specific knowledge. This integration has given rise to informed machine learning, in contrast to studies that lack domain knowledge and treat all variables equally (uninformed machine learning). While the application of informed machine learning to bioinformatics and health informatics datasets has become more seamless, the likelihood of errors has also increased.
Biomedical datasets are the mainstays of computational biology and health informatics projects, and can be found on multiple data platforms online or obtained from wet-lab biologists and physicians. The quality and the trustworthiness of these datasets, however, can sometimes be poor, producing bad results in turn, which can harm patients and data subjects. To address this problem, policy-makers, researchers, and consortia have proposed diverse regulations, guidelines, and scores to assess the quality and increase the reliability of datasets.
PLoS Comput Biol
October 2024
The electrocardiogram (ECG) is a powerful tool to measure the electrical activity of the heart, and the analysis of its data can be useful to assess the patient's health. In particular, the computational analysis of electrocardiogram data, also called ECG signal processing, can reveal specific patterns or heart cycle trends which otherwise would be unnoticeable by medical experts. When performing ECG signal processing, however, it is easy to make mistakes and generate inflated, overoptimistic, or misleading results, which can lead to wrong diagnoses or prognoses and, in turn, could even contribute to bad medical decisions, damaging the health of the patient.
PeerJ Comput Sci
September 2024
Electroencephalography (EEG) is a medical engineering technique aimed at recording the electric activity of the human brain. Brain signals derived from an EEG device can be processed and analyzed through computers by using digital signal processing, computational statistics, and machine learning techniques, which can lead to scientifically relevant results and outcomes about how the brain works. In the last decades, the spread of EEG devices and the higher availability of EEG data, of computational resources, and of software packages for electroencephalography analysis have made EEG signal processing easier and faster to perform for any researcher worldwide.
Pangenomics is a relatively new scientific field which investigates the union of all the genomes of a clade. The word pan means everything in ancient Greek; the term pangenomics originally regarded genomes of bacteria and was later intended to refer to human genomes as well. Modern bioinformatics offers several tools to analyze pangenomics data, paving the way to an emerging field that we can call computational pangenomics.
Congenital long QT syndrome (LQTS) represents a disorder of myocardial repolarization characterized by a prolongation of QTc interval on ECG, which can degenerate into fast polymorphic ventricular arrhythmias. The typical symptoms of LQTS are syncope and palpitations, mainly triggered by adrenergic stimuli, but it can also manifest with cardiac arrest. At least 17 genotypes have been associated with LQTS, with a specific genotype-phenotype relationship described for the three most common subtypes (LQTS1, -2, and -3).
Brugada Syndrome is an "inherited" channelopathy characterized by a predisposition to syncope and sudden death. It typically presents in young adults but is also known to affect the pediatric population, even if the prevalence is low compared to the adult population. The diagnostic ECG pattern shows coved-type ST-segment elevation in the right precordial leads, occurring spontaneously or after provocative drug tests with IV administration of Class I antiarrhythmic drugs.
Background: Systemic inflammatory response syndrome (SIRS) and sepsis are the most common causes of in-hospital death. However, the characteristics associated with improvement in patients' conditions during the ICU stay have not been fully elucidated for each population, nor have the possible differences between the two.
Goal: The aim of this study is to highlight the differences between the prognostic clinical features for the survival of patients diagnosed with SIRS and those of patients diagnosed with sepsis by using a multi-variable predictive modeling approach with a reduced set of easily available measurements collected at the admission to the intensive care unit (ICU).
PeerJ Comput Sci
February 2024
Diabetes is a metabolic disorder that affects more than 420 million people worldwide, and it is caused by the presence of a high level of sugar in the blood for a long period. Diabetes can have serious long-term health consequences, such as cardiovascular diseases, strokes, chronic kidney diseases, foot ulcers, retinopathy, and others. Even if common, this disease is not easy to spot, because it often comes with no symptoms.
Glioblastoma multiforme (GM) is a malignant tumor of the central nervous system considered to be highly aggressive and often carrying a terrible survival prognosis. An accurate prognosis is therefore pivotal for deciding a good treatment plan for patients. In this context, computational intelligence applied to data of electronic health records (EHRs) of patients diagnosed with this disease can be useful to predict the patients' survival time.
Fuzzy logic is a useful tool to describe and represent biological or medical scenarios, where often states and outcomes are not only completely true or completely false, but rather partially true or partially false. Despite its usefulness and spread, fuzzy logic modeling might easily be done in the wrong way, especially by beginners and inexperienced researchers, who might overlook some important aspects or might make common mistakes. Malpractices and pitfalls, in turn, can lead to wrong or overoptimistic, inflated results, with negative consequences to the biomedical research community trying to comprehend a particular phenomenon, or even to patients suffering from the investigated disease.
In the VISION trial, [¹⁷⁷Lu]Lu-PSMA-617 (¹⁷⁷Lu-PSMA-617) plus protocol-permitted standard of care significantly improved overall survival and radiographic progression-free survival compared with standard of care alone in patients with prostate-specific membrane antigen-positive metastatic castration-resistant prostate cancer. This VISION dosimetry substudy quantified absorbed doses of ¹⁷⁷Lu-PSMA-617 in the kidneys and other organs. Participants were a separate cohort of 30 nonrandomized patients receiving standard of care plus ¹⁷⁷Lu-PSMA-617 at 7.
Objective: Seek new candidate prognostic markers for neuroblastoma outcome, relapse or progression.
Materials and Methods: In this multicentre and retrospective study, Random Forests coupled with recursive feature elimination techniques were applied to electronic records (55 clinical features) of 3034 neuroblastoma patients. To assess model performance and feature importance, the dataset was split into a training set (80%) and a test set (20%).
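As a rough illustration of the 80%/20% split mentioned above, the following is a minimal Python sketch of a plain random split over sample indices. The function name `train_test_split_indices`, the fixed seed, and the absence of stratification are assumptions made for illustration only; the abstract does not state how the split was actually performed.

```python
import random


def train_test_split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and test parts.

    Hypothetical helper: a plain random split, not necessarily the
    procedure used in the study.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)      # reproducible shuffle
    cut = int(n_samples * train_fraction)     # size of the training set
    return indices[:cut], indices[cut:]


# 3034 patients, as reported in the abstract
train_idx, test_idx = train_test_split_indices(3034, train_fraction=0.8)
print(len(train_idx), len(test_idx))
```

Keeping the test indices untouched until the final evaluation is what makes the reported performance an estimate on unseen data.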
Some scientific studies involve huge amounts of bioinformatics data that cannot be analyzed on personal computers usually employed by researchers for day-to-day activities but rather necessitate effective computational infrastructures that can work in a distributed way. For this purpose, distributed computing systems have become useful tools to analyze large amounts of bioinformatics data and to generate relevant results on virtual environments, where software can be executed for hours or even days without affecting the personal computer or laptop of a researcher. Even if distributed computing resources have become pivotal in multiple bioinformatics laboratories, often researchers and students use them in the wrong ways, making mistakes that can cause the distributed computers to underperform or that can even generate wrong outcomes.
Data are the most important elements of bioinformatics: computational analysis of bioinformatics data, in fact, can help researchers infer new knowledge about biology, chemistry, biophysics, and sometimes even medicine, influencing treatments and therapies for patients. Bioinformatics and high-throughput biological data coming from different sources can be even more helpful, because each of these different data chunks can provide alternative, complementary information about a specific biological phenomenon, similar to multiple photos of the same subject taken from different angles. In this context, the integration of bioinformatics and high-throughput biological data plays a pivotal role in running a successful bioinformatics study.
Even if assessing binary classifications is a common task in scientific research, no consensus on a single statistic summarizing the confusion matrix has been reached so far. In recent studies, we demonstrated the advantages of the Matthews correlation coefficient (MCC) over other popular rates such as cross-entropy error, F score, accuracy, balanced accuracy, bookmaker informedness, diagnostic odds ratio, Brier score, and Cohen's kappa. In this study, we compared the MCC to two other statistics: prevalence threshold (PT), frequently used in obstetrics and gynecology, and Fowlkes-Mallows index, a metric employed in fuzzy logic and drug discovery.
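For readers unfamiliar with the three statistics named above, here is a minimal Python sketch that computes each of them from the four cells of a confusion matrix, using their standard textbook definitions. The function names and the example counts are assumptions chosen for illustration and are not taken from the study.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient, in [-1, +1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom


def fowlkes_mallows(tp, fp, fn):
    """Fowlkes-Mallows index: geometric mean of precision and recall."""
    if tp == 0:
        return 0.0
    return tp / math.sqrt((tp + fp) * (tp + fn))


def prevalence_threshold(tp, fp, tn, fn):
    """Prevalence threshold, (sqrt(TPR*FPR) - FPR) / (TPR - FPR).

    Computed in the algebraically equivalent form
    sqrt(FPR) / (sqrt(TPR) + sqrt(FPR)), which avoids 0/0 when TPR == FPR.
    """
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    if tpr == 0 and fpr == 0:
        return float("nan")
    return math.sqrt(fpr) / (math.sqrt(tpr) + math.sqrt(fpr))


# Hypothetical confusion matrix: 100 positives, 100 negatives
tp, fn, fp, tn = 81, 19, 1, 99
print(mcc(tp, fp, tn, fn))
print(fowlkes_mallows(tp, fp, fn))
print(prevalence_threshold(tp, fp, tn, fn))
```

Note that the Fowlkes-Mallows index ignores true negatives, whereas the MCC uses all four cells of the confusion matrix.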
Neuroblastoma is a childhood neurological tumor which affects hundreds of thousands of children worldwide, and information about its prognosis can be pivotal for patients, their families, and clinicians. One of the main goals in the related bioinformatics analyses is to provide stable genetic signatures able to include genes whose expression levels can be effective to predict the prognosis of the patients. In this study, we collected the prognostic signatures for neuroblastoma published in the biomedical literature, and noticed that the most frequent genes present among them were three: AHCY, DPYSL3, and NME1.
Bioinformatics has become a key aspect of the biomedical research programmes of many hospitals' scientific centres, and the establishment of bioinformatics facilities within hospitals has become a common practice worldwide. Bioinformaticians working in these facilities provide computational biology support to medical doctors and principal investigators who deal daily with patient data to analyze. These bioinformatics analysts, although pivotal, usually do not receive formal training for this job.
Binary classification is a common task for which machine learning and computational statistics are used, and the area under the receiver operating characteristic curve (ROC AUC) has become the common standard metric to evaluate binary classifications in most scientific fields. The ROC curve has true positive rate (also called sensitivity or recall) on the y axis and false positive rate on the x axis, and the ROC AUC can range from 0 (worst result) to 1 (perfect result). The ROC AUC, however, has several flaws and drawbacks.
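The ROC AUC described above can be computed without drawing the curve, because it equals the probability that a randomly chosen positive receives a higher score than a randomly chosen negative (ties counted as one half). The following minimal Python sketch uses that pairwise formulation; the function name `roc_auc` and the toy labels and scores are assumptions for illustration.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 ground-truth classes
    scores: iterable of predicted scores (higher means more positive)
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    # Each (positive, negative) pair scores 1 if ranked correctly, 0.5 if tied
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 3 of the 4 pairs are ranked correctly: 0.75
```

This quadratic-time version is meant for clarity; a rank-based computation is preferable for large datasets.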
Medical imaging is a great asset for modern medicine, since it allows physicians to spatially interrogate a disease site, resulting in precise intervention for diagnosis and treatment, and to observe particular aspects of patients' conditions that otherwise would not be noticeable. Computational analysis of medical images, moreover, can allow the discovery of disease patterns and correlations among cohorts of patients with the same disease, thus suggesting common causes or providing useful information for better therapies and cures. Machine learning and deep learning applied to medical images, in particular, have produced new, unprecedented results that can pave the way to advanced frontiers of medical discoveries.
Background: Plexiform neurofibromas are benign neoplasms that develop in 20-50% of children with neurofibromatosis type 1 (NF1). Selumetinib was approved as treatment for symptomatic and inoperable plexiform neurofibromas. Subclinical left ventricular ejection fraction reduction is a less common effect of selumetinib.
Applying computational statistics or machine learning methods to data is a key component of many scientific studies, in any field, but alone might not be sufficient to generate robust and reliable outcomes and results. Before applying any discovery method, preprocessing steps are necessary to prepare the data for the computational analysis. In this framework, data cleaning and feature engineering are key pillars of any scientific study involving data analysis, and should be adequately designed and performed from the first phases of the project.
In this multicenter retrospective study we aimed to evaluate the outcome of cardiac involvement in children affected by multisystem inflammatory syndrome (MIS-C), assessed through cardiac magnetic resonance (CMR). Children referred to three Italian tertiary pediatric centers between February 2020 and November 2021 with a diagnosis of MIS-C, who underwent CMR during a follow-up visit, were enrolled. Demographic, clinical, laboratory, treatment, and outcome data were collected.
Functional enrichment analysis or pathway enrichment analysis (PEA) is a bioinformatics technique which identifies the most over-represented biological pathways in a list of genes compared to those that would be associated with them by chance. These biological functions are found on bioinformatics annotated databases such as The Gene Ontology or KEGG; the more abundant pathways are identified through statistical techniques such as Fisher's exact test. All PEA tools require a list of genes as input.
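To make the over-representation idea concrete, the following minimal Python sketch computes the one-sided (right-tail) Fisher's exact test p-value for a single pathway, which is equivalent to a hypergeometric tail probability. The function name, the parameter names, and the example counts are assumptions for illustration; real PEA tools additionally correct for testing many pathways at once.

```python
from math import comb


def enrichment_p_value(hits, list_size, pathway_size, background_size):
    """One-sided Fisher's exact test p-value for pathway over-representation.

    hits:            genes in the input list annotated to the pathway
    list_size:       number of genes in the input list
    pathway_size:    genes annotated to the pathway in the background
    background_size: total number of background genes
    Returns P(X >= hits) under the hypergeometric null distribution.
    """
    upper = min(list_size, pathway_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(pathway_size, i)
        * comb(background_size - pathway_size, list_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 8 of 50 input genes fall in a 200-gene pathway,
# against a background of 20000 genes
print(enrichment_p_value(8, 50, 200, 20000))
```

A small p-value indicates that the pathway appears in the gene list more often than expected by chance.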