Background: Single Amino Acid Polymorphisms (SAPs), or nonsynonymous Single Nucleotide Variants (nsSNVs), are the most common genetic variations. They result from missense mutations, where a single base pair substitution changes the genetic code in such a way that the triplet of bases (codon) at a given position codes for a different amino acid. Since genetic mutations sometimes cause genetic diseases, it is important to understand and predict which variations are harmful and which ones are neutral (not causing changes in the phenotype).
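As a small illustration of the missense mechanism described above, the sketch below applies a single-base substitution to a codon and checks whether the encoded amino acid changes. The codon table is deliberately partial (three codons of the standard genetic code), and the function name `classify_substitution` is a hypothetical helper introduced only for this example.

```python
# Partial codon table (standard genetic code); only the codons used below.
CODON_TABLE = {"GAG": "Glu", "GAA": "Glu", "GTG": "Val"}


def classify_substitution(codon, position, new_base):
    """Apply a single-base substitution and label the result.

    position is the 0-based index of the base inside the codon.
    Returns (mutant_codon, label).
    """
    mutant = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if before == after:
        label = "synonymous"
    else:
        label = f"missense ({before} -> {after})"
    return mutant, label


# A>T at the second base changes the amino acid (a SAP / nsSNV).
print(classify_substitution("GAG", 1, "T"))
# G>A at the third base leaves the amino acid unchanged.
print(classify_substitution("GAG", 2, "A"))
```

A full implementation would need the complete 64-codon table, including stop codons, which this sketch omits.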
Support Care Cancer
February 2023
Introduction: Using GWAS data derived from a large collaborative trial (ECOG-5103), we identified a cluster of 267 SNPs which predicted CIPN in treatment-naive patients, as reported in Part 1 of this study. To assess the functional and pathological implications of this set, we identified collective gene signatures and evaluated the informational value of those signatures in defining CIPN's pathogenesis.
Methods: In Part 1, we analyzed GWAS data derived from ECOG-5103, first identifying those SNPs that were most strongly associated with CIPN using Fisher's ratio.
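The abstract does not state which variant of Fisher's ratio was used, so the following is only a minimal sketch under the common two-class definition: the squared difference of the class means divided by the sum of the class variances. The feature names and genotype values are hypothetical toy data, not values from ECOG-5103.

```python
from statistics import mean, pvariance


def fisher_ratio(group_a, group_b):
    """Squared mean difference over summed (population) variances."""
    spread = pvariance(group_a) + pvariance(group_b)
    if spread == 0:
        # No within-class variation: perfectly separated or identical.
        return 0.0 if mean(group_a) == mean(group_b) else float("inf")
    return (mean(group_a) - mean(group_b)) ** 2 / spread


def rank_by_fisher_ratio(features, labels):
    """Rank features (name -> values) by Fisher's ratio between label 1 and 0."""
    scores = {}
    for name, values in features.items():
        cases = [v for v, y in zip(values, labels) if y == 1]
        controls = [v for v, y in zip(values, labels) if y == 0]
        scores[name] = fisher_ratio(cases, controls)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: allele counts (0, 1, 2) for eight samples.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "snp_a": [2, 2, 1, 1, 0, 0, 1, 1],  # class means differ
    "snp_b": [0, 1, 2, 1, 1, 0, 2, 1],  # class means are equal
}
ranking = rank_by_fisher_ratio(features, labels)
print(ranking)
```

Features at the top of the ranking separate the two groups most strongly relative to their within-group spread.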
Background: Chemotherapy-induced peripheral neuropathy (CIPN) is a common toxicity of taxanes for which there is no effective intervention. Genomic CIPN risk determination has yielded promising, but inconsistent results. The present study assessed the utility of a collective SNP cluster identified using novel analytics to describe taxane-associated CIPN risk.
Noise is a basic ingredient of data: observed data are always contaminated by unwanted deviations (noise), which, in the case of overdetermined systems (with more data than model parameters), cause the corresponding linear system of equations to have only an imperfect solution.
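To make the effect of noise on an overdetermined system concrete, here is a minimal sketch that fits a straight line (two parameters) to four data points by ordinary least squares. Least squares is a standard choice assumed for illustration, not necessarily the approach of the paper, and the noisy values are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x via the 2x2 normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are identical")
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n
    return a, b


def residual_norm(xs, ys, a, b):
    """Euclidean norm of the misfit between data and fitted line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) ** 0.5


xs = [0.0, 1.0, 2.0, 3.0]
clean = [1.0, 3.0, 5.0, 7.0]  # exactly y = 1 + 2x
noisy = [1.1, 2.9, 5.2, 6.8]  # same line with hypothetical noise added

for name, ys in (("clean", clean), ("noisy", noisy)):
    a, b = fit_line(xs, ys)
    print(name, a, b, residual_norm(xs, ys, a, b))
```

With noise-free data the four equations are consistent and the residual vanishes; with noisy data no line satisfies all four equations, so the best fit leaves a nonzero residual.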
Comput Biol Med
October 2022
Background: Understanding the transcriptomic response to SARS-CoV-2 infection is of the utmost importance for designing diagnostic tools that predict the severity of the infection.
Methods: We have performed a deep sampling analysis of the viral transcriptomic data oriented towards drug repositioning. The methodology uses different samplers, and its basic principle is biological invariance: the pathways altered by the disease should be independent of the algorithm used to unravel them.
Big data in health care is a fast-growing field and a new paradigm that is transforming case-based studies to large-scale, data-driven research. As big data is dependent on the advancement of new data standards, technology, and relevant research, the future development of big data applications holds foreseeable promise in the modern day health care revolution. Enormously large, rapidly growing collections of biomedical omics-data (genomics, proteomics, transcriptomics, metabolomics, glycomics, etc.
Given the high prevalence of imported diseases in immigrant populations, the need to establish screening programs that allow their early diagnosis and treatment has been postulated. We present a mathematical model based on machine learning methodologies to contribute to the design of screening programs in this population. We conducted a retrospective cross-sectional screening program of imported diseases in all immigrant patients who attended the Tropical Medicine Unit between January 2009 and December 2016.
The prediction of the dynamics of the COVID-19 outbreak and the corresponding needs of the health care system (COVID-19 patients' admissions, the number of critically ill patients, need for intensive care units, etc.) is based on the combination of a limited growth model (Verhulst model) and a short-term predictive model that allows predictions to be made for the following day. In both cases, the uncertainty analysis of the prediction is performed, i.
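For readers unfamiliar with the Verhulst (logistic) limited growth model named above, the sketch below evaluates its closed-form solution N(t) = K / (1 + ((K - N0) / N0) e^{-rt}). The parameter values are hypothetical, and the study's fitting procedure, short-term model, and uncertainty analysis are not reproduced here.

```python
import math


def verhulst(t, capacity, rate, initial):
    """Logistic curve N(t) = K / (1 + ((K - N0) / N0) * exp(-r * t)).

    capacity (K): final size of the outbreak; rate (r): growth rate;
    initial (N0): number of cases at t = 0.
    """
    return capacity / (1.0 + (capacity - initial) / initial * math.exp(-rate * t))


# Hypothetical parameters, chosen only to show the shape of the curve.
K, r, N0 = 1000.0, 0.2, 10.0
for day in (0, 10, 20, 30, 60):
    print(day, round(verhulst(day, K, r, N0), 1))
```

The curve starts at N0, grows almost exponentially at first, and saturates at the carrying capacity K, which is why the model is described as one of limited growth.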
An outbreak of the novel COVID-19 virus occurred from February 2020 onwards in almost all European countries, including Spain. This study covers the correlation found between weather variables (Maximum Temperature, Minimum Temperature, Mean Temperature, Atmospheric Pressure, Daily Rainfall, Daily Sun hours) and coronavirus propagation in Spain. A strong relationship is found when correlating the virus spread with the mean temperature, minimum temperature, and atmospheric pressure in different Spanish provinces.
Glucocorticoid (GC) resistance complicates the treatment of ~10-20% of children with nephrotic syndrome (NS), yet the molecular basis for resistance remains unclear. We used RNAseq analysis and in silico algorithm-based approaches on peripheral blood leukocytes from 12 children both at initial NS presentation and after ~7 weeks of GC therapy to identify a 12-gene panel able to differentiate steroid resistant NS (SRNS) from steroid-sensitive NS (SSNS). Among this panel, subsequent validation and analyses of one biologically relevant candidate, sulfatase 2 (SULF2), in up to a total of 66 children, revealed that both SULF2 leukocyte expression and plasma arylsulfatase activity Post/Pre therapy ratios were greater in SSNS vs.
Artificial intelligence methods may help in unveiling information that is hidden in high-dimensional oncological data. Flow cytometry studies of haematological malignancies provide quantitative data with the potential to be used for the construction of response biomarkers. Many computational methods from the bioinformatics toolbox can be applied to these data, but they have not been exploited to their full potential in leukaemias, specifically in the case of childhood B-cell Acute Lymphoblastic Leukaemia.
We discuss the use of regularized linear discriminant analysis (LDA) as a model reduction technique combined with particle swarm optimization (PSO) in protein tertiary structure prediction, followed by structure refinement based on singular value decomposition (SVD) and PSO. The algorithm presented in this paper belongs to the category of template-based modeling. The algorithm performs a preselection of protein templates before constructing a lower dimensional subspace via a regularized LDA.
We present the analysis of the defective genetic pathways of Late-Onset Alzheimer's Disease (LOAD) compared to Mild Cognitive Impairment (MCI) and Healthy Controls (HC) using different sampling methodologies. These algorithms sample the uncertainty space that is intrinsic to any kind of highly underdetermined phenotype prediction problem, by looking for the minimum-scale signatures (header genes) corresponding to different random holdouts. The biological pathways can be identified by performing a posterior analysis of these signatures, established via cross-validation holdouts, and plugging the set of most frequently sampled genes into different ontological platforms.
The complexity of orphan diseases, which are those that do not have an effective treatment, together with the high dimensionality of the genetic data used for their analysis and the high degree of uncertainty in the understanding of the mechanisms and genetic pathways involved in their development, motivates the use of advanced artificial intelligence techniques and in-depth knowledge of molecular biology, which is crucial for finding plausible solutions in drug design, including drug repositioning. In particular, we show that the use of robust deep sampling methodologies of the altered genetics serves to obtain meaningful results and dramatically decreases the cost of research and development in drug design, with a very positive influence on the use of precision medicine and on patient outcomes. The target-centric approach and the use of strong prior hypotheses that are not matched against reality (disease genetic data) are undoubtedly the cause of the high number of drug design failures and attrition rates.
Background: Phenotype prediction problems are usually considered ill-posed, as the number of samples is very limited with respect to the scrutinized genetic probes. This fact complicates the sampling of the defective genetic pathways due to the high number of possible discriminatory genetic networks involved. In this research, we outline three novel sampling algorithms used to identify, classify and characterize the defective pathways in phenotype prediction problems, namely the Fisher's ratio sampler, the Holdout sampler and the Random sampler, and apply each one to the analysis of genetic pathways involved in tumor behavior and outcomes of triple negative breast cancers (TNBC).
Accurate prediction of protein stability changes resulting from amino acid substitutions is of utmost importance in medicine to better understand which mutations are deleterious, leading to diseases, and which are neutral. Since conducting wet lab experiments to get a better understanding of protein mutations is costly and time consuming, and because of the huge number of possible mutations, there is a great need for computational methods that can accurately predict the effects of amino acid mutations. In this research, we present a robust methodology to predict the energy changes of proteins upon mutation.
Background: Although some studies show that there could be a genetic predisposition to develop Multiple Sclerosis (MS), attempts to find genetic signatures related to MS diagnosis and development are extremely rare.
Method: We carried out a retrospective analysis of two different microarray datasets, using machine learning techniques to understand the defective pathways involved in this disease. We have modeled two data sets that are publicly accessible.
We present the analysis of defective pathways in multiple myeloma (MM) using two recently developed sampling algorithms of the biological pathways: the Fisher's ratio sampler and the holdout sampler. We performed retrospective analyses of different gene expression datasets concerning different aspects of the disease, such as the existing difference between bone marrow stromal cells in MM and healthy controls (HC), the gene expression profiling of CD34+ cells in MM and HC, the difference between hyperdiploid and non-hyperdiploid myelomas, and the prediction of the chromosome 13 deletion, to provide a deeper insight into the molecular mechanisms involved in the disease. Our analysis has shown the importance of different altered pathways related to glycosylation, infectious disease, immune system response, different aspects of metabolism, DNA repair, protein recycling and regulation of the transcription of genes involved in the differentiation of myeloid cells.
Sarcopenia is an age-related multifactorial process that involves several biological mechanisms, whose specific contributions and interplay are still unknown. The present study proposes prognostic networks based on machine learning approaches to unravel the interplay among those biological mechanisms mainly involved in the development of Sarcopenia. After analyzing 114 biological and clinical variables in adults older than 70 years, and using all the biological prognostic networks detected by machine learning with accuracy higher than 82%, we designed a consensus classifier based on majority vote that improves the predictive accuracy of Sarcopenia up to 91%.
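The consensus step can be pictured with a minimal majority-vote sketch. The three prediction lists below are hypothetical stand-ins for the outputs of individual prognostic networks; how the study built and selected those networks is not shown in the abstract and is not reproduced here.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one consensus label list.

    predictions: list of lists, one inner list per classifier, all of the
    same length (one label per subject).
    """
    consensus = []
    for votes in zip(*predictions):
        # most_common(1) returns the label with the highest count.
        label, _ = Counter(votes).most_common(1)[0]
        consensus.append(label)
    return consensus


# Hypothetical predictions from three classifiers for four subjects
# (1 = sarcopenia, 0 = no sarcopenia).
predictions = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
print(majority_vote(predictions))
```

Using an odd number of binary classifiers avoids ties; with an even number, a tie-breaking rule would have to be chosen explicitly.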
Aims: It is known that matrix metalloproteinase (MMP)-11 has a role in tumour development and progression, and also that immune cells can influence cancer cells to increase their proliferative and invasive properties. The aim of the present study was to propose the evaluation of MMP11 expression by intratumoral mononuclear inflammatory cells (MICs) as a useful biological marker for breast cancer prognosis.
Methods And Results: This study comprised 246 women with invasive breast carcinoma, and a long follow-up period.
Expert Opin Drug Discov
August 2019
Drug discovery is the process through which potential new compounds are identified by means of biology, chemistry, and pharmacology. Due to the high complexity of genomic data, AI techniques are increasingly needed to help reduce this complexity and support optimal decision-making. Phenotypic prediction is of particular use to drug discovery and precision medicine, where sets of genes that predict a given phenotype are determined.
We discuss the relationship between the problem of protein tertiary structure prediction from the amino acid sequence and the uncertainty analysis. The algorithm presented in this paper belongs to the category of decoy-based modeling, where different known protein models are used to establish a low dimensional space via principal component analysis. The low dimensional space is utilized to perform an energy optimization via a family of very explorative particle swarm optimizers to find the global minimum.
Objectives: Fibromyalgia syndrome (FMS) is a chronic and often debilitating condition that is characterized by persistent fatigue, pain, bowel abnormalities, and sleep disturbances. Currently, there are no definitive prognostic or diagnostic biomarkers for FMS. This study attempted to utilize a novel predictive algorithm to identify a group of genes whose differential expression discriminated individuals with FMS diagnosis from healthy controls.
Cancer-related fatigue (CRF) is a common burden in cancer patients and little is known about its underlying mechanism. The primary aim of this study was to identify gene signatures predictive of post-radiotherapy fatigue in prostate cancer patients. We employed Fisher Linear Discriminant Analysis (LDA) to identify predictive genes using whole genome microarray data from 36 men with prostate cancer.