Understanding the evolution of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV-2) and its relationship to other coronaviruses in the wild is crucial for preventing future virus outbreaks. While the origin of the SARS-CoV-2 pandemic remains uncertain, mounting evidence suggests the direct involvement of the bat and pangolin coronaviruses in the evolution of the SARS-CoV-2 genome. To unravel the early days of a probable zoonotic spillover event, we analyzed genomic data from various coronavirus strains from both human and wild hosts.
View Article and Find Full Text PDFCRISPR/Cas9 (Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated protein 9) is a popular and effective two-component technology used for targeted genetic manipulation. It is currently the most versatile and accurate method of gene and genome editing, which benefits from a large variety of practical applications. For example, in biomedicine, it has been used in research related to cancer, virus infections, pathogen detection, and genetic diseases.
View Article and Find Full Text PDFThis paper introduces an optimal algorithm for solving the discrete grid-based coverage path planning (CPP) problem. This problem consists in finding a path that covers a given region completely. First, we propose a CPP-solving baseline algorithm based on the iterative deepening depth-first search (ID-DFS) approach.
View Article and Find Full Text PDFNext basket recommendation is a critical task in market basket data analysis. It is particularly important in grocery shopping, where grocery lists are an essential part of shopping habits of many customers. In this work, we first present a new grocery Recommender System available on the MyGroceryTour platform.
View Article and Find Full Text PDFMotivation: Each gene has its own evolutionary history which can substantially differ from evolutionary histories of other genes. For example, some individual genes or operons can be affected by specific horizontal gene transfer or recombination events. Thus, the evolutionary history of each gene should be represented by its own phylogenetic tree which may display different evolutionary patterns from the species tree that accounts for the main patterns of vertical descent.
View Article and Find Full Text PDFMotivation: Accurate detection of sequence similarity and homologous recombination are essential parts of many evolutionary analyses.
Results: We have developed SimPlot++, an open-source multiplatform application implemented in Python, which can be used to produce publication quality sequence similarity plots using 63 nucleotide and 20 amino acid distance models, to detect intergenic and intragenic recombination events using Φ, Max-χ2, NSS or proportion tests, and to generate and analyze interactive sequence similarity networks. SimPlot++ supports multicore data processing and provides useful distance calculability diagnostics.
Recent years have seen a steep rise in the number of skin cancer detection applications. While modern advances in deep learning made possible reaching new heights in terms of classification accuracy, no publicly available skin cancer detection software provide confidence estimates for these predictions. We present DUNEScan (Deep Uncertainty Estimation for Skin Cancer), a web server that performs an intuitive in-depth analysis of uncertainty in commonly used skin cancer classification models based on convolutional neural networks (CNNs).
View Article and Find Full Text PDFAccurate automated medical image recognition, including classification and segmentation, is one of the most challenging tasks in medical image analysis. Recently, deep learning methods have achieved remarkable success in medical image classification and segmentation, clearly becoming the state-of-the-art methods. However, most of these methods are unable to provide uncertainty quantification (UQ) for their output, often being overconfident, which can lead to disastrous consequences.
View Article and Find Full Text PDFMotivation: Off-target predictions are crucial in gene editing research. Recently, significant progress has been made in the field of prediction of off-target mutations, particularly with CRISPR-Cas9 data, thanks to the use of deep learning. CRISPR-Cas9 is a gene editing technique which allows manipulation of DNA fragments.
View Article and Find Full Text PDFBackground: The SARS-CoV-2 pandemic is one of the greatest global medical and social challenges that have emerged in recent history. Human coronavirus strains discovered during previous SARS outbreaks have been hypothesized to pass from bats to humans using intermediate hosts, e.g.
View Article and Find Full Text PDFMotivation: Phylogenetic trees and the methods for their analysis have played a key role in many evolutionary, ecological and bioinformatics studies. Alternatively, phylogenetic networks have been widely used to analyze and represent complex reticulate evolutionary processes which cannot be adequately studied using traditional phylogenetic methods. These processes include, among others, hybridization, horizontal gene transfer, and genetic recombination.
View Article and Find Full Text PDFExplaining the evolution of animals requires ecological, developmental, paleontological, and phylogenetic considerations because organismal traits are affected by complex evolutionary processes. Modeling a plurality of processes, operating at distinct time-scales on potentially interdependent traits, can benefit from approaches that are complementary treatments to phylogenetics. Here, we developed an inclusive network approach, implemented in the command line software ComponentGrapher, and analyzed trait co-occurrence of rhinocerotoid mammals.
View Article and Find Full Text PDFComput Methods Programs Biomed
October 2019
Background And Objective: Coronary artery disease (CAD) is one of the commonest diseases around the world. An early and accurate diagnosis of CAD allows a timely administration of appropriate treatment and helps to reduce the mortality. Herein, we describe an innovative machine learning methodology that enables an accurate detection of CAD and apply it to data collected from Iranian patients.
View Article and Find Full Text PDFIEEE/ACM Trans Comput Biol Bioinform
January 2022
Considerable efforts have been made over the last decades to improve the robustness of clustering algorithms against noise features and outliers, known to be important sources of error in clustering. Outliers dominate the sum-of-the-squares calculations and generate cluster overlap, thus leading to unreliable clustering results. They can be particularly detrimental in computational biology, e.
View Article and Find Full Text PDFWart disease (WD) is a skin illness on the human body which is caused by the human papillomavirus (HPV). This study mainly concentrates on common and plantar warts. There are various treatment methods for this disease, including the popular immunotherapy and cryotherapy methods.
View Article and Find Full Text PDFThe emergence of functional specialization is a core problem in biology. In this work we focus on the emergence of reproductive (germ) and vegetative viability-enhancing (soma) cell functions (or germ-soma specialization). We consider a group of cells and assume that they contribute to two different evolutionary tasks, fecundity and viability.
View Article and Find Full Text PDFBackground: Gene trees carry important information about specific evolutionary patterns which characterize the evolution of the corresponding gene families. However, a reliable species consensus tree cannot be inferred from a multiple sequence alignment of a single gene family or from the concatenation of alignments corresponding to gene families having different evolutionary histories. These evolutionary histories can be quite different due to horizontal transfer events or to ancient gene duplications which cause the emergence of paralogs within a genome.
View Article and Find Full Text PDFData generated by high-throughput screening (HTS) technologies are prone to spatial bias. Traditionally, bias correction methods used in HTS assume either a simple additive or, more recently, a simple multiplicative spatial bias model. These models do not, however, always provide an accurate correction of measurements in wells located at the intersection of rows and columns affected by spatial bias.
View Article and Find Full Text PDFSpatial bias continues to be a major challenge in high-throughput screening technologies. Its successful detection and elimination are critical for identifying the most promising drug candidates. Here, we examine experimental small molecule assays from the popular ChemBank database and show that screening data are widely affected by both assay-specific and plate-specific spatial biases.
View Article and Find Full Text PDFMotivation: Considerable attention has been paid recently to improve data quality in high-throughput screening (HTS) and high-content screening (HCS) technologies widely used in drug development and chemical toxicity research. However, several environmentally- and procedurally-induced spatial biases in experimental HTS and HCS screens decrease measurement accuracy, leading to increased numbers of false positives and false negatives in hit selection. Although effective bias correction methods and software have been developed over the past decades, almost all of these tools have been designed to reduce the effect of additive bias only.
View Article and Find Full Text PDFBackground: Curious parallels between the processes of species and language evolution have been observed by many researchers. Retracing the evolution of Indo-European (IE) languages remains one of the most intriguing intellectual challenges in historical linguistics. Most of the IE language studies use the traditional phylogenetic tree model to represent the evolution of natural languages, thus not taking into account reticulate evolutionary events, such as language hybridization and word borrowing which can be associated with species hybridization and horizontal gene transfer, respectively.
View Article and Find Full Text PDFVarious types of genome and gene similarity networks along with their characteristics have been increasingly used for retracing different kinds of evolutionary and ecological relationships. Here, we present a new polynomial time algorithm and the corresponding software (BRIDES) to provide characterization of different types of paths existing in evolving (or augmented) similarity networks under the constraint that such paths contain at least one node that was not present in the original network. These different paths are denoted as Breakthroughs, Roadblocks, Impasses, Detours, Equal paths, and Shortcuts.
View Article and Find Full Text PDFBackground: Workflows, or computational pipelines, consisting of collections of multiple linked tasks are becoming more and more popular in many scientific fields, including computational biology. For example, simulation studies, which are now a must for statistical validation of new bioinformatics methods and software, are frequently carried out using the available workflow platforms. Workflows are typically organized to minimize the total execution time and to maximize the efficiency of the included operations.
View Article and Find Full Text PDFSignificant efforts have been made recently to improve data throughput and data quality in screening technologies related to drug design. The modern pharmaceutical industry relies heavily on high-throughput screening (HTS) and high-content screening (HCS) technologies, which include small molecule, complementary DNA (cDNA) and RNA interference (RNAi) types of screening. Data generated by these screening technologies are subject to several environmental and procedural systematic biases, which introduce errors into the hit identification process.
View Article and Find Full Text PDF