Active Learning (AL) has the potential to solve a major problem of digital pathology: the efficient acquisition of labeled data for machine learning algorithms. However, existing AL methods often struggle in realistic settings with artifacts, ambiguities, and class imbalances, as commonly seen in the medical field. The lack of precise uncertainty estimations leads to the acquisition of images with a low informative value.
Bulk tissue samples examined by gene expression studies are usually heterogeneous. The data gained from these samples display the confounding patterns of mixtures consisting of multiple cell types or similar cell types in various functional states, which hinders the elucidation of the molecular mechanisms underlying complex biological phenomena. A realistic approach to compensate for the limitations of experimentally separating homogenous cell populations from mixed tissues is to computationally identify cell-type specific patterns from bulk, heterogeneous measurements.
The transcription of a gene from its DNA template into an mRNA molecule is the first, and most heavily regulated, step in gene expression. Especially in bacteria, regulation is typically achieved via the binding of a transcription factor (protein) or small RNA molecule to the chromosomal region upstream of a regulated gene. The protein or RNA molecule recognizes a short, approximately conserved sequence within a gene's promoter region and, by binding to it, either enhances or represses expression of the nearby gene.
BMC Bioinformatics
July 2009
Background: Hidden Markov models and hidden Boltzmann models are employed in computational biology and a variety of other scientific fields for a variety of analyses of sequential data. Whether the associated algorithms are used to compute an actual probability or, more generally, an odds ratio or some other score, a frequent requirement is that the error statistics of a given score be known. What is the chance that random data would achieve that score or better? What is the chance that a real signal would achieve a given score threshold?
Results: Here we present a novel general approach to estimating these false positive and true positive rates that is significantly more efficient than are existing general approaches.
Computational biology is replete with high-dimensional discrete prediction and inference problems. Dynamic programming recursions can be applied to several of the most important of these, including sequence alignment, RNA secondary-structure prediction, phylogenetic inference, and motif finding. In these problems, attention is frequently focused on some scalar quantity of interest, a score, such as an alignment score or the free energy of an RNA secondary structure.
Measurement of the statistical significance of extreme sequence alignment scores is key to many important applications, but it is difficult. To precisely approximate alignment score significance, we draw random samples directly from a well chosen, importance-sampling probability distribution. We apply our technique to pairwise local sequence alignment of nucleic acid and amino acid sequences of length up to 1000.
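The general idea of importance sampling for small tail probabilities can be illustrated with a toy problem. The sketch below is not the method of the abstract above: it assumes a much simpler score, the number of matching positions in a fixed ungapped alignment of two random DNA sequences, so that the exact answer is a binomial tail and the estimate can be checked. All names and parameter values (`importance_sampling_tail`, `n`, `p`, `q`, `k_min`) are hypothetical choices made for illustration.

```python
import math
import random


def exact_tail(n, p, k_min):
    """Exact P(K >= k_min) for K ~ Binomial(n, p), used only as a reference."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1)
    )


def importance_sampling_tail(n, p, q, k_min, num_samples, rng):
    """Estimate P(K >= k_min) under match probability p by sampling under q.

    Each sample is drawn from the tilted distribution (match probability q)
    and reweighted by the likelihood ratio of the null to the tilted model.
    """
    total = 0.0
    for _ in range(num_samples):
        # Draw the number of matches among n positions under the tilted model.
        k = sum(1 for _ in range(n) if rng.random() < q)
        if k >= k_min:
            # Likelihood ratio P_null(sample) / P_tilted(sample), in log space.
            log_weight = k * math.log(p / q) + (n - k) * math.log((1 - p) / (1 - q))
            total += math.exp(log_weight)
    return total / num_samples


rng = random.Random(0)
n, p, k_min = 50, 0.25, 35   # assumed: 50 positions, uniform base composition
q = k_min / n                # tilt so that the threshold is a typical outcome
estimate = importance_sampling_tail(n, p, q, k_min, 20000, rng)
exact = exact_tail(n, p, k_min)
print(f"importance sampling: {estimate:.3e}  exact: {exact:.3e}")
```

Sampling directly from the null model would almost never produce a score at or above the threshold, so a naive estimate would need an impractically large number of samples; sampling from the tilted distribution makes such scores common and corrects for the bias through the weights.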
Bioinformatics
August 2008
Motivation: A backtrace through a dynamic programming algorithm's intermediate results in search of an optimal path, or to sample paths according to an implied probability distribution, or as the second stage of a forward-backward algorithm, is a task of fundamental importance in computational biology. When there is insufficient space to store all intermediate results in high-speed memory (e.g.
Motivation: Identification of functionally conserved regulatory elements in sequence data from closely related organisms is becoming feasible, due to the rapid growth of public sequence databases. Closely related organisms are most likely to have common regulatory motifs; however, the recent speciation of such organisms results in a high degree of correlation in their genome sequences, confounding the detection of functional elements. Additionally, alignment algorithms that use optimization techniques are limited to the detection of a single alignment that may not be representative.
The Gibbs Centroid Sampler is a software package designed for locating conserved elements in biopolymer sequences. The Gibbs Centroid Sampler reports a centroid alignment, i.e.
Background: When transcription factor binding sites are known for a particular transcription factor, it is possible to construct a motif model that can be used to scan sequences for additional sites. However, few statistically significant sites are revealed when a transcription factor binding site motif model is used to scan a genome-scale database.
Methods: We have developed a scanning algorithm, PhyloScan, which combines evidence from matching sites found in orthologous data from several related species with evidence from multiple sites within an intergenic region, to better detect regulons.
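The basic step of building a motif model from known sites and scanning a sequence with it can be sketched as follows. This is a plain single-sequence log-odds scan, not PhyloScan: it does not combine evidence across orthologous species or across multiple sites in an intergenic region. The site list, the uniform background, the pseudocount, and the function names are all assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"


def build_log_odds_pwm(sites, background=None, pseudocount=1.0):
    """Build a log2-odds position weight matrix from equal-length aligned sites."""
    background = background or {base: 0.25 for base in ALPHABET}
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        # Start from the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background[base])
             for base in ALPHABET}
        )
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; returns (start, score) pairs."""
    width = len(pwm)
    return [
        (start, sum(pwm[j][sequence[start + j]] for j in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


# Hypothetical known binding sites and a hypothetical sequence to scan.
sites = ["TTGACA", "TTGACT", "TTTACA", "TTGATA"]
pwm = build_log_odds_pwm(sites)
sequence = "ACGTTTGACAGGCT"
hits = scan(sequence, pwm)
best_position, best_score = max(hits, key=lambda hit: hit[1])
print(best_position, round(best_score, 2))
```

On a genome-scale database, the very large number of windows scored this way is what makes it hard for any single site to reach statistical significance, which is the limitation the abstract describes.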
Approaches based upon sequence weights, to construct a position weight matrix of nucleotides from aligned inputs, are popular but little effort has been expended to measure their quality. We derive optimal sequence weights that minimize the sum of the variances of the estimators of base frequency parameters for sequences related by a phylogenetic tree. Using these we find that approaches based upon sequence weights can perform very poorly in comparison to approaches based upon a theoretically optimal maximum-likelihood method in the inference of the parameters of a position-weight matrix.
Stat Appl Genet Mol Biol
May 2006
Under the assumption that a significant motivation for sequencing the genomes of mammals is the resulting ability to help us locate and characterize functional DNA segments shared with humans, we have developed a statistical analysis to quantify the expected advantage. Examining uncertainty in terms of the width of a confidence interval, we show that uncertainty in the rate of nucleotide mutation can be shrunk by a factor of nearly four when nine mammals (human, chimpanzee, baboon, cat, dog, cow, pig, rat, and mouse) are used instead of just two (human and mouse). Contrastingly, we show confidence interval shrinkage by a factor of only 1.
A partial digestion of DNA (e.g. cosmid.
Using an extension of a statistical model given by E. Lander and M. Waterman, we define the a posteriori probability of a clone ordering based upon oligonucleotide hybridization data.
With the advent of RFLPs, genetic linkage maps are now being assembled for a number of organisms including both inbred experimental populations such as maize and outbred natural populations such as humans. Accurate construction of such genetic maps requires multipoint linkage analysis of particular types of pedigrees. We describe here a computer package, called MAPMAKER, designed specifically for this purpose.
The pathologic changes of the middle ear and mastoid bone mucosa in two pediatric patients with long-standing chronic serous otitis media were studied by electron and light microscopy. The embryological development of the eustachian tube, middle ear cleft, and mastoid bone suggests a common physiological and anatomical continuity. The histological changes seen by light and electron microscopy in these patients demonstrate the metaplastic changes of the basal cell of the mucosa differentiating into mucous and keratin cells.
The authors evaluated the systemic and cerebral hemodynamic and metabolic effects of 1 h of hypotension to a mean arterial pressure of either 50 mmHg or 40 mmHg induced by intravenous adenosine or ATP in dogs maintained on 70% nitrous oxide and 0.1% halothane. Following the hypotensive period, brain biopsy specimens were taken for the determination of cerebral metabolites and calculation of the energy charge.
Twenty-seven pigtailed monkeys (Macaca nemestrina) were subjected to 17 min of complete cerebral ischemia followed by 96 h of intensive care treatment. Fourteen of the monkeys were assigned randomly to the treatment group and received nimodipine 10 µg·kg⁻¹ 5 min postischemia followed by 1 microgram .
Brain protection is the prevention or amelioration of neuronal damage occurring after a hypoxic or ischemic event. Controversies in this field focus on whether incomplete global ischemia may produce a worse insult than does complete global ischemia; whether barbiturates provide protection following complete global ischemia; and whether the calcium entry blockers have a role in brain protection. Current knowledge dictates that incomplete ischemia coupled with hyperglycemia will cause a severe cerebral lactic acidosis and produce a worse insult than does complete ischemia.
Ten minutes of cerebral ischemia were produced in 12 dogs by temporary ligation of the venae cavae and aorta. After reperfusion the dogs received the calcium entry blocker flunarizine, 6 micrograms/kg infused over a ten-minute period. Cerebral blood flow (CBF) and metabolism (CMRO2) were measured pre-ischemia and for 2 h post-ischemia in 6 dogs.
The systemic and cerebral effects of hypotension induced with isoflurane were examined in 12 dogs. Hypotension to a mean arterial pressure of either 50 mmHg or 40 mmHg for 1 h was produced by 2.5 +/- 0.