The tutelage of our mentors as scientists included the analogy that writing a good scientific paper was an exercise in storytelling that omitted unessential details that did not move the story forward or that detracted from the overall message. However, the advice to not get lost in the details had an important flaw. In science, it is the many details of the data themselves and the methods used to generate and analyze them that give conclusions their probative meaning.
While the common procedure of statistical significance testing and its accompanying concept of p-values have long been surrounded by controversy, renewed concern has been triggered by the replication crisis in science. Many blame statistical significance tests themselves, and some regard them as sufficiently damaging to scientific practice as to warrant being abandoned. We take a contrary position, arguing that the central criticisms arise from misunderstanding and misusing the statistical tools, and that in fact the purported remedies themselves risk damaging science.
The F-measure, also known as the F1-score, is widely used to assess the performance of classification algorithms. However, some researchers find it lacking in intuitive interpretation, questioning the appropriateness of combining two aspects of performance as conceptually distinct as precision and recall, and also questioning whether the harmonic mean is the best way to combine them. To ease this concern, we describe a simple transformation of the F-measure, which we call F* (F-star), which has an immediate practical interpretation.
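As a rough illustration of the quantities this abstract discusses, the sketch below computes precision, recall, and their harmonic mean (the F-measure) from confusion-matrix counts. The last function assumes the transformation F* = F / (2 - F), which is not stated in this excerpt; the function names and the example counts are hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive
    and false-negative counts (assumes tp + fp > 0 and tp + fn > 0)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


def f_measure(tp, fp, fn):
    """F-measure (F1-score): the harmonic mean of precision and recall."""
    precision, recall = precision_recall(tp, fp, fn)
    return 2 * precision * recall / (precision + recall)


def f_star(tp, fp, fn):
    """ASSUMPTION (not given in the excerpt): F* = F / (2 - F)."""
    f = f_measure(tp, fp, fn)
    return f / (2 - f)


# Hypothetical counts: 8 true positives, 2 false positives, 2 false negatives.
print(f_measure(8, 2, 2), f_star(8, 2, 2))
```

Under that assumption, a little algebra gives F* = TP / (TP + FP + FN): the fraction of records that are either truly positive or predicted positive which the classifier gets right.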
AI systems will only fulfill their promise for society if they can be relied upon. This means that the role and task of the system must be properly formulated; that the system must be bug free, be based on properly representative data, and be able to cope with anomalies and data quality issues; and that its output must be sufficiently accurate for the task.
The hazardous emissions of crumb rubber (CR) modified asphalt during construction have long been a concern. This study aims to identify the emission components of CR modified asphalt in traditional hot mix asphalt (HMA) and in recently developed warm mix asphalt (WMA). Dynamic headspace gas chromatography-mass spectrometry (GCMS) was employed to identify the emissions of asphalt binders at 120°C, 140°C and 160°C.
Ready data availability, cheap storage capacity, and powerful tools for extracting information from data have the potential to significantly enhance the human condition. However, as with all advanced technologies, this comes with the potential for misuse. Ethical oversight and constraints are needed to ensure that an appropriate balance is reached.
Philos Trans A Math Phys Eng Sci
April 2015
The nature of statistics has changed over time. It was originally concerned with descriptive 'matters of state'--with summarizing population numbers, economic strength and social conditions. But during the course of the twentieth century its aim broadened to include inference--how to use data to shed light on underlying mechanisms, about what might happen in the future, about what would happen if certain actions were taken.
Because accurate diagnosis lies at the heart of medicine, it is important to be able to evaluate the effectiveness of diagnostic tests. A variety of accuracy measures are used. One particularly widely used measure is the AUC, the area under the receiver operating characteristic (ROC) curve.
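For readers who want to see the AUC concretely, here is a minimal sketch that uses its standard interpretation: the probability that a randomly chosen positive case receives a higher test score than a randomly chosen negative case, with ties counted as one half. The function name and the example scores are hypothetical, and the pairwise loop is chosen for clarity, not speed.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case scores higher.
    Tied pairs contribute 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical diagnostic scores for diseased and healthy subjects.
diseased = [0.9, 0.8, 0.4]
healthy = [0.7, 0.3, 0.2]
print(auc(diseased, healthy))  # 8 of the 9 pairs are ordered correctly
```

A value of 0.5 corresponds to a test that ranks cases no better than chance, and 1.0 to a test that separates the two groups perfectly.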
Volatile organic compounds (VOCs) are a major concern for indoor air pollution because of the impacts on human health. In recent years, interest has increased in the development and design of activated carbon filters for removing VOCs from indoor air. Although extensive information is available on sources, concentrations, and types of indoor VOCs, there is little or no information on the performance of indoor air adsorption systems for removing low concentrations of primary VOCs.
Water Environ Res
February 2009
An externally validated model was developed to predict influent concentrations of natural estrogens at municipal wastewater plants. The model kinetic constant for the first-order biotransformation of estradiol into estrone, k(E2), was calculated as 0.030 hour(-1) (ranging from -0.
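To give a feel for the reported rate constant, the sketch below applies plain first-order decay, C(t) = C0 · exp(-k t), with k = 0.030 per hour. This is an assumed, simplified batch picture for illustration only, not the validated model described in the abstract; the function names are hypothetical.

```python
import math

K_E2 = 0.030  # first-order rate constant from the text, in 1/hour


def remaining_fraction(hours, k=K_E2):
    """Fraction of estradiol not yet transformed after `hours`,
    assuming simple first-order kinetics: C(t) / C0 = exp(-k * t)."""
    return math.exp(-k * hours)


def half_life(k=K_E2):
    """Time for the concentration to halve under first-order kinetics."""
    return math.log(2) / k


print(round(half_life(), 1))            # about 23.1 hours
print(round(remaining_fraction(8), 3))  # fraction left after 8 hours
```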
This paper presents a simplified approach and user-oriented solutions to the homogeneous surface diffusion model (HSDM) equations for determining the surface diffusivity using a batch reactor system. Once the surface diffusivity is known, this model could also be used to estimate the performance of activated carbon (AC) applications as a function of contact time. In addition, fixed-bed performance can be predicted using the user-oriented solutions to the HSDM for fixed beds.
There are increasing concerns about indoor volatile organic compounds (VOCs) regarding their health effects and frequent occurrence. Adsorption using granular activated carbon (GAC) is a safe methodology for removing VOCs from indoor air. Although GAC has been widely used to remove VOCs from indoor air, activated carbon fiber cloth (ACFC) is a promising substitute for conventional activated carbon because of its regenerative properties; hence, this paper provides promising results for the application of ACFC as a regenerative adsorbent for the removal of VOCs from indoor air.
Stat Appl Genet Mol Biol
May 2010
The performance results of a wide range of different classifiers applied to proteomic mass spectra data, in a blind comparative assessment organised by Bart Mertens, are reviewed. The different approaches are summarised, issues of how to evaluate and compare the predictions are described, and the results of the different methods are examined. Although the different methods perform differently, their rank ordering varies according to how one measures performance, so that one cannot draw unequivocal conclusions about which is 'best'.
The aim of this study is to understand the parameters regulating calcium ion distribution in leaves. Accumulation of ions in leaf tissue is in part dependent on import from the xylem. This import via the transpiration stream is more important for ions such as calcium that are xylem but not phloem mobile and cannot therefore be retranslocated.
J Air Waste Manag Assoc
September 2007
Photocatalytic oxidation (PCO) was investigated in a bench-scale reactor for the abatement of two airborne organic contaminants: toluene and ethanol. A mathematical model that includes the impacts of light intensity, initial contaminant concentration, catalyst thickness, and relative humidity (RH) on the degradation of organic contaminants in a photocatalytic reactor was developed to describe this process. The commercially available catalyst Degussa-PtTiO2 was selected to compare with the MTU-PtTiO2-350 catalyst, which was synthesized by the sol-gel process, platinized, and calcined at 350°C.
Data mining is the discovery of interesting, unexpected or valuable structures in large datasets. As such, it has two rather different aspects. One of these concerns large-scale, 'global' structures, and the aim is to model the shapes, or features of the shapes, of distributions.
A high-efficiency particulate air (HEPA)-carbon filtration system was developed by the Access Business Group, LLC, to reduce the indoor levels of polychlorinated dibenzo-p-dioxins and dibenzofurans (PCDD/Fs). The HEPA filter removes the particle-bound PCDD/Fs, and the carbon filter removes the gaseous fraction. Because of the toxicity of PCDD/Fs, it is very difficult to handle them in the laboratory.
Banks have been accumulating huge databases for many years and are increasingly turning to statistics to provide insight into customer behaviour, among other things. Credit risk is an important issue and certain stochastic models have been developed in recent years to describe and predict loan default. Two of the major models currently used in the industry are considered here, and various ways of extending their application to the case where a loan is repaid in installments are explored.
We present a method for Bayesian model-based hierarchical coclustering of gene expression data and use it to study the temporal transcription responses of an Anopheles gambiae cell line upon challenge with multiple microbial elicitors. The method fits statistical regression models to the gene expression time series for each experiment and performs coclustering on the genes by optimizing a joint probability model, characterizing gene coregulation between multiple experiments. We compute the model using a two-stage Expectation-Maximization-type algorithm, first fixing the cross-experiment covariance structure and using efficient Bayesian hierarchical clustering to obtain a locally optimal clustering of the gene expression profiles and then, conditional on that clustering, carrying out Bayesian inference on the cross-experiment covariance using Markov chain Monte Carlo simulation to obtain an expectation.
A Bayesian nonlinear hierarchical random coefficients model was used in a reanalysis of a previously published longitudinal study of the extracellular direct current (DC)-potential and apparent diffusion coefficient (ADC) responses to focal ischaemia. The main purpose was to examine the data for evidence of an ADC threshold for anoxic depolarisation. A Markov chain Monte Carlo simulation approach was adopted.
The vast potential of the genomic insight offered by microarray technologies has led to their widespread use since they were introduced a decade ago. Application areas include gene function discovery, disease diagnosis, and inferring regulatory networks. Microarray experiments enable large-scale, high-throughput investigations of gene activity and have thus provided the data analyst with a distinctive, high-dimensional field of study.
Granular activated carbon (GAC) adsorption is an effective treatment technology for the removal of synthetic organic chemicals (SOCs) from drinking water supplies. This treatment process can be expensive if not properly designed. Application of mathematical models is an attractive method to evaluate the impact of process variables on process design and performance.
Environ Sci Technol
March 2005
TiO2 nanotube semiconductors contain free spaces in their interior that can be filled with active materials such as chemical compounds, enzymes, and noble metals, giving them a fundamental advantage over colloids. Although the unique shape of semiconductor nanotubes makes them promising for a range of potential applications, significant developmental research is required. In this research, a novel TiO2 nanotube photocatalyst was prepared that has a p-n junction.