This paper presents a comparative study of entropy estimation in a large-alphabet regime. A variety of entropy estimators have been proposed over the years, where each estimator is designed for a different setup with its own strengths and caveats. As a consequence, no estimator is known to be universally better than the others.
The paper addresses the problem of distinguishing the leading agents in the group. The problem is considered in the framework of classification problems, where the agents in the group select the items with respect to certain properties. The suggested method of distinguishing the leading agents utilizes the connectivity between the agents and the Rokhlin distance between the subgroups of the agents.
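As an illustration of the distance mentioned above, the following is a minimal sketch of the standard Rokhlin distance between two partitions of the same finite set, d(A, B) = H(A|B) + H(B|A). It assumes a uniform probability over the items and represents each partition as a list of block labels; the function names are hypothetical, and how the paper derives partitions from the agents' selections is not stated in the abstract.

```python
import math
from collections import Counter

def _entropy(labels):
    """Shannon entropy (in nats) of the empirical label distribution."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def rokhlin_distance(labels_a, labels_b):
    """Rokhlin distance H(A|B) + H(B|A) between two partitions.

    Each argument assigns a block label to every item, in the same item
    order. Uses the identity H(A|B) + H(B|A) = 2 H(A, B) - H(A) - H(B).
    Assumption: items are equally likely (uniform measure).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same items")
    joint = list(zip(labels_a, labels_b))  # blocks of the common refinement
    return 2 * _entropy(joint) - _entropy(labels_a) - _entropy(labels_b)
```

Two partitions that differ only in how their blocks are named are at distance zero, while partitions that carry no information about each other are far apart.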
Current global COVID-19 booster scheduling strategies mainly focus on vaccinating high-risk populations at predetermined intervals. However, these strategies overlook key data: the direct insights into individual immunity levels from active serological testing and the indirect information available either through sample-based sero-surveillance, or vital demographic, location, and epidemiological factors. Our research, employing an age-, risk-, and region-structured mathematical model of disease transmission based on COVID-19 incidence and vaccination data from Israel between 15 May 2020 and 25 October 2021, reveals that a more comprehensive strategy integrating these elements can significantly reduce COVID-19 hospitalizations without increasing existing booster coverage.
Estimating the entropy of a discrete random variable is a fundamental problem in information theory and related fields. This problem has many applications in various domains, including machine learning, statistics, and data compression. Over the years, a variety of estimation schemes have been suggested.
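To make the estimation problem concrete, here is a minimal sketch of two classical estimators: the plug-in (maximum-likelihood) estimate and the Miller-Madow bias-corrected variant. These are textbook schemes chosen for illustration, not necessarily the ones studied in the paper, and the function names are hypothetical.

```python
import math
from collections import Counter

def plugin_entropy(samples):
    """Plug-in (maximum-likelihood) entropy estimate, in nats."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def miller_madow_entropy(samples):
    """Plug-in estimate plus the Miller-Madow correction (K - 1) / (2n),
    where K is the number of distinct symbols actually observed."""
    n = len(samples)
    k = len(set(samples))
    return plugin_entropy(samples) + (k - 1) / (2 * n)
```

The plug-in estimate tends to underestimate entropy when the sample is small relative to the alphabet; the correction term partly offsets that bias.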
This paper addresses the problem of detecting multiple static and mobile targets by an autonomous mobile agent acting under uncertainty. It is assumed that the agent is able to detect targets at different distances and that the detection includes errors of the first and second types. The goal of the agent is to plan and follow a trajectory that results in the detection of the targets in minimal time.
Background: Contact mixing plays a key role in the spread of COVID-19. Thus, mobility restrictions of varying degrees up to and including nationwide lockdowns have been implemented in over 200 countries. To appropriately target the timing, location, and severity of measures intended to encourage social distancing at a country level, it is essential to predict when and where outbreaks will occur, and how widespread they will be.
Wireless body area networks (WBANs) have strong potential in the field of health monitoring. However, the energy consumption required for accurate monitoring determines the time between battery charges of the wearable sensors, which is a key performance factor (and can be critical in the case of implantable devices). In this paper, we study the inherent trade-off between the power consumption of the sensors and the probability of misclassifying a patient's health state.
Entropy (Basel)
February 2021
The history of information theory, as a mathematical principle for analyzing data transmission and information communication, was formalized in 1948 with the publication of Claude Shannon's famous paper "A Mathematical Theory of Communication" [...
Projects are rarely executed exactly as planned. Often, the actual duration of a project's activities differs from the planned duration, resulting in costs stemming from the inaccurate estimation of the activities' completion dates. While monitoring a project at various inspection points is costly, it can lead to a better estimation of the project completion time, hence saving costs.
The paper considers the detection of multiple targets by a group of mobile robots that perform under uncertainty. The agents are equipped with sensors with positive and non-negligible probabilities of detecting the targets at different distances. The goal is to define the trajectories of the agents that can lead to the detection of the targets in minimal time.
In this paper, we propose a comprehensive analytics framework that can serve as a decision support tool for HR recruiters in real-world settings in order to improve hiring and placement decisions. The proposed framework follows two main phases: a local prediction scheme for recruitments' success at the level of a single job placement, and a mathematical model that provides a global recruitment optimization scheme for the organization, taking into account multilevel considerations. In the first phase, a key property of the proposed prediction approach is the interpretability of the machine learning (ML) model, which in this case is obtained by applying the Variable-Order Bayesian Network (VOBN) model to the recruitment data.
We propose a new algorithm called the context-based predictive information (CBPI) for estimating the predictive information (PI) between time series, by utilizing a lossy compression algorithm. The advantage of this approach over existing methods resides in the case of sparse predictive information (SPI) conditions, where the ratio of the number of informative sequences to the number of uninformative sequences is small. It is shown that the CBPI achieves a better PI estimation than benchmark methods by ignoring uninformative sequences while improving explainability by identifying the informative sequences.
Variable order Markov models and variable order Bayesian trees have been proposed for the recognition of cis-regulatory elements, and it has been demonstrated that they outperform traditional models such as position weight matrices, Markov models, and Bayesian trees for the recognition of binding sites in prokaryotes. Here, we study to what degree variable order models can improve the recognition of eukaryotic cis-regulatory elements. We find that variable order models can improve the recognition of binding sites of all the studied transcription factors.
BMC Bioinformatics
March 2007
Background: The definition of a distance measure plays a key role in the evaluation of different clustering solutions of gene expression profiles. In this empirical study we compare different clustering solutions when using the Mutual Information (MI) measure versus the use of the well known Euclidean distance and Pearson correlation coefficient.
Results: Relying on several public gene expression datasets, we evaluate the homogeneity and separation scores of different clustering solutions.
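The three measures compared in this study can be sketched as follows. This is a rough illustration only: the equal-width binning used to estimate mutual information, the default of three bins, and all function names are assumptions, since the abstract does not describe the estimation procedure used in the paper.

```python
import math
from collections import Counter

def euclidean(x, y):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson(x, y):
    """Pearson correlation coefficient (assumes neither profile is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def _bin(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against constant profiles
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y, bins=3):
    """Binned plug-in estimate of mutual information, in nats."""
    bx, by = _bin(x, bins), _bin(y, bins)
    n = len(x)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in pxy.items())
```

For a symmetric nonlinear relation such as `x = [-2, -1, 0, 1, 2]` and `y = [4, 1, 0, 1, 4]`, the Pearson coefficient is zero while the binned mutual information is positive, which is the kind of dependence a correlation-based measure can miss.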
Variable order Markov models and variable order Bayesian trees have been proposed for the recognition of transcription factor binding sites, and it could be demonstrated that they outperform traditional models, such as position weight matrices, Markov models and Bayesian trees. We develop a web server for the recognition of DNA binding sites based on variable order Markov models and variable order Bayesian trees offering the following functionality: (i) given datasets with annotated binding sites and genomic background sequences, variable order Markov models and variable order Bayesian trees can be trained; (ii) given a set of trained models, putative DNA binding sites can be predicted in a given set of genomic sequences and (iii) given a dataset with annotated binding sites and a dataset with genomic background sequences, cross-validation experiments for different model combinations with different parameter settings can be performed. Several of the offered services are computationally demanding, such as genome-wide predictions of DNA binding sites in mammalian genomes or sets of 10^4-fold cross-validation experiments for different model combinations based on problem-specific data sets.