Entropy is re-examined as a quantification of ignorance in the predictability of a one-dimensional continuous phenomenon. Although traditional estimators for entropy have been widely utilized in this context, we show that both the thermodynamic and Shannon theories of entropy are fundamentally discrete, and that the limiting process used to define differential entropy suffers from problems similar to those encountered in thermodynamics. In contrast, we consider a sampled data set to be observations of microstates (unmeasurable in thermodynamics and nonexistent in Shannon's discrete theory), so that, in this context, it is the macrostates of the underlying phenomenon that are unknown. To obtain a particular coarse-grained model, we define macrostates using quantiles of the sample and define an ignorance density distribution based on the distances between quantiles. The geometric partition entropy is then just the Shannon entropy of this finite distribution. Our measure is more consistent and informative than histogram binning, especially when applied to complex distributions, to those with extreme outliers, or under limited sampling. Its computational efficiency and avoidance of negative values can also make it preferable to geometric estimators such as k-nearest neighbors. We suggest applications that are unique to this estimator and illustrate its general utility through an application to time series in the approximation of an ergodic symbolic dynamics from limited observations.
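The construction described in the abstract (quantile-defined macrostates, a distribution built from the distances between quantiles, and the Shannon entropy of that distribution) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so several details here are assumptions rather than the authors' definitions: the function name `geometric_partition_entropy`, the parameter `k` (number of macrostates), the use of equally spaced quantile levels, and the normalization of each inter-quantile distance by the total sample range.

```python
import numpy as np


def geometric_partition_entropy(samples, k=8):
    """Sketch of a quantile-based partition entropy (in nats).

    Assumption: the "ignorance density" is taken to be the width of each
    inter-quantile interval divided by the full range of the sample.
    This is an illustrative reading of the abstract, not the paper's code.
    """
    x = np.asarray(samples, dtype=float)
    if x.ndim != 1 or x.size < 2:
        raise ValueError("samples must be a 1-D array with at least 2 values")
    if k < 1:
        raise ValueError("k must be a positive integer")

    # Macrostate boundaries: k + 1 equally spaced quantile levels in [0, 1].
    edges = np.quantile(x, np.linspace(0.0, 1.0, k + 1))

    # Distances between consecutive quantiles.
    widths = np.diff(edges)
    total = widths.sum()
    if total == 0.0:
        # All samples identical: a single point, no ignorance to quantify.
        return 0.0

    # Normalize the widths into a finite probability-like distribution.
    p = widths / total
    p = p[p > 0]  # treat 0 * log(0) as 0

    # Shannon entropy of the finite distribution.
    return float(-(p * np.log(p)).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(geometric_partition_entropy(rng.normal(size=500), k=8))
```

Under these assumptions the value lies between 0 and log(k), so it cannot go negative, and it reaches log(k) when the quantiles are evenly spaced (as for a uniformly spread sample). A single extreme outlier stretches one inter-quantile distance and lowers the value.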

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9601703
DOI: http://dx.doi.org/10.3390/e24101432

Similar Publications

Conductivity is an important indicator of the health of aquatic ecosystems. We model large amounts of lake conductivity data collected as part of the United States Environmental Protection Agency's National Lakes Assessment using spatial indexing, a flexible and efficient approach to fitting spatial statistical models to big data sets. Spatial indexing is capable of accommodating various spatial covariance structures as well as features like random effects, geometric anisotropy, partition factors, and non-Euclidean topologies.

Robust multi-source geographic entities matching by maximizing geometric and semantic similarity.

Sci Rep

December 2024

Department of Geographic Information System, Chinese Academy of Surveying and Mapping, Beijing, 100036, China.

Geographic entity matching is an important means of multi-source spatial data fusion and of information association and sharing. Existing studies have designed matching methods for different types of entity data, such as lines and areas. However, these approaches are often limited in their ability to generalize when matching heterogeneous data from multiple sources and in their accuracy on complex matching patterns.

A cross-species assessment of in silico prediction methods of steady-state volume of distribution using Simcyp simulators.

J Pharm Sci

December 2024

Certara UK Ltd., Certara Predictive Technologies Division, 1 Concourse Way, Level 2-Acero, Sheffield S1 2BJ, United Kingdom.

Predicting steady-state volume of distribution (V) is a key component of pharmacokinetic predictions and often guided using preclinical data. However, when bottom-up prediction from physiologically-based pharmacokinetic (PBPK) models and observed V misalign in preclinical species, or predicted V from different models varies significantly, no consensus exists for selecting models or preclinical species to improve the prediction. Through systematic analysis of V prediction across rat, dog, monkey, and human, using common methods, a practical strategy for predicting human V, with or without integration of preclinical PK information is warranted.

Reliability of wing morphometrics for species identification of human-biting black flies (Diptera: Simuliidae) in Thailand.

Parasit Vectors

December 2024

Parasitology and Entomology Research Cluster (PERC), Department of Parasitology, Faculty of Medicine, Chiang Mai University, Chiang Mai, 50200, Thailand.

Background: Fast and reliable species identification of black flies is essential for research purposes and effective vector control. Besides traditional identification based on morphology, which is usually supplemented with molecular methods, geometric morphometrics (GM) has emerged as a promising tool for identification. Despite its potential, no specific GM techniques have been established for the identification of black fly species.

We consider the six-vertex model at its free-fermion point with domain wall boundary conditions, which is equivalent to random domino tilings of the Aztec diamond. We compute the scaling limit of a particular nonlocal correlation function, essentially equivalent to the partition function for the domino tilings of a pentagon-shaped domain, obtained by cutting away a triangular region from a corner of the initial Aztec diamond. We observe a third-order phase transition when the geometric parameters of the obtained pentagonal domain are tuned to have the fifth side exactly tangent to the arctic ellipse of the corresponding initial model.
