Analyzing large volumes of high-dimensional data is an issue of fundamental importance in data science, molecular simulations and beyond. Several approaches work on the assumption that the important content of a dataset belongs to a manifold whose Intrinsic Dimension (ID) is much lower than the raw number of coordinates. Such a manifold is generally twisted and curved, and the points on it are non-uniformly distributed; both factors make identifying the ID, and exploiting it, difficult. Here we propose a new ID estimator that uses only the distances from each point in the sample to its first and second nearest neighbors. Relying on such minimal neighborhood information reduces the effects of curvature and of density variation, as well as the computational cost. The ID estimator is theoretically exact for uniformly distributed datasets, and provides consistent measures in general. When used in combination with block analysis, it allows discriminating the relevant dimensions as a function of the block size. This makes it possible to estimate the ID even when the data lie on a manifold perturbed by high-dimensional noise, a situation often encountered in real-world datasets. We demonstrate the usefulness of the approach on molecular simulations and image analysis.
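The abstract does not spell out the estimator, so the following is only a minimal sketch of the idea under stated assumptions. It assumes that, for locally uniform density, the ratio mu = r2/r1 of the second to the first nearest-neighbor distance follows a Pareto law with exponent equal to the ID d, and it uses the maximum-likelihood form d = N / sum(ln mu_i); the published method may fit the distribution differently. The function names `two_nn_id` and `id_by_block_size` are hypothetical, and the block analysis shown here (random partition into equal blocks, averaged estimates) is one plausible reading of the abstract, not a confirmed procedure.

```python
import numpy as np


def two_nn_id(points):
    """Sketch of an ID estimate from first/second nearest-neighbor distances.

    Assumption: mu = r2 / r1 is Pareto-distributed with exponent d, so the
    maximum-likelihood estimate is d = N / sum(log(mu)).
    """
    x = np.asarray(points, dtype=float)
    # Brute-force pairwise Euclidean distances (O(N^2 * D) memory; small N only).
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude each point's distance to itself
    # After partitioning at k=1, columns 0 and 1 hold the two smallest distances.
    two_smallest = np.partition(dist, 1, axis=1)[:, :2]
    r1, r2 = two_smallest[:, 0], two_smallest[:, 1]
    keep = r1 > 0.0  # drop exact duplicates, where the ratio is undefined
    mu = r2[keep] / r1[keep]
    return mu.size / np.sum(np.log(mu))


def id_by_block_size(points, n_blocks_list, seed=0):
    """Hypothetical block analysis: average the estimate over random blocks.

    Returns a dict mapping approximate block size to the mean ID estimate.
    """
    x = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    result = {}
    for n_blocks in n_blocks_list:
        order = rng.permutation(len(x))
        blocks = np.array_split(order, n_blocks)
        estimates = [two_nn_id(x[idx]) for idx in blocks]
        result[len(x) // n_blocks] = float(np.mean(estimates))
    return result


if __name__ == "__main__":
    # Synthetic example: a uniform 2D square rotated into 5 coordinates.
    rng = np.random.default_rng(0)
    flat = np.zeros((1000, 5))
    flat[:, :2] = rng.random((1000, 2))
    rotation, _ = np.linalg.qr(rng.normal(size=(5, 5)))
    data = flat @ rotation
    print("ID estimate:", two_nn_id(data))
    print("ID by block size:", id_by_block_size(data, [1, 2, 4]))
```

On data like the synthetic example, the estimate should stay close to the manifold dimension rather than the number of coordinates. A plateau of the estimate across block sizes is the kind of signal the abstract alludes to when it mentions discriminating the relevant dimensions.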


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5610237
DOI: http://dx.doi.org/10.1038/s41598-017-11873-y


