Bagging Nearest-Neighbor Prediction independence Test: an efficient method for nonlinear dependence of two continuous variables.

Sci Rep

State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai, China.

Published: October 2017

Testing the dependence or correlation of two variables is one of the fundamental tasks in statistics. In this work, we proposed an efficient method for testing nonlinear dependence between two continuous variables (X and Y): BNNPT (Bagging Nearest-Neighbor Prediction independence Test; software available at https://sourceforge.net/projects/bnnpt/). In the BNNPT framework, we first used the values of X to construct a bagging neighborhood structure. We then obtained the out-of-bag estimator of Y based on this structure and calculated the squared error to measure how well Y is predicted by X. Finally, a permutation test was applied to determine the significance of the observed squared error. To evaluate BNNPT against seven other methods, we performed extensive simulations to explore the relationships among the methods, and compared their false positive rates and statistical power using both simulated and real datasets (Rugao longevity cohort mitochondrial DNA haplogroups and kidney cancer RNA-seq datasets). We concluded that BNNPT is an efficient computational approach for testing nonlinear correlation in real-world applications.
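
The four steps described in the abstract (neighborhood structure from X, out-of-bag prediction of Y, squared error, permutation test) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the number of bags, the number of neighbors, or how the p-value is computed, so the 1-nearest-neighbor rule, the defaults n_bags=30 and n_perm=200, the (hits + 1) / (n_perm + 1) p-value, and all function names below are assumptions made for illustration only.

```python
import random


def bagging_neighbors(x, n_bags=30, rng=None):
    """For each point, collect its nearest in-bag neighbor (in X) from every
    bootstrap bag that left the point out. Depends on X only.
    Assumption: 1-nearest neighbor; the abstract does not give this detail."""
    rng = rng or random.Random(0)
    n = len(x)
    neighbors = [[] for _ in range(n)]
    for _ in range(n_bags):
        # Bootstrap sample of indices; duplicates do not matter for 1-NN.
        in_bag = {rng.randrange(n) for _ in range(n)}
        members = sorted(in_bag)
        for i in range(n):
            if i not in in_bag:  # point i is out of bag for this sample
                neighbors[i].append(min(members, key=lambda k: abs(x[k] - x[i])))
    return neighbors


def oob_squared_error(y, neighbors):
    """Mean squared error of the out-of-bag predictions of Y."""
    total, used = 0.0, 0
    for yi, nb in zip(y, neighbors):
        if nb:  # skip points that were never out of bag
            pred = sum(y[k] for k in nb) / len(nb)
            total += (yi - pred) ** 2
            used += 1
    if used == 0:
        raise ValueError("no out-of-bag predictions; increase n_bags")
    return total / used


def bnnpt_sketch(x, y, n_bags=30, n_perm=200, seed=0):
    """Return (observed squared error, permutation p-value)."""
    rng = random.Random(seed)
    # The neighborhood structure is built once and reused for every permutation.
    neighbors = bagging_neighbors(x, n_bags, rng)
    observed = oob_squared_error(y, neighbors)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # breaks any X-Y dependence
        if oob_squared_error(y_perm, neighbors) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Calling bnnpt_sketch(x, y) on two equal-length lists of floats returns the observed error and a p-value; a small p-value suggests that X predicts Y better than chance. Because the neighbor lists depend only on X, each permutation only re-averages Y, which is one plausible reading of why such a test can be efficient.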


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5630623
DOI: http://dx.doi.org/10.1038/s41598-017-12783-9

Similar Publications

A comprehensive machine learning-based models for predicting mixture toxicity of azole fungicides toward algae (Auxenochlorella pyrenoidosa).

Environ Int

December 2024

Guangxi Key Laboratory of Theory and Technology for Environmental Pollution Control, Guilin University of Technology, Guilin 541006, China; Collaborative Innovation Center for Water Pollution Control and Water Safety in Karst Area, Guilin University of Technology, Guilin 541006, China.

Quantitative structure-activity relationships (QSARs) have been used to predict mixture toxicity. However, current research faces gaps in achieving accurate predictions of the mixture toxicity of azole fungicides. To address this gap, the application of machine learning (ML) algorithms has emerged as an effective strategy.


Revolutionizing cardiovascular disease classification through machine learning and statistical methods.

J Biopharm Stat

November 2024

Centre of Excellence in Natural Products and Therapeutics, Department of Biotechnology and Bioinformatics, Sambalpur University, Jyoti Vihar, Burla, Sambalpur, Odisha, India.

Background: Cardiovascular diseases (CVDs) include abnormal conditions of the heart, diseased blood vessels, structural problems of the heart, and blood clots. Traditionally, CVD has been diagnosed by clinical experts, physicians, and medical specialists, which is expensive, time-consuming, and requires expert intervention. On the other hand, cost-effective digital diagnosis of CVD is now possible because of the emergence of machine learning (ML) and statistical techniques.


Ensemble Fusion Models Using Various Strategies and Machine Learning for EEG Classification.

Bioengineering (Basel)

September 2024

Department of Artificial Intelligence Convergence, Hallym University, Chuncheon 24252, Republic of Korea.

Electroencephalography (EEG) helps to assess the electrical activities of the brain so that the neuronal activities of the brain are captured effectively. EEG is used to analyze many neurological disorders because the equipment is low-cost. To diagnose and treat every neurological disorder, lengthy EEG signals are needed, and different machine learning and deep learning techniques have been developed so that the EEG signals can be classified automatically.


Background: Data sciences and artificial intelligence are becoming encouraging tools in assisted reproduction, favored by time-lapse technology incubators. Our objective is to analyze, compare and identify the most predictive machine learning algorithm developed using a known implantation database of embryos transferred in our egg donation program, including morphokinetic and morphological variables, and recognize the most predictive embryo parameters in order to enhance IVF treatments clinical outcomes.

Methods: Multicenter retrospective cohort study carried out in 378 egg donor recipients who performed a fresh single embryo transfer during 2021.


The integration of cutting-edge technologies such as the Internet of Things (IoT), robotics, and machine learning (ML) has the potential to significantly enhance the productivity and profitability of traditional fish farming. Farmers using traditional fish farming methods incur enormous economic costs owing to labor-intensive schedule monitoring and care, illnesses, and sudden fish deaths. Another ongoing issue is automated fish species recommendation based on water quality.

