Probabilistically sampled and spectrally clustered plant species using phenotypic characteristics.

PeerJ

Network Systems, Cybersecurity and Digital Law Department, Institut Mines-Telecom Atlantique, Rennes, France.

Published: September 2021

Phenotypic characteristics of a plant species refer to its physical properties as cataloged by plant biologists at different research centers around the world. Clustering species based on their phenotypic characteristics is used to obtain diverse sets of parents that are useful in breeding programs. The Hierarchical Clustering (HC) algorithm is the current standard for clustering phenotypic data, but it suffers from low accuracy and high computational complexity. To address the accuracy challenge, we propose the use of the Spectral Clustering (SC) algorithm. To make the algorithm computationally cheap, we propose using sampling, specifically Pivotal Sampling, which is probability based. Since the application of sampling to phenotypic data has not been explored much, for effective comparison, another sampling technique called Vector Quantization (VQ) is adapted for this data as well. VQ has recently generated promising results for genotypic data. The novelty of our SC with Pivotal Sampling algorithm lies in constructing the crucial similarity matrix for the clustering algorithm and defining probabilities for the sampling technique. Although our algorithm can be applied to any plant species, we tested it on the phenotypic data obtained from about 2,400 Soybean species. SC with Pivotal Sampling achieves substantially higher accuracy (in terms of Silhouette Values) than all the other proposed competitive clustering with sampling algorithms (i.e., SC with VQ, HC with Pivotal Sampling, and HC with VQ). The complexities of our SC with Pivotal Sampling algorithm and these three variants are almost the same because of the involved sampling. In addition, SC with Pivotal Sampling outperforms the standard HC algorithm in both accuracy and computational complexity. We experimentally show that our algorithm is up to 45% more accurate than HC in terms of clustering accuracy. The computational complexity of our algorithm is more than an order of magnitude lower than that of HC.
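To make the sampling step concrete, here is a minimal sketch of sequential pivotal sampling in Python. It is not the authors' implementation: the abstract does not say how the paper defines the inclusion probabilities, so the `probs` values below are hypothetical, and the function name `pivotal_sample` is an illustrative choice. The sketch only shows the general technique, in which units "duel" pairwise until each inclusion probability has been pushed to 0 or 1.

```python
import random


def pivotal_sample(probs, rng=None, eps=1e-9):
    """Sequential pivotal sampling (illustrative sketch).

    probs: inclusion probabilities in [0, 1], one per unit. These are
           hypothetical inputs; the paper's own definition is not given
           in the abstract.
    Returns the sorted indices of the selected units.
    """
    rng = rng or random.Random()
    p = list(probs)
    carrier = None  # index of the unit whose fate is still undecided

    for j in range(len(p)):
        if p[j] <= eps or p[j] >= 1.0 - eps:
            continue  # unit j is already (almost) decided
        if carrier is None:
            carrier = j
            continue

        a, b = p[carrier], p[j]
        s = a + b
        if s < 1.0:
            # One unit is rejected, the other keeps the combined mass.
            if rng.random() < b / s:
                p[carrier], p[j] = 0.0, s
            else:
                p[carrier], p[j] = s, 0.0
        else:
            # One unit is selected, the other keeps the remainder.
            if rng.random() < (1.0 - b) / (2.0 - s):
                p[carrier], p[j] = 1.0, s - 1.0
            else:
                p[carrier], p[j] = s - 1.0, 1.0

        # Whichever unit is still strictly between 0 and 1 carries on.
        if eps < p[j] < 1.0 - eps:
            carrier = j
        elif not (eps < p[carrier] < 1.0 - eps):
            carrier = None

    # If the probabilities do not sum to an integer, one unit is left over.
    if carrier is not None:
        p[carrier] = 1.0 if rng.random() < p[carrier] else 0.0

    return [k for k, v in enumerate(p) if v >= 1.0 - eps]


# Eight units with equal (hypothetical) inclusion probability 0.25,
# so the expected and actual sample size is 2.
sample = pivotal_sample([0.25] * 8, rng=random.Random(0))
```

When the probabilities sum to an integer, the sample size is fixed at that integer, which is one reason a pivotal scheme suits a pipeline where the sampled rows are then passed to a clustering step of known cost.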


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8432307
DOI: http://dx.doi.org/10.7717/peerj.11927

