AI Article Synopsis

  • The study focuses on classifying wheat varieties using near infrared (NIR) spectroscopy and emphasizes the challenge of balancing the modeling sample size with information redundancy.
  • It introduces a new sample selection method called k nearest neighbor-density, which is an improvement over traditional random sampling and k nearest neighbor methods.
  • Experimental results show that using k nearest neighbor-density not only reduces the sample size significantly but also enhances the overall accuracy of the classification models for wheat varieties.

Article Abstract

To classify multiple wheat varieties, we use near-infrared (NIR) spectroscopy for qualitative analysis. Increasing the modeling sample size adds information to the model, but it also introduces redundancy, which increases modeling time and storage requirements. The modeling sample therefore needs to be reduced through sample selection. Selecting samples blindly, however, inevitably loses information and degrades model performance. Building on traditional selection methods, we propose k nearest neighbor-density sample selection. The experiments use near-infrared diffuse reflectance spectra of wheat seeds collected over many days. First, we apply preprocessing and feature extraction to the raw wheat spectra; then we select the modeling sample with three methods (random sampling, k nearest neighbor, and k nearest neighbor-density); finally, we build BPR (Biomimetic Pattern Recognition) and BPRI (Biomimetic Pattern Recognition Improved) models. With the BPR model, k nearest neighbor-density selection gives the best results while also sharply reducing the modeling sample size. With the BPRI model, k nearest neighbor-density selection performs much better than random sampling and slightly better than k nearest neighbor, while its modeling sample is much smaller than that of k nearest neighbor. These results indicate that k nearest neighbor-density sample selection can greatly reduce the modeling sample size while preserving model quality, and that it is effective for wheat variety classification.
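The abstract does not spell out how k nearest neighbor-density selection works, so the following is only an illustrative sketch of one plausible density-guided selection scheme, not the authors' algorithm. It assumes that density is estimated as the inverse of the mean distance to the k nearest neighbors, and that a kept sample makes its k nearest neighbors redundant. The function name `knn_density_select` and the parameter `k` are hypothetical.

```python
import numpy as np


def knn_density_select(X, k=5):
    """Return indices of a reduced subset of the rows of X.

    Assumed scheme (not taken from the paper): visit samples from the
    densest to the sparsest, keep each sample that is still available,
    and discard its k nearest neighbors as redundant.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    k = min(k, n - 1)
    if k < 1:
        # Zero or one sample: nothing to thin out.
        return np.arange(n)

    # Pairwise Euclidean distances; a sample is not its own neighbor.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)

    # Indices of, and distances to, the k nearest neighbors of each sample.
    neighbors = np.argsort(dist, axis=1)[:, :k]
    knn_dist = np.take_along_axis(dist, neighbors, axis=1)

    # Assumed density estimate: inverse of the mean k-NN distance.
    density = 1.0 / (knn_dist.mean(axis=1) + 1e-12)

    available = np.ones(n, dtype=bool)
    selected = []
    for i in np.argsort(-density):
        if not available[i]:
            continue
        selected.append(i)
        available[i] = False
        available[neighbors[i]] = False  # neighbors count as redundant
    return np.array(selected, dtype=int)
```

In a setting like the one described, such a function would presumably be applied separately to the feature vectors of each wheat variety after preprocessing and feature extraction, and the returned indices would define the reduced modeling sample passed to the classifier.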

Publication Analysis

Top Keywords

modeling sample: 24
nearest neighbor-density: 24
selection method: 20
size modeling: 16
method nearest: 16
sample selection: 12
nearest neighbor: 12
nearest: 9
varieties classification: 8
classification problem: 8

Similar Publications

Comprehensive Analysis Reveals the Potential Diagnostic Value of Biomarkers Associated With Aging and Circadian Rhythm in Knee Osteoarthritis.

Orthop Surg

January 2025

Department of Orthopedics, Tianjin Medical University General Hospital, International Science and Technology Cooperation Base of Spinal Cord Injury, Tianjin Key Laboratory of Spine and Spinal Cord, Tianjin, China.

Objective: Knee osteoarthritis (KOA) is characterized by structural changes. Aging is a major risk factor for KOA. Therefore, the objective of this study was to examine the role of genes related to aging and circadian rhythms in KOA.

Impeding linear calibration models from accurately predicting target sample analyte amounts are the target sample-wise deviations in measurement profiles (e.g., spectra) relative to calibration samples.

Semiparametric estimator for the covariate-specific receiver operating characteristic curve.

Stat Methods Med Res

January 2025

CITMAga and Department of Statistics and Operations Research, Universidade de Vigo, Vigo, Galicia, Spain.

The study of the predictive ability of a marker is mainly based on the accuracy measures provided by the so-called confusion matrix. Besides, the area under the receiver operating characteristic curve has become a popular index for summarizing the overall accuracy of a marker. However, the nature of the relationship between the marker and the outcome, and the role that potential confounders play in this relationship could be fundamental in order to extrapolate the observed results.

Clinical trials (CTs) often suffer from small sample sizes due to limited budgets and patient enrollment challenges. Using historical data for the CT data analysis may boost statistical power and reduce the required sample size. Existing methods on borrowing information from historical data with right-censored outcomes did not consider matching between historical data and CT data to reduce the heterogeneity.

Objective: The aim of this study is to explore the risk profiles associated with Abdominal aortic aneurysm (AAA) incidence in both the general population and diverse subpopulations.

Summary Background Data: AAA is a life-threatening arterial disease, and there is limited understanding of its etiological spectrum across the age, sex, and genetic risk subgroups, making early prevention efforts more complicated.

Methods: This study encompassed a sample size of 364399 participants from the UK.
