High-dimensional LASSO (Hi-LASSO) is a powerful feature selection tool for high-dimensional data. Our previous study showed that Hi-LASSO outperformed other state-of-the-art LASSO methods. However, the substantial cost of bootstrapping and the lack of experiments on a parametric statistical test for feature selection have impeded the application of Hi-LASSO in practice. In this paper, we present a Python package and a companion Spark library for Hi-LASSO. Both are designed to run in parallel so that they scale to real-world problems, and both provide parametric statistical tests for feature selection on high-dimensional data. We demonstrate Hi-LASSO's superior performance through various intensive experiments in practical settings. With these packages, Hi-LASSO can be run efficiently and easily for feature selection. The Hi-LASSO packages are publicly available at https://github.com/datax-lab/Hi-LASSO under the MIT license. They can be installed with pip, and additional documentation is available at https://pypi.org/project/hi-lasso and https://pypi.org/project/Hi-LASSO-spark.
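To make the two ingredients named in the abstract concrete (bootstrapped LASSO fits and a parametric test on the selection counts), the following is a minimal, standard-library-only sketch. It is not the Hi-LASSO package's implementation or API: the function names (`lasso_cd`, `bootstrap_selection`, `binom_sf`), the penalty value, the toy data, and the choice of the average selection rate as the null probability in the binomial test are all assumptions made for illustration. It also leaves out everything else the real method does beyond a plain LASSO fit per bootstrap sample.

```python
import math
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t (the LASSO proximal step)."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate-descent LASSO for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, with b starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def bootstrap_selection(X, y, lam, n_boot=30, seed=0):
    """Count how often each feature gets a nonzero coefficient across bootstraps.

    Each bootstrap fit is independent of the others, which is what makes this
    loop a natural candidate for parallel execution.
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
    return counts


def binom_sf(k, n, pi):
    """Upper tail P(K >= k) for K ~ Binomial(n, pi)."""
    tail = sum(math.comb(n, i) * pi ** i * (1 - pi) ** (n - i) for i in range(k, n + 1))
    return min(1.0, tail)


# Toy data (hypothetical): only features 0 and 1 carry signal.
rng = random.Random(42)
n_samples, n_features, n_boot = 60, 8, 30
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0, 0.5) for row in X]

counts = bootstrap_selection(X, y, lam=0.3, n_boot=n_boot)

# Assumed null: every feature is selected at the average rate over all features.
pi_null = sum(counts) / (n_boot * n_features)
p_values = [binom_sf(k, n_boot, pi_null) for k in counts]
selected = [j for j, pv in enumerate(p_values) if pv < 0.05]
print("selection counts:", counts)
print("selected features:", selected)
```

A feature that is picked far more often than the assumed null rate gets a small p-value, so the significance threshold rather than an ad hoc count cutoff decides what is kept. For real analyses, the published packages and their documentation are the reference; this sketch only illustrates the idea.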


Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9714948 (PMC)
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0278570 (PLOS)


