SAR modeling of unbalanced data sets.

SAR QSAR Environ Res

Department of Environmental and Occupational Health, Graduate School of Public Health, University of Pittsburgh, 111 Parran Hall, 130 DeSoto Street, Pittsburgh, PA 15261, USA.

Published: February 2002

The increased acceptance of SAR approaches to hazard identification has led us to investigate methods to improve the predictive performance of SAR models. In the present study we demonstrate that, although on theoretical grounds the ratio of active to inactive chemicals in the learning set should be unity, SAR models can "tolerate" unbalanced ratios ranging from 3:1 (i.e., 75% actives) to 1:2 (i.e., 33% actives) and still perform adequately. On the other hand, SAR models derived from learning sets with ratios in excess of 4:1 (80% actives), even when corrected for the initial ratio, do not perform satisfactorily.
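The ratio-to-percentage arithmetic in the abstract can be checked with a short sketch. The function names `active_fraction` and `within_reported_range` are hypothetical helpers introduced here for illustration; they are not part of the study's method, and the sketch only restates the bounds quoted above (1:2 to 3:1). It says nothing about how the authors built or corrected their models.

```python
from fractions import Fraction


def active_fraction(actives: int, inactives: int) -> Fraction:
    """Fraction of actives in a learning set with the ratio actives:inactives."""
    return Fraction(actives, actives + inactives)


# Bounds quoted in the abstract: 1:2 (about 33% actives) to 3:1 (75% actives).
LOWER = active_fraction(1, 2)
UPPER = active_fraction(3, 1)


def within_reported_range(actives: int, inactives: int) -> bool:
    """True if the ratio falls inside the range the abstract reports as tolerated.

    Assumption: ratios between 3:1 and 4:1 are not characterized in the
    abstract, so this helper treats anything above 3:1 as outside the range.
    """
    return LOWER <= active_fraction(actives, inactives) <= UPPER


for a, i in [(1, 1), (3, 1), (1, 2), (4, 1)]:
    frac = active_fraction(a, i)
    print(f"{a}:{i} -> {float(frac):.0%} actives, "
          f"within reported range: {within_reported_range(a, i)}")
```

Exact fractions are used so that the boundary ratios 1:2 and 3:1 compare as inside the range without floating-point rounding.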


Source
http://dx.doi.org/10.1080/10629360108032916


