Two methods of bootstrap resampling are discussed for log-linear models of count data. The first resamples the observations; the second resamples Pearson residuals, taking into account changes in the distribution of the residuals associated with the expected values of the counts. Both methods are illustrated on two data sets. One concerns the number of ear infections of swimmers in relation to whether they are frequent swimmers and three other variables; the other concerns the number of visits to a doctor made in the last 2 weeks in relation to the age of subjects and 10 other variables. A third data set, on the number of marine mammal interactions in different years and fishing areas, is also used as an example. In this case only the second bootstrap method can be used, because the nature of the data means that resampling observations can produce data sets that could not have occurred in practice. Simulation results indicate that the bootstrap results are slightly better than those from a conventional analysis for the first data set and much better for the second. For the third data set, however, a conventional analysis works well, while the bootstrap analyses have problems.
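As a rough illustration of the two resampling schemes described in the abstract, the following Python sketch fits a Poisson log-linear model and bootstraps its coefficients in both ways. It is a minimal sketch under stated assumptions, not the paper's procedure. The function names, the use of a plain Poisson model fitted by iteratively reweighted least squares, the grouping of residuals into `n_groups` bins of similar fitted values (one simple way to allow the residual distribution to vary with the expected count), and the rounding of bootstrap counts to non-negative integers are all assumptions made here for illustration.

```python
import numpy as np


def fit_poisson(X, y, n_iter=50, tol=1e-8):
    """Fit a Poisson log-linear model by iteratively reweighted least squares.

    Assumes column 0 of X is an intercept (used only for the starting value).
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30, 30)   # guard against overflow in exp
        mu = np.exp(eta)
        z = eta + (y - mu) / mu            # working response
        WX = X * mu[:, None]               # Poisson working weights are mu
        new_beta = np.linalg.solve(X.T @ WX, WX.T @ z)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


def bootstrap_cases(X, y, n_boot, rng):
    """Method 1: resample whole observations (rows) with replacement."""
    n = len(y)
    out = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        out[b] = fit_poisson(X[idx], y[idx])
    return out


def bootstrap_pearson_residuals(X, y, n_boot, rng, n_groups=4):
    """Method 2: resample Pearson residuals within groups of similar fitted mean.

    The grouping is an illustrative assumption, not the paper's exact rule.
    """
    n = len(y)
    mu = np.exp(X @ fit_poisson(X, y))
    resid = (y - mu) / np.sqrt(mu)         # Pearson residuals
    groups = np.array_split(np.argsort(mu), n_groups)
    out = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        r_star = np.empty(n)
        for g in groups:
            r_star[g] = rng.choice(resid[g], size=len(g), replace=True)
        # Rebuild counts; rounding and truncating at zero is an assumption.
        y_star = np.maximum(np.rint(mu + np.sqrt(mu) * r_star), 0.0)
        out[b] = fit_poisson(X, y_star)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=200)
    X = np.column_stack([np.ones(200), x])
    y = rng.poisson(np.exp(0.5 + 0.3 * x)).astype(float)  # simulated counts
    print("case bootstrap SE:    ", bootstrap_cases(X, y, 200, rng).std(axis=0))
    print("residual bootstrap SE:", bootstrap_pearson_residuals(X, y, 200, rng).std(axis=0))
```

The second function keeps the design matrix fixed, so every bootstrap data set has the same covariate pattern as the original. That is why a residual-based scheme remains usable when resampling whole observations would generate impossible data, as with the third data set.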
DOI: http://dx.doi.org/10.1080/10543406.2011.607748