Ecological research, like all Earth System Sciences, is becoming increasingly data-rich. Tools for processing "big data" are continuously developed to meet the corresponding technical and logistical challenges. However, even smaller data sets can be challenging to handle when best practices in data exploration, quality control and reproducibility are to be met. This occurs when conventional methods, such as generating and assessing diagnostic visualizations or tables, become infeasible due to time and practicality constraints. Interactive processing can alleviate this issue and is increasingly used to ensure that large data sets are handled diligently. However, recent interactive tools rarely enable data manipulation, may not generate reproducible outputs, or are typically specific to one data type or domain. We developed datacleanr, an interactive tool that facilitates best practices in data exploration, quality control (e.g., outlier assessment) and flexible processing for multiple tabular data types, including time series and georeferenced data. The package is open-source and based on the R programming language. A key feature of datacleanr is the "reproducible recipe": a translation of all interactive actions into R code, which can be integrated into existing analysis pipelines. This enables researchers experienced with script-based workflows to use the strengths of interactive processing without sacrificing their usual work style or functionality from other (R) packages. We demonstrate the package's utility by addressing two common issues in data analysis, namely 1) identifying problematic structures and artefacts in hierarchically nested data, and 2) preventing excessive loss of data from "coarse", code-based filtering of time series. Ultimately, with datacleanr we aim to improve researchers' workflows and increase confidence in, and reproducibility of, their results.
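The second issue named above, data loss from coarse filtering, can be illustrated with a minimal sketch. It is not datacleanr's API and is written in Python rather than R purely for illustration. The readings, the injected artefacts, the `is_outlier` helper and its tolerance are all hypothetical. A simple threshold stands in for the points a user would select interactively, and the recorded index list plays the role of a replayable "recipe".

```python
from statistics import median

# Hypothetical hourly sensor readings for two days: (day, hour, value).
readings = [(day, hour, 10.0 + 0.1 * hour) for day in (1, 2) for hour in range(24)]

# Inject two artefacts on day 2 (invented values, for illustration only).
readings[24 + 5] = (2, 5, 95.0)
readings[24 + 17] = (2, 17, -40.0)


def is_outlier(value, centre, tolerance=5.0):
    """Hypothetical rule: flag values far from the series median."""
    return abs(value - centre) > tolerance


centre = median(value for _, _, value in readings)

# Coarse, code-based filter: discard every day containing any flagged value.
bad_days = {day for day, _, value in readings if is_outlier(value, centre)}
coarse = [r for r in readings if r[0] not in bad_days]

# Targeted filter: record the flagged row indices explicitly, then drop
# only those rows. The index list can be stored and re-applied later.
flagged = [i for i, (_, _, value) in enumerate(readings) if is_outlier(value, centre)]
flagged_set = set(flagged)
targeted = [r for i, r in enumerate(readings) if i not in flagged_set]

print(f"coarse filter keeps {len(coarse)} of {len(readings)} readings")
print(f"targeted filter keeps {len(targeted)} of {len(readings)} readings")
print(f"recorded indices: {flagged}")
```

In this toy series the coarse filter discards a full day of valid readings to remove two artefacts, whereas the targeted filter removes only the two flagged rows and leaves a record of exactly which rows were dropped.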

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9098071
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0268426
