Multi-sample ζ-mixup: richer, more realistic synthetic samples from a p-series interpolant.

J Big Data

School of Computing Science, Simon Fraser University, 8888 University Drive, Burnaby, V5A 1S6 Canada.

Published: March 2024

Modern deep learning training procedures rely on model regularization techniques such as data augmentation methods, which generate training samples that increase the diversity of data and richness of label information. A popular recent method, mixup, uses convex combinations of pairs of original samples to generate new samples. However, as we show in our experiments, mixup can produce undesirable synthetic samples, where the data is sampled off the manifold and can contain incorrect labels. We propose ζ-mixup, a generalization of mixup with provably and demonstrably desirable properties that allows convex combinations of multiple samples, leading to more realistic and diverse outputs that incorporate information from the original samples by using a p-series interpolant. We show that, compared to mixup, ζ-mixup better preserves the intrinsic dimensionality of the original datasets, which is a desirable property for training generalizable models. Furthermore, we show that our implementation of ζ-mixup is faster than mixup, and extensive evaluation on controlled synthetic and 26 diverse real-world natural and medical image classification datasets shows that ζ-mixup outperforms mixup, CutMix, and traditional data augmentation techniques. The code will be released at https://github.com/kakumarabhishek/zeta-mixup.
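To make the contrast in the abstract concrete, the following is a minimal NumPy sketch, not the authors' released implementation. It assumes that a "p-series interpolant" means weights proportional to i^(-gamma) for i = 1..T, normalized to sum to one and assigned to samples through a random permutation. The function names, the `gamma` parameter, and its default value are illustrative assumptions rather than details stated in the abstract.

```python
import numpy as np


def pairwise_mixup(x1, y1, x2, y2, alpha=1.0, rng=None):
    """Pairwise mixup: one convex combination of two samples and their labels."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)  # mixing coefficient in [0, 1]
    return lam * x1 + (1.0 - lam) * x2, lam * y1 + (1.0 - lam) * y2


def p_series_weights(T, gamma):
    """Normalized p-series weights w_i = i^(-gamma) / sum_j j^(-gamma), i = 1..T."""
    w = np.arange(1, T + 1, dtype=np.float64) ** (-gamma)
    return w / w.sum()


def multi_sample_mix(X, Y, gamma=2.4, rng=None):
    """Mix T samples so each output is a convex combination of all T inputs.

    X: (T, D) array of flattened samples; Y: (T, K) array of one-hot labels.
    gamma=2.4 is an assumed default, chosen here only for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    T = X.shape[0]
    w = p_series_weights(T, gamma)
    # Each row of W is the same weight vector under an independent random
    # permutation, so every row is non-negative and sums to 1.
    W = np.stack([w[rng.permutation(T)] for _ in range(T)])
    return W @ X, W @ Y
```

Because the weights decay quickly for larger `gamma`, each synthetic sample in this sketch stays close to one original sample while still drawing a small contribution from all the others; smaller values of `gamma` spread the weight more evenly.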


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10960781
DOI: http://dx.doi.org/10.1186/s40537-024-00898-6

