In the particle detectors at the Large Hadron Collider, hundreds of millions of proton-proton collisions are produced every second. If one could store the whole data stream produced in these collisions, tens of terabytes of data would be written to disk every second. The general-purpose experiments ATLAS and CMS reduce this overwhelming data volume to a sustainable level by deciding in real time whether each collision event should be kept for further analysis or discarded. We introduce a dataset of proton collision events that emulates a typical data stream collected by such a real-time processing system, pre-filtered by requiring the presence of at least one electron or muon. This dataset could be used to develop novel event selection strategies and assess their sensitivity to new phenomena. In particular, we intend to stimulate a community-based effort towards the design of novel algorithms for performing unsupervised new physics detection, customized to fit the bandwidth, latency and computational resource constraints of the real-time event selection system of a typical particle detector.
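The pre-filtering step described above (keep an event only if it contains at least one electron or muon) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Particle` and `Event` classes, the `kind` and `pt` fields, and the optional `min_pt` threshold are hypothetical and do not reflect the dataset's actual file format or the selection criteria used by ATLAS or CMS.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List

# Hypothetical event model; the real dataset's layout is not assumed here.
@dataclass
class Particle:
    kind: str   # e.g. "electron", "muon", "jet"
    pt: float   # transverse momentum in GeV (illustrative field)

@dataclass
class Event:
    event_id: int
    particles: List[Particle]

LEPTON_KINDS = {"electron", "muon"}

def has_lepton(event: Event, min_pt: float = 0.0) -> bool:
    """Return True if the event holds at least one electron or muon.

    min_pt is an illustrative threshold, not a value taken from the source.
    """
    return any(
        p.kind in LEPTON_KINDS and p.pt >= min_pt for p in event.particles
    )

def prefilter(events: Iterable[Event], min_pt: float = 0.0) -> Iterator[Event]:
    """Lazily yield only the events that pass the lepton requirement."""
    for event in events:
        if has_lepton(event, min_pt):
            yield event

# Small made-up sample: events 1 and 3 contain a lepton, event 2 does not.
sample = [
    Event(1, [Particle("muon", 30.0), Particle("jet", 55.0)]),
    Event(2, [Particle("jet", 40.0), Particle("jet", 25.0)]),
    Event(3, [Particle("electron", 12.0)]),
]

kept_ids = [event.event_id for event in prefilter(sample)]
kept_ids_hard = [event.event_id for event in prefilter(sample, min_pt=20.0)]
```

Writing the filter as a generator means each event is examined once and dropped immediately if it fails, which loosely mirrors the keep-or-discard decision a real-time selection system makes per collision.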

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9070018
DOI: http://dx.doi.org/10.1038/s41597-022-01187-8

Similar Publications

The discrete empirical interpolation method (DEIM) is well-established as a means of performing model order reduction in approximating solutions to differential equations, but it has also more recently demonstrated potential in performing data class detection through subset selection. Leveraging the singular value decomposition for dimension reduction, DEIM uses interpolatory projection to identify the representative rows and/or columns of a data matrix. This approach has been adapted to develop additional algorithms, including a CUR matrix factorization for performing dimension reduction while preserving the interpretability of the data.

Mid-infrared photoacoustic microscopy can capture biochemical information without staining. However, the long mid-infrared optical wavelengths make the spatial resolution of photoacoustic microscopy significantly poorer than that of conventional confocal fluorescence microscopy. Here, we demonstrate an explainable deep learning-based unsupervised inter-domain transformation of low-resolution unlabeled mid-infrared photoacoustic microscopy images into confocal-like virtually fluorescence-stained high-resolution images.

Automatic Prediction of Molecular Properties Using Substructure Vector Embeddings within a Feature Selection Workflow.

J Chem Inf Model

December 2024

Cavendish Laboratory, Department of Physics, University of Cambridge, J. J. Thomson Avenue, Cambridge CB3 0HE, U.K.

Machine learning (ML) methods provide a pathway to accurately predict molecular properties, leveraging patterns derived from structure-property relationships within materials databases. This approach holds significant importance in drug discovery and materials design, where the rapid, efficient screening of molecules can accelerate the development of new pharmaceuticals and chemical materials for highly specialized target applications. Unsupervised and self-supervised learning methods applied to graph-based or geometric models have garnered considerable traction.

Group-equivariant neural networks have emerged as an efficient approach to model complex data, using generalized convolutions that respect the relevant symmetries of a system. These techniques have made advances in both the supervised learning tasks for classification and regression, and the unsupervised tasks to generate new data. However, little work has been done in leveraging the symmetry-aware expressive representations that could be extracted from these approaches.

DiffMAP-GP: Continuous 2D Diffusion Maps from Particle Trajectories without Data Binning using Gaussian Processes.

Biophys Rep (N Y)

December 2024

Center for Biological Physics, Arizona State University, USA; Department of Physics, Arizona State University, USA; School of Molecular Sciences, Arizona State University.

Diffusion coefficients often vary across regions, such as cellular membranes, and quantifying their variation can provide valuable insight into local membrane properties such as composition and stiffness. Toward quantifying diffusion coefficient spatial maps and uncertainties from particle tracks, we develop a Bayesian framework (DiffMAP-GP) by placing Gaussian Process (GP) priors on the family of candidate maps. For the sake of computational efficiency, we leverage inducing point methods on GPs arising from the mathematical structure of the data giving rise to non-conjugate likelihood-prior pairs.
