High-frequency (HF) signals are ubiquitous in the industrial world and are valuable for monitoring industrial assets. Most deep-learning tools are designed for inputs of fixed and/or very limited size, and many successful industrial applications of deep learning instead take extracted features as inputs: compact representations of the original signal that are obtained manually and often arduously. In this paper, we propose a fully unsupervised deep-learning framework that is able to extract a meaningful and sparse representation of raw HF signals. We embed in our architecture important properties of the fast discrete wavelet transform (FDWT) such as 1) the cascade algorithm; 2) the conjugate quadrature filter property that links together the wavelet, scaling, and transposed filter functions; and 3) the coefficient denoising. Using deep learning, we make this architecture fully learnable: Both the wavelet bases and the wavelet coefficient denoising become learnable. To achieve this objective, we propose an activation function that performs a learnable hard thresholding of the wavelet coefficients. With our framework, the denoising FDWT becomes a fully learnable unsupervised tool that does not require any type of pre- or postprocessing or any prior knowledge of the wavelet transform. We demonstrate the benefits of embedding all these properties on three machine-learning tasks performed on open-source sound datasets. We perform an ablation study of the impact of each property on the performance of the architecture, achieve results well above baseline, and outperform other state-of-the-art methods.
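The three FDWT properties named in the abstract can be illustrated with a small, non-learnable NumPy sketch. This is not the authors' architecture. It assumes a fixed Haar scaling filter, periodic signal extension, hand-picked thresholds, and the common conjugate quadrature convention g[n] = (-1)^n h[L-1-n]. In the paper both the filters and the thresholds are learned, and its differentiable thresholding activation is not reproduced here. All function names are hypothetical.

```python
import numpy as np

def cqf_highpass(h):
    """Wavelet (high-pass) filter from the scaling (low-pass) filter,
    using the assumed convention g[n] = (-1)^n * h[L-1-n]."""
    h = np.asarray(h, dtype=float)
    return ((-1.0) ** np.arange(len(h))) * h[::-1]

def hard_threshold(c, tau):
    """Keep coefficients whose magnitude exceeds tau, zero the rest."""
    return np.where(np.abs(c) > tau, c, 0.0)

def _window_index(n, filt_len):
    # Row k holds the periodic sample positions 2k, 2k+1, ..., 2k+L-1.
    return (2 * np.arange(n // 2)[:, None] + np.arange(filt_len)[None, :]) % n

def analysis_step(x, h, g):
    """One cascade level: filter with h and g, then downsample by 2."""
    windows = x[_window_index(len(x), len(h))]
    return windows @ h, windows @ g

def synthesis_step(approx, detail, h, g):
    """Inverse of analysis_step: apply the transposed filters."""
    n = 2 * len(approx)
    x = np.zeros(n)
    contrib = approx[:, None] * h[None, :] + detail[:, None] * g[None, :]
    np.add.at(x, _window_index(n, len(h)), contrib)  # accumulates overlaps
    return x

def denoising_fdwt(x, h, taus):
    """Cascade algorithm: repeatedly split the approximation branch and
    hard-threshold the detail coefficients (one threshold per level).
    len(x) must be divisible by 2 ** len(taus)."""
    g = cqf_highpass(h)
    approx = np.asarray(x, dtype=float)
    details = []
    for tau in taus:
        approx, detail = analysis_step(approx, h, g)
        details.append(hard_threshold(detail, tau))
    return approx, details

def inverse_fdwt(approx, details, h):
    """Rebuild the signal from the coarsest level back to the finest."""
    g = cqf_highpass(h)
    for detail in reversed(details):
        approx = synthesis_step(approx, detail, h, g)
    return approx
```

With the Haar filter `np.array([1.0, 1.0]) / np.sqrt(2.0)` and all thresholds set to zero, `inverse_fdwt` should return the input signal up to floating-point error. Raising a level's threshold zeroes the small detail coefficients at that scale, which is the sparsifying, denoising effect the abstract refers to.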
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8872732 | PMC |
http://dx.doi.org/10.1073/pnas.2106598119 | DOI Listing |
Sensors (Basel)
December 2024
School of Geosciences and Info-Physics, Central South University, Changsha 410083, China.
Traditional multimodal contrastive learning brings text and its corresponding image closer together as a positive pair, where the text typically consists of fixed sentence structures or specific descriptive statements, and the image features are generally global features (with some fine-grained work using local features). Similar to unimodal self-supervised contrastive learning, this approach can be seen as enforcing a strict identity constraint in a multimodal context. However, due to the inherent complexity of remote sensing images, which cannot be easily described in a single sentence, and the fact that remote sensing images contain rich ancillary information beyond just object features, this strict identity constraint may be insufficient.
Many optical applications require accurate control over a beam's spatial intensity profile; in particular, achieving uniform irradiance across a target area can be critically important for nonlinear optical processes such as laser machining. This paper introduces a novel control algorithm for Digital Micromirror Devices (DMDs) that simultaneously and adaptively modulates both the intensity and the spatial intensity profile of an incident beam with random and intricate intensity variations in a single step. The algorithm treats each micromirror within the DMD as an independent Bernoulli distribution characterized by a learnable parameter.
Artif Intell Med
December 2024
School of Computer Science and Engineering, UNSW, Sydney, Australia.
Accurate segmentation of skin lesions within dermoscopic images plays a crucial role in the timely identification of skin cancer for computer-aided diagnosis on mobile platforms. However, varying shapes of the lesions, lack of defined edges, and the presence of obstructions such as hair strands and marker colours make this challenge more complex. Additionally, skin lesions often exhibit subtle variations in texture and colour that are difficult to differentiate from surrounding healthy skin, necessitating models that can capture both fine-grained details and broader contextual information.
J Med Imaging (Bellingham)
November 2024
Johns Hopkins University, Department of Electrical and Computer Engineering, Baltimore, Maryland, United States.
Comput Med Imaging Graph
December 2024
Christian Doppler Laboratory for Artificial Intelligence in Retina, Department of Ophthalmology and Optometry, Medical University of Vienna, Austria; Institute of Artificial Intelligence, Center for Medical Data Science, Medical University of Vienna, Austria.
Background And Objective: Despite the promising capabilities of 3D transformer architectures in video analysis, their application to high-resolution 3D medical volumes encounters several challenges. One major limitation is the high number of 3D patches, which reduces the efficiency of the global self-attention mechanisms of transformers. Additionally, background information can distract vision transformers from focusing on crucial areas of the input image, thereby introducing noise into the final representation.