Synthesizing photo-realistic images from text descriptions is a challenging task in computer vision. Although generative adversarial networks have made significant breakthroughs in this task, they still struggle to generate high-quality, visually realistic images that are consistent with the semantics of the text. Existing text-to-image methods generally proceed in two steps: they first generate an initial image with a rough outline and color, and then gradually refine it into a high-resolution image. One drawback of these methods is that, if the quality of the initial image is low, it is hard to generate a satisfactory high-resolution image. In this paper, we propose SAM-GAN, Self-Attention supporting Multi-stage Generative Adversarial Networks, for text-to-image synthesis. With the self-attention mechanism, the model can establish multi-level dependencies within the image and fuse the sentence- and word-level visual-semantic vectors to improve the quality of the generated image. Furthermore, a multi-stage perceptual loss is introduced to enhance the semantic similarity between the synthesized image and the real image, thus enhancing the visual-semantic consistency between text and images. To promote diversity in the generated images, a mode seeking regularization term is integrated into the model. The results of extensive experiments and ablation studies, conducted on the Caltech-UCSD Birds and Microsoft Common Objects in Context datasets, show that our model is superior to competitive models in text-to-image synthesis.
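The abstract does not give the form of the mode seeking regularization term. As a rough illustration only, the sketch below follows the general idea from the mode-seeking GAN literature: penalize the generator when two different latent codes produce nearly identical images. The function names, the L1 distance, and the `eps` constant are assumptions made for this example and are not taken from SAM-GAN itself.

```python
from typing import Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute (L1) distance between two equal-length flat vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mode_seeking_loss(
    img1: Sequence[float],
    img2: Sequence[float],
    z1: Sequence[float],
    z2: Sequence[float],
    eps: float = 1e-5,
) -> float:
    """Hypothetical mode seeking term, to be minimized by the generator.

    img1 and img2 are flattened images generated from latent codes z1 and z2
    under the same text condition. The ratio of image distance to latent
    distance is large when the outputs are diverse, so its reciprocal is
    small for diverse outputs and large for collapsed ones.
    """
    ratio = mean_abs_diff(img1, img2) / (mean_abs_diff(z1, z2) + eps)
    return 1.0 / (ratio + eps)


# Two distinct latent codes that map to the same image (mode collapse)
collapsed = mode_seeking_loss([0.5, 0.5], [0.5, 0.5], [0.0, 1.0], [1.0, 0.0])
# The same latent codes mapping to clearly different images
diverse = mode_seeking_loss([0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0])
```

In a training loop, a term like this would typically be added to the generator objective with a weighting coefficient; the weight and the distance metric are design choices that the abstract does not specify.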
DOI: http://dx.doi.org/10.1016/j.neunet.2021.01.023
Med Biol Eng Comput
January 2025
School of Automation and Information Engineering, Sichuan University of Science & Engineering, Key Laboratory of Artificial Intelligence, Yibin, 644000, Sichuan, China.
Accurately classifying optical coherence tomography (OCT) images is essential for diagnosing and treating ophthalmic diseases. This paper introduces a novel generative adversarial network framework called MGR-GAN. The masked image modeling (MIM) method is integrated into the GAN model's generator, enhancing its ability to synthesize more realistic images by reconstructing them based on unmasked patches.
Sensors (Basel)
January 2025
Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK.
A generative adversarial network (GAN) makes it possible to map a data sample from one domain to another. It has been extensively employed in image-to-image and text-to-image translation. We propose an EEG-to-EEG translation model to map the scalp-mounted EEG (scEEG) sensor signals to intracranial EEG (iEEG) sensor signals recorded by foramen ovale sensors inserted into the brain.
Sensors (Basel)
January 2025
Department of Electrical Engineering, American University of Sharjah, Sharjah 26666, United Arab Emirates.
Accurately identifying and discriminating between different brain states is a major emphasis of functional brain imaging research. Various machine learning techniques play an important role in this regard. However, when working with a small number of study participants, the lack of sufficient data and achieving meaningful classification results remain a challenge.
Materials (Basel)
January 2025
Hubei Key Laboratory of Plasma Chemistry and Advanced Materials, School of Materials Science and Engineering, Wuhan Institute of Technology, Wuhan 430205, China.
The grain size of metal materials has a significant impact on their macroscopic properties. However, original metallographic images often suffer from issues such as substantial noise, missing grain boundaries, low contrast, and blurred edges. These challenges hinder the accurate extraction of complete grain boundaries, limiting the precision of grain size measurement and material performance prediction.
PLoS One
January 2025
School of Information Science and Engineering, Xinjiang University, Urumqi, China.
Anomaly detection is crucial in areas such as financial fraud identification, cybersecurity defense, and health monitoring, as it directly affects the accuracy and security of decision-making. Existing generative adversarial nets (GANs)-based anomaly detection methods overlook the importance of local density, limiting their effectiveness in detecting anomaly objects in complex data distributions. To address this challenge, we introduce a generative adversarial local density-based anomaly detection (GALD) method, which combines the data distribution modeling capabilities of GANs with local synthetic density analysis.