Effectively compressing transmitted images while limiting the distortion of reconstructed images are the central challenges in image semantic communication. This paper proposes a novel image semantic communication model that integrates a dynamic decision generation network with a generative adversarial network to address both challenges. At the transmitter, semantic encoding extracts image features, and the dynamic decision generation network selects which features to transmit based on the channel's signal-to-noise ratio (SNR); this selective compression substantially reduces communication traffic. At the receiver, the generator/decoder is trained jointly with a discriminator network, improving reconstruction quality through adversarial and perceptual losses. Experiments on the CIFAR-10 dataset show that the scheme achieves a peak signal-to-noise ratio (PSNR) of 26 dB, a structural similarity (SSIM) of 0.9, and a compression ratio (CR) of 81.5% over an AWGN channel at an SNR of 3 dB; over a Rayleigh fading channel, it achieves a PSNR of 23 dB, an SSIM of 0.8, and a CR of 80.5%. The learned perceptual image patch similarity (LPIPS) remains below 0.008 in both channels. These results demonstrate that the proposed scheme is a strong deep learning-based joint source-channel coding method, offering a high CR with low reconstruction distortion.
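The quality and compression figures above follow from standard definitions. The sketch below is illustrative only (the function names and toy data are ours, not from the paper): it computes PSNR from the mean squared error and a compression ratio, assuming CR is defined as the fraction of traffic saved relative to the uncompressed image (a common convention; the paper's exact definition may differ).

```python
import math

def psnr(original, reconstructed, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return float("inf")  # identical images: no distortion
    return 10.0 * math.log10(max_val ** 2 / mse)

def compression_ratio(original_bytes, transmitted_bytes):
    """Fraction of traffic saved: 0 means no saving, 1 means nothing transmitted."""
    return 1.0 - transmitted_bytes / original_bytes

# Toy example: a 32x32x3 image (CIFAR-10 size) flattened to 3072 pixel values,
# reconstructed with a small error in a single pixel.
original = [128] * 3072
reconstructed = [138] + [128] * 3071

print(f"PSNR: {psnr(original, reconstructed):.1f} dB")
print(f"CR:   {compression_ratio(3072, 600):.1%}")
```

Under this convention, transmitting 600 bytes in place of a 3072-byte CIFAR-10 image yields a CR of roughly 80.5%, matching the magnitude of the ratios reported above.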


Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11344051 (PMC)
http://dx.doi.org/10.1038/s41598-024-70619-9 (DOI)
