Effectively compressing transmitted images and reducing the distortion of reconstructed images are key challenges in image semantic communication. This paper proposes a novel image semantic communication model that integrates a dynamic decision generation network and a generative adversarial network (GAN) to address these challenges. At the transmitter, features are extracted and selected according to the channel's signal-to-noise ratio (SNR) using semantic encoding and the dynamic decision generation network. This semantic approach effectively compresses transmitted images, thereby reducing communication traffic. At the receiver, the generator/decoder works with a discriminator network, improving image reconstruction quality through adversarial and perceptual losses. Experimental results on the CIFAR-10 dataset show that the scheme achieves a peak signal-to-noise ratio (PSNR) of 26 dB, a structural similarity (SSIM) of 0.9, and a compression ratio (CR) of 81.5% in an AWGN channel with an SNR of 3 dB; in a Rayleigh fading channel, the PSNR is 23 dB, the SSIM is 0.8, and the CR is 80.5%. The learned perceptual image patch similarity (LPIPS) in both channels is below 0.008. These results demonstrate that the proposed semantic communication scheme is a strong deep learning-based joint source-channel coding method, offering a high CR and low distortion of reconstructed images.
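In the paper, the dynamic decision generation network *learns* which features to transmit for a given channel SNR; as a rough illustration of the idea, the sketch below uses a hypothetical hand-coded policy (keep a fraction of the highest-energy feature channels, with the fraction growing with SNR) and computes the resulting compression ratio. The policy, thresholds, and feature shapes are assumptions, not the paper's method.

```python
import numpy as np

def select_features(features, snr_db, snr_min=0.0, snr_max=20.0):
    """Keep a fraction of feature channels that grows with channel SNR.

    Hypothetical stand-in for the paper's learned dynamic decision network:
    at low SNR we transmit fewer (only the highest-energy) channels.
    """
    frac = np.clip((snr_db - snr_min) / (snr_max - snr_min), 0.1, 1.0)
    k = max(1, int(round(frac * features.shape[0])))
    # Rank channels by total absolute activation and keep the top k.
    energy = np.abs(features).sum(axis=(1, 2))
    idx = np.argsort(-energy)[:k]
    return features[np.sort(idx)], np.sort(idx)

rng = np.random.default_rng(0)
f = rng.normal(size=(64, 8, 8))          # assumed encoder output: 64 channels
kept, idx = select_features(f, snr_db=3.0)
cr = 1.0 - kept.size / f.size            # fraction of symbols *not* transmitted
```

At an SNR of 3 dB this toy policy keeps 10 of 64 channels, i.e. a CR of roughly 84%, in the same regime as the ~81.5% the paper reports, though the correspondence is illustrative only.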
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11344051
DOI: http://dx.doi.org/10.1038/s41598-024-70619-9
Codas
January 2025
Departamento de Fonoaudiologia, Universidade Federal de Santa Maria - UFSM - Santa Maria (RS), Brasil.
Purpose: This study aimed to adapt the Montreal Cognitive Assessment Hearing Impaired (MoCA-H) into Brazilian Portuguese (BP).
Methods: This was a descriptive, cross-sectional, quantitative, and qualitative study involving participants selected by convenience. The instrument was adapted from its original version, in a six-stage process consisting of the following: Stage 1 - Translation and back translation of the MoCA-H; Stage 2 - Stimulus analysis and selection; Stage 3 - Semantic analysis of stimuli; Stage 4 - Analysis by non-expert judges, part 1; Stage 5 - Analysis by non-expert judges, part 2; Stage 6 - Pilot study.
Front Res Metr Anal
January 2025
Centre for Postgraduate Studies, Cape Peninsula University of Technology, Cape Town, South Africa.
Big Data communication researchers have highlighted the need for qualitative analysis of online science conversations to better understand their meaning. However, a scholarly gap exists in exploring how qualitative methods can be applied to small data regarding micro-bloggers' communications about science articles. While social media attention assists with article dissemination, qualitative research into the associated microblogging practices remains limited.
Elife
January 2025
State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University & IDG/McGovern Institute for Brain Research, Beijing, China.
Speech comprehension involves the dynamic interplay of multiple cognitive processes, from basic sound perception, to linguistic encoding, and finally to complex semantic-conceptual interpretation. How the brain handles these diverse streams of information processing remains poorly understood. Applying Hidden Markov Modeling to fMRI data obtained during spoken narrative comprehension, we reveal that whole-brain networks predominantly oscillate within a tripartite latent state space.
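A Hidden Markov Model of this kind assigns each fMRI timepoint to one of a few latent brain states and decodes the most likely state sequence. As a minimal, self-contained sketch (synthetic data, not the study's pipeline or parameters), the Viterbi algorithm below recovers a three-state path from per-timepoint state log-likelihoods:

```python
import numpy as np

def viterbi(log_emissions, log_trans, log_init):
    """Most likely hidden-state path for an HMM, in the log domain.

    log_emissions: (T, K) log-likelihood of each state at each timepoint
    log_trans:     (K, K) log transition probabilities, rows = from-state
    log_init:      (K,)   log initial-state probabilities
    """
    T, K = log_emissions.shape
    delta = log_init + log_emissions[0]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans      # scores[i, j]: best path ending i -> j
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(K)] + log_emissions[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):               # backtrack through stored argmaxes
        path[t - 1] = back[t, path[t]]
    return path

K = 3                                            # "tripartite" latent state space
log_init = np.log(np.full(K, 1.0 / K))
trans = np.full((K, K), 0.05)
np.fill_diagonal(trans, 0.9)                     # sticky states, as brain states tend to be
log_trans = np.log(trans)
true_path = np.array([0, 0, 1, 1, 2, 2])
log_em = np.full((6, K), -5.0)                   # synthetic evidence strongly favoring true_path
log_em[np.arange(6), true_path] = 0.0
path = viterbi(log_em, log_trans, log_init)      # recovers [0, 0, 1, 1, 2, 2]
```

Real applications fit the emission and transition parameters to the data (e.g. via expectation-maximization) rather than fixing them by hand as done here.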
Hum Brain Mapp
February 2025
Université libre de Bruxelles (ULB), UNI - ULB Neuroscience Institute, Laboratoire de Neuroanatomie et Neuroimagerie translationnelles (LN2T), Brussels, Belgium.
Language control processes allow for the flexible manipulation and access to context-appropriate verbal representations. Functional magnetic resonance imaging (fMRI) studies have localized the brain regions involved in language control processes usually by comparing high vs. low lexical-semantic control conditions during verbal tasks.
Sci Rep
January 2025
Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of MOE, Minzu University of China, Beijing, 100081, China.
Speech-to-speech translation (S2ST) has evolved from cascade systems, which chain Automatic Speech Recognition (ASR), Machine Translation (MT), and Text-to-Speech (TTS), to end-to-end models. This evolution has been driven by advances in model performance and the expansion of cross-lingual speech datasets. Given the paucity of research on Tibetan speech translation, this paper tackles the challenge of direct Tibetan-to-Chinese speech-to-speech translation within a multi-task learning framework, employing self-supervised learning (SSL) and sequence-to-sequence model training.
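In a multi-task learning framework like this, one training objective scalarizes the primary S2ST loss together with auxiliary task losses (e.g. ASR or MT supervision). The sketch below shows only that weighted combination; the task names, loss values, and weights are illustrative assumptions, not the paper's configuration.

```python
def multitask_objective(task_losses, task_weights):
    """Combine per-task losses into a single training objective.

    task_losses / task_weights: dicts keyed by task name.
    """
    assert set(task_losses) == set(task_weights), "every task needs a weight"
    return sum(task_weights[t] * task_losses[t] for t in task_losses)

# Hypothetical per-batch losses: primary S2ST plus two auxiliary tasks.
total = multitask_objective(
    {"s2st": 2.0, "asr_aux": 1.5, "mt_aux": 1.2},
    {"s2st": 1.0, "asr_aux": 0.3, "mt_aux": 0.3},
)
```

Down-weighting the auxiliary tasks keeps them as regularizers that shape shared representations without dominating the primary translation objective.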