Effectively compressing transmitted images and reducing the distortion of reconstructed images are key challenges in image semantic communication. This paper proposes a novel image semantic communication model that integrates a dynamic decision generation network and a generative adversarial network (GAN) to address these challenges. At the transmitter, features are extracted and selected according to the channel's signal-to-noise ratio (SNR) using semantic encoding and the dynamic decision generation network. This semantic approach effectively compresses transmitted images, thereby reducing communication traffic. At the receiver, the generator/decoder works with a discriminator network, improving image reconstruction quality through adversarial and perceptual losses. Experimental results on the CIFAR-10 dataset show that the scheme achieves a peak signal-to-noise ratio (PSNR) of 26 dB, a structural similarity (SSIM) of 0.9, and a compression ratio (CR) of 81.5% in an AWGN channel with an SNR of 3 dB; in a Rayleigh fading channel, the PSNR is 23 dB, the SSIM is 0.8, and the CR is 80.5%. The learned perceptual image patch similarity (LPIPS) in both channels is below 0.008. These results demonstrate that the proposed semantic communication scheme is a strong deep learning-based joint source-channel coding method, offering a high CR and low distortion of reconstructed images.
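In the paper, the dynamic decision generation network *learns* which features to transmit for a given channel SNR; as a rough illustration of the idea, the sketch below uses a hypothetical hand-coded policy (keep a fraction of the highest-energy feature channels, with the fraction growing with SNR) and computes the resulting compression ratio. The policy, thresholds, and feature shapes are assumptions, not the paper's method.

```python
import numpy as np

def select_features(features, snr_db, snr_min=0.0, snr_max=20.0):
    """Keep a fraction of feature channels that grows with channel SNR.

    Hypothetical stand-in for the paper's learned dynamic decision network:
    at low SNR we transmit fewer (only the highest-energy) channels.
    """
    frac = np.clip((snr_db - snr_min) / (snr_max - snr_min), 0.1, 1.0)
    k = max(1, int(round(frac * features.shape[0])))
    # Rank channels by total absolute activation and keep the top k.
    energy = np.abs(features).sum(axis=(1, 2))
    idx = np.argsort(-energy)[:k]
    return features[np.sort(idx)], np.sort(idx)

rng = np.random.default_rng(0)
f = rng.normal(size=(64, 8, 8))          # assumed encoder output: 64 channels
kept, idx = select_features(f, snr_db=3.0)
cr = 1.0 - kept.size / f.size            # fraction of symbols *not* transmitted
```

At an SNR of 3 dB this toy policy keeps 10 of 64 channels, i.e. a CR of roughly 84%, in the same regime as the ~81.5% the paper reports, though the correspondence is illustrative only.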
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11344051
DOI: http://dx.doi.org/10.1038/s41598-024-70619-9
Codas
January 2025
Departamento de Fonoaudiologia, Universidade Federal de Santa Maria - UFSM - Santa Maria (RS), Brasil.
Purpose: This study aimed to adapt the Montreal Cognitive Assessment Hearing Impaired (MoCA-H) into Brazilian Portuguese (BP).
Methods: This was a descriptive, cross-sectional, quantitative, and qualitative study involving participants selected by convenience. The instrument was adapted from its original version, in a six-stage process consisting of the following: Stage 1 - Translation and back translation of the MoCA-H; Stage 2 - Stimulus analysis and selection; Stage 3 - Semantic analysis of stimuli; Stage 4 - Analysis by non-expert judges, part 1; Stage 5 - Analysis by non-expert judges, part 2; Stage 6 - Pilot study.
Front Res Metr Anal
January 2025
Centre for Postgraduate Studies, Cape Peninsula University of Technology, Cape Town, South Africa.
Big Data communication researchers have highlighted the need for qualitative analysis of online science conversations to better understand their meaning. However, a scholarly gap exists in exploring how qualitative methods can be applied to small data regarding micro-bloggers' communications about science articles. While social media attention assists with article dissemination, qualitative research into the associated microblogging practices remains limited.
Elife
January 2025
State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University & IDG/McGovern Institute for Brain Research, Beijing, China.
Speech comprehension involves the dynamic interplay of multiple cognitive processes, from basic sound perception, to linguistic encoding, and finally to complex semantic-conceptual interpretation. How the brain handles these diverse streams of information processing remains poorly understood. Applying Hidden Markov Modeling to fMRI data obtained during spoken narrative comprehension, we reveal that whole-brain networks predominantly oscillate within a tripartite latent state space.
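A Hidden Markov Model of this kind assigns each fMRI timepoint to one of a few latent brain states and decodes the most likely state sequence. As a minimal, self-contained sketch (synthetic data, not the study's pipeline or parameters), the Viterbi algorithm below recovers a three-state path from per-timepoint state log-likelihoods:

```python
import numpy as np

def viterbi(log_emissions, log_trans, log_init):
    """Most likely hidden-state path for an HMM, in the log domain.

    log_emissions: (T, K) log-likelihood of each state at each timepoint
    log_trans:     (K, K) log transition probabilities, rows = from-state
    log_init:      (K,)   log initial-state probabilities
    """
    T, K = log_emissions.shape
    delta = log_init + log_emissions[0]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans      # scores[i, j]: best path ending i -> j
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(K)] + log_emissions[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):               # backtrack through stored argmaxes
        path[t - 1] = back[t, path[t]]
    return path

K = 3                                            # "tripartite" latent state space
log_init = np.log(np.full(K, 1.0 / K))
trans = np.full((K, K), 0.05)
np.fill_diagonal(trans, 0.9)                     # sticky states, as brain states tend to be
log_trans = np.log(trans)
true_path = np.array([0, 0, 1, 1, 2, 2])
log_em = np.full((6, K), -5.0)                   # synthetic evidence strongly favoring true_path
log_em[np.arange(6), true_path] = 0.0
path = viterbi(log_em, log_trans, log_init)      # recovers [0, 0, 1, 1, 2, 2]
```

Real applications fit the emission and transition parameters to the data (e.g. via expectation-maximization) rather than fixing them by hand as done here.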
Hum Brain Mapp
February 2025
Université libre de Bruxelles (ULB), UNI - ULB Neuroscience Institute, Laboratoire de Neuroanatomie et Neuroimagerie translationnelles (LN2T), Brussels, Belgium.
Language control processes allow for the flexible manipulation and access to context-appropriate verbal representations. Functional magnetic resonance imaging (fMRI) studies have localized the brain regions involved in language control processes usually by comparing high vs. low lexical-semantic control conditions during verbal tasks.
Sci Rep
January 2025
Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of MOE, Minzu University of China, Beijing, 100081, China.
Speech-to-speech translation (S2ST) has evolved from cascade systems, which chain Automatic Speech Recognition (ASR), Machine Translation (MT), and Text-to-Speech (TTS), to end-to-end models. This evolution has been driven by advances in model performance and the expansion of cross-lingual speech datasets. Given the paucity of research on Tibetan speech translation, this paper tackles the challenge of direct Tibetan-to-Chinese speech-to-speech translation within a multi-task learning framework, employing self-supervised learning (SSL) and sequence-to-sequence model training.
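In a multi-task learning framework like this, one training objective scalarizes the primary S2ST loss together with auxiliary task losses (e.g. ASR or MT supervision). The sketch below shows only that weighted combination; the task names, loss values, and weights are illustrative assumptions, not the paper's configuration.

```python
def multitask_objective(task_losses, task_weights):
    """Combine per-task losses into a single training objective.

    task_losses / task_weights: dicts keyed by task name.
    """
    assert set(task_losses) == set(task_weights), "every task needs a weight"
    return sum(task_weights[t] * task_losses[t] for t in task_losses)

# Hypothetical per-batch losses: primary S2ST plus two auxiliary tasks.
total = multitask_objective(
    {"s2st": 2.0, "asr_aux": 1.5, "mt_aux": 1.2},
    {"s2st": 1.0, "asr_aux": 0.3, "mt_aux": 0.3},
)
```

Down-weighting the auxiliary tasks keeps them as regularizers that shape shared representations without dominating the primary translation objective.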