Publications by Gemma Roig | LitMetric

Publications by authors named "Gemma Roig"

Evaluating Medical Image Segmentation Models Using Augmentation.

Mattin Sayed, Sari Saba-Sadiya, Benedikt Wichtlhuber, Julia Dietz, Matthias Neitzel, Gemma Roig

Tomography

December 2024

Background: Medical image segmentation is an essential step in both clinical and research applications, and automated segmentation models such as TotalSegmentator have become ubiquitous. However, robust methods for validating the accuracy of these models remain limited, and manual inspection is often necessary before the segmentation masks they produce can be used.

Methods: To address this gap, we have developed a novel validation framework for segmentation models, leveraging data augmentation to assess model consistency.

Generative adversarial collaborations: a new model of scientific discourse.

Benjamin Peters, Gunnar Blohm, Ralf Haefner, Leyla Isik, Nikolaus Kriegeskorte, Gemma Roig

Trends Cogn Sci

December 2024

Modeling short visual events through the BOLD moments video fMRI dataset and metadata.

Benjamin Lahner, Kshitij Dwivedi, Polina Iamshchinina, Monika Graumann, Alex Lascelles, Gemma Roig

Nat Commun

July 2024

Studying the neural basis of human dynamic visual perception requires extensive experimental data to evaluate the large swathes of functionally diverse brain neural networks driven by perceiving visual events. Here, we introduce the BOLD Moments Dataset (BMD), a repository of whole-brain fMRI responses to over 1000 short (3 s) naturalistic video clips of visual events across ten human subjects. We use the videos' extensive metadata to show how the brain represents word- and sentence-level descriptions of visual events and identify correlates of video memorability scores extending into the parietal cortex.

EMOKINE: A software package and computational framework for scaling up the creation of highly controlled emotional full-body movement datasets.

Julia F Christensen, Andrés Fernández, Rebecca A Smith, Georgios Michalareas, Sina H N Yazdi, Gemma Roig

Behav Res Methods

October 2024

EMOKINE is a software package and dataset creation suite for emotional full-body movement research in experimental psychology, affective neuroscience, and computer vision. A computational framework, comprehensive instructions, a pilot dataset, observer ratings, and kinematic feature extraction code are provided to facilitate future dataset creations at scale. In addition, the EMOKINE framework outlines how complex sequences of movements may advance emotion research.

Visual features are processed before navigational affordances in the human brain.

Kshitij Dwivedi, Sari Sadiya, Marta P Balode, Gemma Roig, Radoslaw M Cichy

Sci Rep

March 2024

To navigate through their immediate environment humans process scene information rapidly. How does the cascade of neural processing elicited by scene viewing to facilitate navigational planning unfold over time? To investigate, we recorded human brain responses to visual scenes with electroencephalography and related those to computational models that operationalize three aspects of scene processing (2D, 3D, and semantic information), as well as to a behavioral model capturing navigational affordances. We found a temporal processing hierarchy: navigational affordance is processed later than the other scene features (2D, 3D, and semantic) investigated.

LLM Multimodal Traffic Accident Forecasting.

I de Zarzà, J de Curtò, Gemma Roig, Carlos T Calafate

Sensors (Basel)

November 2023

With the rise in traffic congestion in urban centers, predicting accidents has become paramount for city planning and public safety. This work comprehensively studied the efficacy of modern deep learning (DL) methods in forecasting traffic accidents and enhancing Level-4 and Level-5 (L-4 and L-5) driving assistants with actionable visual and language cues. Using a rich dataset detailing accident occurrences, we juxtaposed the Transformer model against traditional time series models like ARIMA and the more recent Prophet model.

LLM Adaptive PID Control for B5G Truck Platooning Systems.

I de Zarzà, J de Curtò, Gemma Roig, Carlos T Calafate

Sensors (Basel)

June 2023

This paper presents an exploration into the capabilities of an adaptive PID controller within the realm of truck platooning operations, situating the inquiry within the context of Cognitive Radio and AI-enhanced 5G and Beyond 5G (B5G) networks. We developed a Deep Learning (DL) model that emulates an adaptive PID controller, taking into account the implications of factors such as communication latency, packet loss, and communication range, alongside considerations of reliability, robustness, and security. Furthermore, we harnessed a Large Language Model (LLM), GPT-3.

A large and rich EEG dataset for modeling human visual object recognition.

Alessandro T Gifford, Kshitij Dwivedi, Gemma Roig, Radoslaw M Cichy

Neuroimage

December 2022

The human brain achieves visual object recognition through multiple stages of linear and nonlinear transformations operating at a millisecond scale. To predict and explain these rapid transformations, computational neuroscientists employ machine learning modeling techniques. However, state-of-the-art models require massive amounts of data to properly train, and to the present day there is a lack of vast brain datasets which extensively sample the temporal dynamics of visual object recognition.

Single Image Video Prediction with Auto-Regressive GANs.

Jiahui Huang, Yew Ken Chia, Samson Yu, Kevin Yee, Dennis Küster, Gemma Roig

Sensors (Basel)

May 2022

In this paper, we introduce an approach for future frames prediction based on a single input image. Our method is able to generate an entire video sequence based on the information contained in the input frame. We adopt an autoregressive approach in our generation process, i.

The spatiotemporal neural dynamics of object location representations in the human brain.

Monika Graumann, Caterina Ciuffi, Kshitij Dwivedi, Gemma Roig, Radoslaw M Cichy

Nat Hum Behav

June 2022

To interact with objects in complex environments, we must know what they are and where they are in spite of challenging viewing conditions. Here, we investigated where, how and when representations of object location and category emerge in the human brain when objects appear on cluttered natural scene images using a combination of functional magnetic resonance imaging, electroencephalography and computational models. We found location representations to emerge along the ventral visual stream towards lateral occipital complex, mirrored by gradual emergence in deep neural networks.

AttendAffectNet-Emotion Prediction of Movie Viewers Using Multimodal Fusion with Self-Attention.

Ha Thi Phuong Thao, B T Balamurali, Gemma Roig, Dorien Herremans

Sensors (Basel)

December 2021

In this paper, we tackle the problem of predicting the affective responses of movie viewers, based on the content of the movies. Current studies on this topic focus on video representation learning and fusion techniques to combine the extracted features for predicting affect. Yet these studies typically ignore the correlation between multiple modality inputs as well as the correlation between temporal inputs (i.

Unveiling functions of the visual cortex using task-specific deep neural networks.

Kshitij Dwivedi, Michael F Bonner, Radoslaw Martin Cichy, Gemma Roig

PLoS Comput Biol

August 2021

The human visual cortex enables visual perception through a cascade of hierarchical computations in cortical regions with distinct functionalities. Here, we introduce an AI-driven approach to discover the functional mapping of the visual cortex. We related human brain responses to scene images measured with functional MRI (fMRI) systematically to a diverse set of deep neural networks (DNNs) optimized to perform different scene perception tasks.

Unraveling Representations in Scene-selective Brain Regions Using Scene-Parsing Deep Neural Networks.

Kshitij Dwivedi, Radoslaw Martin Cichy, Gemma Roig

J Cogn Neurosci

September 2021

Visual scene perception is mediated by a set of cortical regions that respond preferentially to images of scenes, including the occipital place area (OPA) and parahippocampal place area (PPA). However, the differential contribution of OPA and PPA to scene perception remains an open research question. In this study, we take a deep neural network (DNN)-based computational approach to investigate the differences in OPA and PPA function.

Scale and translation-invariance for novel objects in human vision.

Yena Han, Gemma Roig, Gad Geiger, Tomaso Poggio

Sci Rep

January 2020

Though the range of invariance in recognition of novel objects is a basic aspect of human vision, its characterization has remained surprisingly elusive. Here we report tolerance to scale and position changes in one-shot learning by measuring recognition accuracy of Korean letters presented in a flash to non-Korean subjects who had no previous experience with Korean letters. We found that humans have significant scale-invariance after only a single exposure to a novel object.

Multiphase and Multivariable Linear Controllers That Account for the Joint Torques in Normal Human Walking.

Evrim S Altinkaynak, Gemma Roig, David J Braun

IEEE Trans Biomed Eng

June 2020

Objective: The objective of this paper is to investigate whether a small number of sequentially composed multivariable linear controllers can be used to recover a defining relation between the joint torques, angles, and velocities hidden in the walking data of multiple human subjects.

Methods: We solve a mixed integer programming problem that defines the optimal multivariable and multiphase relation between the torques, angles, and velocities for the hip, knee, and ankle joints.

Results: Using the data of seven healthy subjects, we show that the aforementioned relation can be remarkably well represented by four sequentially composed and independently activated multivariable linear controllers; the controllers account for [Formula: see text] (mean ± sem) of the variance in the joint torques across subjects, and [Formula: see text] of the variance for a new subject.

Learning to Predict Sequences of Human Visual Fixations.

Ming Jiang, Xavier Boix, Gemma Roig, Juan Xu, Luc Van Gool

IEEE Trans Neural Netw Learn Syst

June 2016

Most state-of-the-art visual attention models estimate the probability distribution of fixating the eyes in a location of the image, the so-called saliency maps. Yet, these models do not predict the temporal sequence of eye fixations, which may be valuable for better predicting the human eye fixations, as well as for understanding the role of the different cues during visual exploration. In this paper, we present a method for predicting the sequence of human eye fixations, which is learned from the recorded human eye-tracking data.
