Paying attention to uncertainty: A stochastic multimodal transformers for post-traumatic stress disorder detection using video.

Comput Methods Programs Biomed

Université Paris-Est Créteil (UPEC), LISSI, 120, Rue Paul Armangot, Vitry-sur-Seine, 94400, France.

Published: December 2024

Background And Objectives: Post-traumatic stress disorder is a debilitating psychological condition that can manifest following exposure to traumatic events. It affects individuals from diverse backgrounds and is associated with various symptoms, including intrusive thoughts, nightmares, hyperarousal, and avoidance behaviors.

Methods: To address this challenge, this study proposes a decision support system built on a novel multimodal deep learning approach that applies a stochastic Transformer to video data. The Transformer's stochastic activation function and stochastic layers allow it to learn sparse representations of its inputs. The method combines low-level features from three modalities: Mel-frequency cepstral coefficients (MFCCs) extracted from the audio recordings, Facial Action Units captured from facial expressions, and text obtained by transcribing the audio. Together, these modalities let the model capture a broad range of information related to post-traumatic stress disorder symptoms, including vocal cues, facial expressions, and linguistic content.

Results: The deep learning model was trained and evaluated on the eDAIC dataset, which consists of clinical interviews with individuals with and without post-traumatic stress disorder (PTSD). The model achieved state-of-the-art results in detecting PTSD, with a Root Mean Square Error of 1.98 and a Concordance Correlation Coefficient of 0.722, outperforming existing approaches.
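Both reported metrics compare continuous predictions with reference scores: Root Mean Square Error measures the average size of the error, and the Concordance Correlation Coefficient measures agreement in both correlation and scale. The following is a minimal sketch of their standard definitions, not the authors' evaluation code; the function names and the sample scores are hypothetical and chosen only for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sum(squared_errors) / len(y_true))


def ccc(y_true, y_pred):
    """Concordance Correlation Coefficient (population statistics).

    CCC = 2 * cov(t, p) / (var(t) + var(p) + (mean(t) - mean(p))**2)
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n
    denominator = var_t + var_p + (mean_t - mean_p) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined for identical constant inputs")
    return 2 * cov / denominator


if __name__ == "__main__":
    # Hypothetical severity scores, for illustration only (not eDAIC data).
    reference = [10.0, 25.0, 40.0, 55.0]
    predicted = [12.0, 24.0, 43.0, 50.0]
    print("RMSE:", rmse(reference, predicted))
    print("CCC:", ccc(reference, predicted))
```

A CCC of 1 indicates perfect agreement. Unlike Pearson correlation, CCC also penalizes a constant offset or a scale mismatch between predictions and reference scores, which is why it is often reported alongside RMSE.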

Conclusion: This work introduces a new method for detecting post-traumatic stress disorder from videos using a multimodal stochastic Transformer model. The model draws on text, audio, and visual data to gather comprehensive and varied information for detection.


Source
http://dx.doi.org/10.1016/j.cmpb.2024.108439
