Audio-visual multi-modality driven hybrid feature learning model for crowd analysis and classification.

Math Biosci Eng

Department of Electronics and Communication Engineering, AMC Engineering College, Visvesvaraya Technological University, Belagavi, India.

Published: May 2023

The rapid emergence of advanced software systems, low-cost hardware and decentralized cloud computing technologies has broadened the horizon for vision-based surveillance, monitoring and control. However, complex and inferior feature learning over visual artefacts or video streams, especially under extreme conditions, limits the majority of existing vision-based crowd analysis and classification systems. Retrieving event-sensitive or crowd-type-sensitive spatio-temporal features for different crowd types under extreme conditions is a highly complex task. Consequently, existing methods achieve lower accuracy and hence lower reliability, which limits their use for real-time crowd analysis. Despite numerous efforts in vision-based approaches, the lack of acoustic cues often creates ambiguity in crowd classification. In contrast, the strategic amalgamation of audio-visual features can enable accurate and reliable crowd analysis and classification. With this motivation, this research develops a novel audio-visual multi-modality driven hybrid feature learning model for crowd analysis and classification. A hybrid feature extraction model was applied to extract deep spatio-temporal features using the Gray-Level Co-occurrence Matrix (GLCM) and an AlexNet transfer learning model. After the GLCM features and AlexNet deep features were extracted, they were fused by horizontal concatenation. Similarly, for acoustic feature extraction, the audio samples (from the input video) underwent static (fixed-size) sampling, pre-emphasis, block framing and Hann windowing, followed by extraction of acoustic features such as GTCC, GTCC-Delta, GTCC-Delta-Delta, MFCC, Spectral Entropy, Spectral Flux, Spectral Slope and Harmonics-to-Noise Ratio (HNR). Finally, the extracted audio-visual features were fused to yield a composite multi-modal feature set, which was classified using a random forest ensemble classifier. The multi-class classification yields a crowd-classification accuracy of 98.26%, precision of 98.89%, sensitivity of 94.82%, specificity of 95.57% and F-measure of 98.84%. The robustness of the proposed multi-modality-based crowd analysis model confirms its suitability for real-world crowd detection and classification tasks.
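
The acoustic front-end steps named in the abstract (pre-emphasis, block framing, Hann windowing) and the horizontal concatenation used for feature fusion can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the pre-emphasis coefficient (0.97), and the frame length and hop size are all assumptions chosen for illustration, since the abstract does not state them.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; alpha is an assumed value."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping blocks; leftover tail samples are dropped."""
    last_start = len(signal) - frame_len
    return [signal[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def hann_window(n):
    """Symmetric Hann window of length n."""
    if n == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def windowed_frames(signal, frame_len=8, hop=4, alpha=0.97):
    """Pre-emphasize, frame, then taper each frame with a Hann window."""
    window = hann_window(frame_len)
    frames = frame_signal(pre_emphasis(signal, alpha), frame_len, hop)
    return [[x * w for x, w in zip(frame, window)] for frame in frames]


def fuse_features(*feature_sets):
    """Horizontal concatenation: row i of the result joins row i of each set."""
    n_rows = len(feature_sets[0])
    if any(len(fs) != n_rows for fs in feature_sets):
        raise ValueError("all feature sets must have the same number of rows")
    fused = []
    for i in range(n_rows):
        row = []
        for fs in feature_sets:
            row.extend(fs[i])
        fused.append(row)
    return fused


# Toy usage: a 16-sample ramp stands in for an audio clip, and small
# hand-written rows stand in for GLCM and AlexNet feature vectors.
frames = windowed_frames([float(i) for i in range(16)])
fused = fuse_features([[1, 2], [3, 4]], [[5], [6]])
```

In the pipeline the abstract describes, descriptors such as MFCC or GTCC would be computed from each windowed frame, and the fused rows would be passed to the random forest classifier; both steps are omitted here.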

Source
http://dx.doi.org/10.3934/mbe.2023558
