Pedestrian trajectory retrieval aims to infer the multi-camera path of a target pedestrian from images or videos captured by a camera network, a capability that is crucial for passenger flow analytics and individual pedestrian retrieval. Conventional approaches rely on spatiotemporal modeling, which requires collecting the position of each camera and trajectory data between every camera pair for training. To relax these stringent requirements, our method uses only temporal information for modeling. Specifically, we introduce an implicit trajectory encoding scheme, Temporal Rotary Position Embedding (T-RoPE), which integrates the temporal information of within-camera tracklets directly into their visual representations, thereby forming a new feature space. Our analysis shows that, within this feature space, inter-camera trajectory extraction can be addressed effectively by delineating a linear trajectory manifold. The visual features extracted from each candidate trajectory are then compared with the query feature and ranked to produce the final trajectory retrieval result. To validate our method, we collected a new pedestrian trajectory dataset from a multi-storey shopping mall, the Mall Trajectory Dataset. Extensive experiments on diverse datasets demonstrate that the T-RoPE module works as a plug-and-play enhancement to various network architectures and significantly improves the precision of pedestrian trajectory retrieval. The dataset and code are released at https://github.com/zhangxin1995/MTD.
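To make the idea of folding tracklet time into a visual feature more concrete, the following is a minimal sketch of a generic rotary position embedding driven by a timestamp. It is not the authors' T-RoPE implementation (see the linked repository for that): the function names `temporal_rotary_embed` and `dot`, the `base` constant of 10000, and the use of a single scalar timestamp per tracklet are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def temporal_rotary_embed(
    feature: Sequence[float], timestamp: float, base: float = 10000.0
) -> List[float]:
    """Rotate consecutive feature pairs by angles proportional to `timestamp`.

    Hypothetical helper: a standard rotary embedding where the "position"
    is a tracklet timestamp. `base` is an assumed hyperparameter.
    """
    dim = len(feature)
    if dim % 2 != 0:
        raise ValueError("feature dimension must be even")
    rotated: List[float] = []
    for i in range(0, dim, 2):
        # Each pair gets its own frequency; later pairs rotate more slowly.
        freq = base ** (-i / dim)
        angle = timestamp * freq
        c, s = math.cos(angle), math.sin(angle)
        x, y = feature[i], feature[i + 1]
        # 2D rotation of the pair (x, y) by `angle`.
        rotated.extend([x * c - y * s, x * s + y * c])
    return rotated


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product, used here as an unnormalized similarity score."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0]   # toy visual feature of tracklet A
    b = [0.5, -1.0, 2.0, 0.0]  # toy visual feature of tracklet B
    # Same 2-unit time gap, shifted by 10 units: the two scores agree
    # up to floating-point error.
    print(dot(temporal_rotary_embed(a, 3.0), temporal_rotary_embed(b, 5.0)))
    print(dot(temporal_rotary_embed(a, 13.0), temporal_rotary_embed(b, 15.0)))
```

The property this sketch illustrates is that the inner product between two rotated features depends on the two visual features and on the difference of their timestamps, not on the absolute times, while each feature's norm is unchanged. How the paper uses such a feature space to delineate a linear trajectory manifold is not reproduced here.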

Source
http://dx.doi.org/10.1109/TIP.2025.3544494

Publication Analysis

Top Keywords

pedestrian trajectory: 16
trajectory retrieval: 16
trajectory: 12
linear trajectory: 8
feature space: 8
trajectory dataset: 8
retrieval: 5
pedestrian: 5
cross-camera pedestrian: 4
retrieval based: 4

Similar Publications

Modeling crash avoidance behaviors in vehicle-pedestrian near-miss scenarios: Curvilinear time-to-collision and Mamba-driven deep reinforcement learning.

Accid Anal Prev

May 2025

Inner Mongolia Center for Transportation Research, Inner Mongolia University, Rm A357A, Transportation Building, South Campus,49 S Xilin Rd, Hohhot, Inner Mongolia 010020, China.

Interactions between vehicles and pedestrians at intersections often lead to safety-critical situations. This study aims to model the crash avoidance behaviors of vehicles during interactions with pedestrians in near-miss scenarios, contributing to the development of collision avoidance systems and safety-aware traffic simulations. Unmanned aerial vehicles were used to collect high-resolution trajectory data of vehicle-pedestrian interactions at urban intersections.

Inertial navigation is advancing rapidly due to improvements in sensor technology and tracking algorithms, with consumer-grade inertial measurement units (IMUs) becoming increasingly compact and affordable. Despite progress in pedestrian dead reckoning (PDR), IMU-based positional tracking still faces significant noise and bias issues. While traditional model-based methods and recent machine learning approaches have been employed to reduce signal drift, error accumulation remains a barrier to long-term system performance.

As autonomous driving technology progresses, LiDAR-based 3D object detection has emerged as a fundamental element of environmental perception systems. PointPillars transforms point cloud data into a two-dimensional pseudo-image and employs a 2D CNN for efficient and precise detection. Nevertheless, this approach encounters two primary challenges: (1) the sparsity and disorganization of raw point clouds hinder the model's capacity to capture local features, thus impacting detection accuracy; and (2) existing models struggle to detect small objects within complex environments, particularly regarding orientation estimation.

Vehicle-Pedestrian near miss analysis at signalized mid-block crossings.

J Safety Res

December 2024

Department of Civil, Environmental and Construction Engineering, University of Central Florida, Orlando, FL 32816, USA. Electronic address:

Introduction: This study aims to identify the factors related to pedestrian and roadway characteristics that affect vehicle-pedestrian Post Encroachment Time (PET) and Relative Time to Collision (RTTC) under traffic control systems at mid-block pedestrian crossings.

Methodology: A total of 112 h of video data were collected using multiple cameras from Pedestrian Hybrid Beacon (PHB) and Rectangular Rapid Flashing Beacon (RRFB) sites. A self-developed Computer Vision (CV) technology was deployed to extract vehicle and pedestrian trajectories and to construct an accurate dataset in which each observation corresponds to a specific timeframe, with the recorded speeds of both vehicles and pedestrians.
