Camera traps offer enormous new opportunities for ecological studies, but current automated image analysis methods often lack the contextual richness needed to support impactful conservation outcomes. Integrating vision-language models into these workflows could address this gap by providing enhanced contextual understanding and enabling advanced queries across temporal and spatial dimensions. Here, we present an integrated approach that combines deep learning-based vision and language models to improve ecological reporting from camera trap data. We introduce a two-stage system: YOLOv10-X localises and classifies species (mammals and birds) within images, and a Phi-3.5-vision-instruct model reads the YOLOv10-X bounding-box labels to identify species, overcoming the vision-language model's limitation with hard-to-classify objects. Phi-3.5 additionally detects broader variables, such as vegetation type and time of day, adding rich ecological and environmental context to YOLO's species detections. The combined output is then processed by the model's natural language system to answer complex queries, and retrieval-augmented generation (RAG) enriches responses with external information, such as species weight and IUCN status, that cannot be obtained through direct visual analysis. From this information, the system automatically generates structured reports that give biodiversity stakeholders deeper insights into, for example, species abundance, distribution, animal behaviour, and habitat selection. By delivering contextually rich narratives, our approach not only reduces manual effort but also supports timely decision making in wildlife management and conservation, potentially shifting efforts from reactive to proactive.
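The pipeline described above can be sketched in outline. This is a minimal illustration only: the model calls are stubbed with fixed return values, and all function names, the example species, and the knowledge-base entries are hypothetical stand-ins, not the authors' implementation. A real system would run YOLOv10-X weights for stage one and query a Phi-3.5-vision-instruct endpoint for stage two.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    species: str
    confidence: float
    bbox: tuple  # (x1, y1, x2, y2) in pixels

def detect_species(image_path):
    """Stage 1 stand-in for YOLOv10-X: localise and classify animals.
    A real implementation would run the detector on the image; here we
    return a fixed example detection."""
    return [Detection("red fox", 0.91, (120, 80, 340, 260))]

def describe_context(image_path, detections):
    """Stage 2 stand-in for Phi-3.5-vision-instruct: read the labelled
    bounding boxes and add scene-level variables the detector cannot
    provide (vegetation type, time of day)."""
    return {
        "species": [d.species for d in detections],
        "vegetation": "mixed woodland",  # would come from the VLM
        "time_of_day": "dusk",           # would come from the VLM
    }

# Tiny in-memory stand-in for the RAG store: facts like weight and
# IUCN status cannot be read from pixels, so they are retrieved.
KNOWLEDGE_BASE = {
    "red fox": {"mean_weight_kg": 6.2, "iucn_status": "Least Concern"},
}

def retrieve(species):
    """RAG stand-in: look up external facts for a detected species."""
    return KNOWLEDGE_BASE.get(species, {})

def build_report(image_path):
    """Chain the stages into one structured report per image."""
    detections = detect_species(image_path)
    report = describe_context(image_path, detections)
    report["external"] = {s: retrieve(s) for s in report["species"]}
    return report

report = build_report("trap_0001.jpg")
print(report)
```

The point of the sketch is the data flow: detector output feeds the vision-language stage, whose structured output is enriched by retrieval before report generation, so each stage stays independently replaceable.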

Source: http://dx.doi.org/10.3390/s24248122

