BLAINDER: A Blender AI Add-on for Generation of Semantically Labeled Depth-Sensing Data

Sensors (Basel)

Virtual Reality and Multimedia Group, Institute of Computer Science, Freiberg University of Mining and Technology, 09599 Freiberg, Germany.

Published: March 2021

Common Machine-Learning (ML) approaches for scene classification require large amounts of training data. However, for classification of depth sensor data, in contrast to image data, relatively few databases are publicly available, and manual generation of semantically labeled 3D point clouds is an even more time-consuming task. To simplify the training data generation process for a wide range of domains, we have developed an add-on package for the open-source 3D modeling software Blender that enables largely automated generation of semantically annotated point-cloud data in virtual 3D environments. In this paper, we focus on the classical depth-sensing techniques Light Detection and Ranging (LiDAR) and Sound Navigation and Ranging (Sonar). Within the add-on, different depth sensors can be loaded from presets, customized sensors can be implemented, and different environmental conditions (e.g., the influence of rain or dust) can be simulated. The semantically labeled data can be exported to various 2D and 3D formats and are thus optimized for different ML applications and visualizations. In addition, semantically labeled images can be exported using the rendering functionalities of Blender.
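The core idea described in the abstract, casting virtual sensor rays into a 3D scene and tagging each hit point with the semantic label of the object it struck, can be sketched in plain Python. This is a minimal, self-contained illustration only: BLAINDER itself works through Blender's Python API (`bpy`), and the scene layout, labels, and function names below are hypothetical, not the add-on's actual interface.

```python
import math

# Hypothetical 2D scene: labeled spheres as (center, radius, semantic label).
# In BLAINDER the scene would be real Blender objects with assigned labels.
SCENE = [
    ((4.0, 0.0), 1.0, "tree"),
    ((6.0, 3.0), 1.5, "building"),
]

def ray_sphere_hit(origin, direction, center, radius):
    """Return the distance t to the nearest ray-sphere intersection, or None."""
    ox, oy = origin
    dx, dy = direction
    fx, fy = ox - center[0], oy - center[1]
    a = dx * dx + dy * dy
    b = 2.0 * (fx * dx + fy * dy)
    c = fx * fx + fy * fy - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # ray misses the sphere
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t > 0.0 else None

def lidar_scan(origin=(0.0, 0.0), fov_deg=90.0, n_rays=181, max_range=20.0):
    """Sweep rays across the field of view; return (x, y, label) hit points."""
    points = []
    for i in range(n_rays):
        angle = math.radians(-fov_deg / 2.0 + fov_deg * i / (n_rays - 1))
        d = (math.cos(angle), math.sin(angle))
        # Keep only the closest labeled hit along this ray, like a depth sensor.
        best = None
        for center, radius, label in SCENE:
            t = ray_sphere_hit(origin, d, center, radius)
            if t is not None and t <= max_range and (best is None or t < best[0]):
                best = (t, label)
        if best is not None:
            t, label = best
            points.append((origin[0] + t * d[0], origin[1] + t * d[1], label))
    return points

cloud = lidar_scan()
labels = {label for _, _, label in cloud}
```

The same pattern extends to 3D by sweeping two angles and intersecting against scene meshes (Blender exposes this via `Scene.ray_cast`); the exported point cloud then carries one semantic label per point, which is exactly the training-data format the abstract targets.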


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8003152
DOI: http://dx.doi.org/10.3390/s21062144


Similar Publications

Named Entity Recognition (NER) is an essential component of numerous Natural Language Processing (NLP) systems, with the aim of identifying and classifying entities that have specific meanings in raw text, such as person (PER), location (LOC), and organization (ORG). Recently, Deep Neural Networks (DNNs) have been extensively applied to NER tasks owing to the rapid development of deep learning technology. However, despite their advancements, these models fail to take full advantage of the multi-level features (e.


Supporting vision-language model few-shot inference with confounder-pruned knowledge prompt.

Neural Netw

January 2025

National Key Laboratory of Space Integrated Information System, Institute of Software Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China.

Vision-language models are pre-trained by aligning image-text pairs in a common space to deal with open-set visual concepts. Recent works adopt fixed or learnable prompts, i.e.


Instance segmentation of surgical instruments is a long-standing research problem, crucial for the development of many applications for computer-assisted surgery. This problem is commonly tackled via fully-supervised training of deep learning models, requiring expensive pixel-level annotations to train. In this work, we develop a framework for instance segmentation not relying on spatial annotations for training.


Background: Segmentation models for clinical data experience severe performance degradation when trained on a single client from one domain and distributed to other clients from different domains. Federated Learning (FL) provides a solution by enabling multi-party collaborative learning without compromising the confidentiality of clients' private data.

Methods: In this paper, we propose a cross-domain FL method for Weakly Supervised Semantic Segmentation (FL-W3S) of white blood cells in microscopic images.


This study introduces a novel AI-driven approach to support elderly patients in Thailand with medication management, focusing on accurate drug label interpretation. Two model architectures were explored: a Two-Stage Optical Character Recognition (OCR) and Large Language Model (LLM) pipeline combining EasyOCR with Qwen2-72b-instruct and a Uni-Stage Visual Question Answering (VQA) model using Qwen2-72b-VL. Both models operated in a zero-shot capacity, utilizing Retrieval-Augmented Generation (RAG) with DrugBank references to ensure contextual relevance and accuracy.

