Large-Scale Coarse-to-Fine Object Retrieval Ontology and Deep Local Multitask Learning.

Comput Intell Neurosci

Department of Information Technology, VNUHCM-University of Science, HCM 70000, Vietnam.

Published: January 2020

Object retrieval plays an increasingly important role in video surveillance, digital marketing, e-commerce, and other applications. It faces challenges such as large-scale datasets, imbalanced data, viewpoint variation, cluttered backgrounds, and fine-grained details (attributes). This paper proposes a model that integrates an object ontology, a local multitask deep neural network (local MDNN), and an imbalanced data solver to exploit the advantages and overcome the shortcomings of deep learning network models, improving the performance of a large-scale object retrieval system from the coarse-grained level (categories) to the fine-grained level (attributes). The proposed coarse-to-fine object retrieval (CFOR) system is designed to be robust to the challenges listed above. To the best of our knowledge, the main novelty of the CFOR system is the mutual support of the object ontology, the local MDNN, and the imbalanced data solver within a unified system. The object ontology exploits inner-group correlations to improve category classification and attribute classification, and it guides the training flow and the retrieval flow to save computational cost in the training and retrieval stages, respectively, on large-scale datasets. The local MDNN links the object ontology to the raw data, and the imbalanced data solver, based on Matthews' correlation coefficient (MCC), addresses data imbalance; together they effectively improve the quality of the object ontology realization without adjusting the network architecture or using data augmentation. To evaluate the performance of the CFOR system, we ran experiments on the DeepFashion dataset.
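As a rough illustration of why MCC is a natural basis for an imbalanced data solver, the sketch below computes the coefficient from binary predictions and compares it with plain accuracy. The abstract does not say how the solver uses MCC, so this shows only the coefficient itself; the function names and the toy labels are assumptions made for the example.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(y_true, y_pred):
    """Matthews' correlation coefficient, in the range [-1, 1]."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical rare attribute: 5 positive images out of 100.
y_true = [1] * 5 + [0] * 95
always_negative = [0] * 100  # a classifier that never predicts the attribute

print("accuracy:", accuracy(y_true, always_negative))
print("MCC:", mcc(y_true, always_negative))
```

A classifier that ignores the rare attribute entirely still scores high accuracy, while its MCC is zero, which is why MCC is often preferred as a quality signal when positive examples are scarce.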
The results show that our local MDNN framework, based on the pretrained NASNet architecture, achieves better performance (14.2% higher recall rate) than single-task learning (STL) in the attribute learning task, and that our model with the imbalanced data solver achieves better performance (5.14% higher recall rate for attributes with fewer data samples) than models that do not account for data imbalance. Moreover, MAP@30 is approximately 0.815 in retrieval, averaged over 35 imbalanced fashion attributes.
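For readers unfamiliar with the MAP@30 figure, the following is a minimal sketch of mean average precision at a cutoff k. Definitions of AP@k differ in how they normalize; this version divides by the number of relevant items found within the top k, which is an assumption and may not match the paper's exact formula. The function names and the toy relevance lists are hypothetical.

```python
def average_precision_at_k(relevance, k):
    """AP@k for one query.

    `relevance` is a ranked list of 0/1 flags (1 = retrieved item is relevant).
    Normalized by the number of relevant items within the top k (an assumption).
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision_at_k(relevance_lists, k):
    """MAP@k: the mean of AP@k over all queries."""
    scores = [average_precision_at_k(r, k) for r in relevance_lists]
    return sum(scores) / len(scores)


# Toy ranked results for two hypothetical attribute queries.
queries = [
    [1, 0, 1],  # relevant items at ranks 1 and 3
    [0, 0, 0],  # nothing relevant retrieved
]
print(mean_average_precision_at_k(queries, k=3))
```

With k=30 the same functions would score the top 30 retrieved images per query, and averaging over queries for each attribute gives a figure comparable in form to the one reported above.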

Download full-text PDF	Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6668564	PMC
http://dx.doi.org/10.1155/2019/1483294	DOI Listing

Publication Analysis

Top Keywords

imbalanced data

object ontology

object retrieval

local mdnn

data solver

cfor system

object

data

coarse-to-fine object

local multitask
