Large-Scale Coarse-to-Fine Object Retrieval Ontology and Deep Local Multitask Learning.

Comput Intell Neurosci

Department of Information Technology, VNUHCM-University of Science, HCM 70000, Vietnam.

Published: January 2020

Object retrieval plays an increasingly important role in video surveillance, digital marketing, e-commerce, and similar applications. It faces challenges such as large-scale datasets, imbalanced data, viewpoint variation, cluttered backgrounds, and fine-grained details (attributes). This paper proposes a model that integrates an object ontology, a local multitask deep neural network (local MDNN), and an imbalanced data solver in order to exploit the advantages and overcome the shortcomings of deep learning network models, improving the performance of a large-scale object retrieval system from the coarse-grained level (categories) to the fine-grained level (attributes). Our proposed coarse-to-fine object retrieval (CFOR) system can be robust to the challenges listed above. To the best of our knowledge, the main novelty of the CFOR system is the mutual support of the object ontology, the local MDNN, and the imbalanced data solver within a unified system. The object ontology supports exploiting inner-group correlations to improve performance in category classification and attribute classification, and it guides the training flow and the retrieval flow to save computational cost in the training stage and the retrieval stage, respectively, on large-scale datasets. The local MDNN links the object ontology to the raw data, and the imbalanced data solver, based on the Matthews correlation coefficient (MCC), addresses data imbalance; it contributes effectively to the quality of the object ontology realization without requiring changes to the network architecture or data augmentation. To evaluate the performance of the CFOR system, we ran experiments on the DeepFashion dataset.
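As a brief illustration of why MCC suits imbalanced attributes, the sketch below computes the coefficient from confusion-matrix counts and compares it with plain accuracy. It shows only the standard MCC formula; the abstract does not describe how the imbalanced data solver uses MCC, so the function names, the zero-denominator convention, and the example counts are all assumptions made for illustration.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC for one binary attribute from its confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0.0 when the coefficient is undefined
    # (any row or column of the confusion matrix is empty).
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correct predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical rare attribute: 10 positives among 1000 images, and a
# classifier that predicts "absent" for every image.
counts = dict(tp=0, tn=990, fp=0, fn=10)
print(accuracy(**counts))           # 0.99
print(matthews_corrcoef(**counts))  # 0.0
```

Accuracy rewards the trivial classifier here, while MCC stays at zero because no positive is ever found; this is the usual motivation for preferring MCC when attribute frequencies are skewed.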
This paper shows that our local MDNN framework, based on the pretrained NASNet architecture, achieves better performance (14.2% higher recall rate) than single-task learning (STL) on the attribute learning task. It also shows that our model with the imbalanced data solver achieves better performance (5.14% higher recall rate for attributes with fewer data) than models that do not take imbalance into account. Moreover, MAP@30 hovers around 0.815 in retrieval, averaged over 35 imbalanced fashion attributes.
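For readers unfamiliar with the MAP@30 figure, the following is a minimal sketch of mean average precision at a cutoff k. Several AP@k conventions exist; this one normalizes by the number of relevant items found in the top k, which is an assumption and not necessarily the paper's evaluation protocol. The function names and the toy relevance lists are hypothetical.

```python
from typing import List, Sequence


def average_precision_at_k(relevant: Sequence[bool], k: int) -> float:
    """AP@k for one query, given relevance flags in ranked order."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevant[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumed convention: divide by hits within the top k; 0.0 if none.
    return precision_sum / hits if hits else 0.0


def mean_average_precision_at_k(queries: List[Sequence[bool]], k: int) -> float:
    """MAP@k: the mean of AP@k over all queries."""
    return sum(average_precision_at_k(q, k) for q in queries) / len(queries)


# Toy example: two queries, each with three ranked results.
toy_queries = [
    [True, False, True],    # relevant items at ranks 1 and 3
    [False, False, False],  # no relevant items retrieved
]
print(mean_average_precision_at_k(toy_queries, k=3))
```

With k set to 30 and one relevance list per query, the same function yields a MAP@30 value on whatever retrieval results are supplied.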

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6668564
DOI: http://dx.doi.org/10.1155/2019/1483294
