Using Word Embeddings to Learn a Better Food Ontology.

Front Artif Intell

Department of Computer Science, University of California at Davis, Davis, CA, United States.

Published: November 2020

Food ontologies require significant effort to create and maintain because they involve manual, time-consuming tasks, often with limited alignment to the underlying food science knowledge. We propose a semi-supervised framework for automated ontology population from an existing ontology scaffold using word embeddings. Applying the framework to the food domain and evaluating it against an expert-curated ontology, FoodOn, we observe that the food word embeddings capture the latent relationships and characteristics of foods. The resulting ontology, which uses word embeddings trained on the Wikipedia corpus, has an improvement of 89.7% in precision when compared to the expert-curated ontology FoodOn (0.34 vs. 0.18, respectively, p-value = 2.6 × 10), and a 43.6% shorter path distance (hops) between predicted and actual food instances (2.91 vs. 5.16, respectively, p-value = 4.7 × 10) when compared to other methods. This work demonstrates how high-dimensional representations of food can be used to populate ontologies and paves the way for learning ontologies that integrate contextual information from a variety of sources and types.
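The abstract does not spell out how an embedding is turned into an ontology placement, so the following is only a minimal sketch under stated assumptions: each class in the scaffold is represented by the centroid of its seed instances, and a new food is assigned to the class with the highest cosine similarity. The function names (`cosine`, `centroid`, `assign_class`), the three-dimensional vectors, and the class labels are all hypothetical and chosen for illustration; they are not taken from the paper or from FoodOn.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def assign_class(entity_vec, seeded_classes):
    """Pick the class whose seed centroid is most similar to entity_vec.

    Assumption: centroid + cosine scoring stands in for whatever
    scoring the actual framework uses.
    """
    scores = {
        name: cosine(entity_vec, centroid(seeds))
        for name, seeds in seeded_classes.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy "embeddings", made up for illustration only.
toy_embeddings = {
    "cheddar": [0.9, 0.1, 0.0],
    "brie": [0.8, 0.2, 0.1],
    "apple": [0.1, 0.9, 0.2],
    "pear": [0.0, 0.8, 0.3],
    "gouda": [0.85, 0.15, 0.05],
}

# Hypothetical scaffold: each class starts with a few seed instances.
scaffold = {
    "cheese": [toy_embeddings["cheddar"], toy_embeddings["brie"]],
    "fruit": [toy_embeddings["apple"], toy_embeddings["pear"]],
}

predicted, scores = assign_class(toy_embeddings["gouda"], scaffold)
print(predicted, scores)
```

In a realistic setting the toy vectors would be replaced by embeddings trained on a corpus such as Wikipedia, and the prediction would be compared with the curated class, for example by counting hops between the predicted and actual class in the ontology graph.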


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7861243
DOI: http://dx.doi.org/10.3389/frai.2020.584784

Publication Analysis

Top Keywords

word embeddings: 16
expert-curated ontology: 8
ontology foodon: 8
food: 7
ontology: 6
word: 4
embeddings learn: 4
learn better: 4
better food: 4
food ontology: 4

Similar Publications

Native and Non-Native Speakers' Recognition of Chinese Two-Character Words in Audio Sentence Comprehension.

Behav Sci (Basel)

December 2024

College of Chinese Language and Literature, Qufu Normal University, No. 57, Jingxuan Road, Qufu 273165, China.

Two experiments were conducted to examine native and non-native speakers' recognition of Chinese two-character words (2C-words) in the context of audio sentence comprehension. A recording of a sentence was played in which a collocation composed of a number word, a sortal classifier, and a noun (NCN) was embedded. When the participants were about to hear the noun of the NCN (Noun), the playback stopped and a target was presented visually. The target was the Noun, the character-transposed word of the Noun (NounT), a control word (NounC), or a homophone nonword for Noun, NounT, or NounC.


TExCNN: Leveraging Pre-Trained Models to Predict Gene Expression from Genomic Sequences.

Genes (Basel)

December 2024

Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China.

Background/objectives: Understanding the relationship between DNA sequences and gene expression levels is of significant biological importance. Recent advancements have demonstrated the ability of deep learning to predict gene expression levels directly from genomic data. However, traditional methods are limited by basic word encoding techniques, which fail to capture the inherent features and patterns of DNA sequences.

Article Synopsis
  • This study introduces MBConv-CapsNet, a new model for large-scale text classification that combines Mobile Inverted Bottleneck Convolutional Networks and Capsule Networks for better performance.
  • The model effectively processes text by considering local and global information and transforming the original text data into a more efficient feature representation.
  • Experimental results show that MBConv-CapsNet outperforms existing methods in various classification tasks, highlighting its improved ability to manage semantic information while minimizing noise.

Alleviating the medical strain: a triage method via cross-domain text classification.

Front Comput Neurosci

December 2024

Department of Radiology, The Affiliated Hospital of Yangzhou University, Yangzhou University, Yangzhou, Jiangsu, China.

Patients in large general hospitals commonly do not know which clinical department to register with. Although triage nurses can help, the large number of patients means that they have to stand in a queue for minutes before they can consult one. Recently, there have been some efforts to apply deep-learning techniques or pre-trained language models (PLMs) to triage recommendations.


A large-scale database of Mandarin Chinese word associations from the Small World of Words Project.

Behav Res Methods

December 2024

Key Laboratory of Brain Functional Genomics (MOE & STCSM), Affiliated Mental Health Center (ECNU), Institute of Brain and Education Innovation, School of Psychology and Cognitive Science, East China Normal University, Shanghai, China.

Word associations are among the most direct ways to measure word meaning in human minds, capturing various relationships, even those formed by non-linguistic experiences. Although large-scale word associations exist for Dutch, English, and Spanish, there is a lack of data for Mandarin Chinese, the most widely spoken language from a distinct language family. Here we present the Small World of Words-Zhongwen (Chinese) (SWOW-ZH), a word association dataset of Mandarin Chinese derived from a three-response word association task.
