Extracting tag hierarchies.

PLoS One

Statistical and Biological Physics Research Group of HAS, Budapest, Hungary; Eötvös University, Regional Knowledge Centre, Székesfehérvár, Hungary.

Published: September 2014

Tagging items with descriptive annotations or keywords is a natural way to compress and highlight information about the properties of a given entity. Over the years, several methods have been proposed for extracting a hierarchy between tags in systems with a "flat", egalitarian tag organization, which is very common when the tags are free words supplied by numerous independent people. Here we present a complete framework for automated tag hierarchy extraction based on tag occurrence statistics. Along with proposing new algorithms, we introduce several quality measures that enable a detailed comparison of competing approaches from different aspects. Furthermore, we set up a synthetic, computer-generated benchmark that provides a versatile tool for testing, with a couple of tunable parameters capable of generating a wide range of test beds. Besides the computer-generated input, we also use real data in our studies, including a biological example with a pre-defined hierarchy between the tags. The encouraging similarity between the pre-defined and reconstructed hierarchies, as well as the seemingly meaningful hierarchies obtained for other real systems, indicates that tag hierarchy extraction is a very promising direction for further research with great potential for practical applications.

Tags have become very prevalent on various online platforms, ranging from blogs through scientific publications to protein databases. Furthermore, tagging systems dedicated to the voluntary tagging of photos, films, books, etc. with free words are also becoming popular. The emerging large collections of tags associated with different objects are often referred to as folksonomies, highlighting their collaborative origin and the "flat" organization of the tags, as opposed to traditional hierarchical categorization. Adding a tag hierarchy to a given folksonomy can very effectively help narrow or broaden the scope of a search. Moreover, recommendation systems could also benefit from a tag hierarchy.
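To make the idea of extracting a hierarchy from tag occurrence statistics concrete, the following is a minimal sketch of a generic co-occurrence heuristic. It is not one of the algorithms proposed in the paper. The function name `extract_hierarchy`, the `threshold` parameter, and the toy data are all assumptions made for illustration: a tag is taken as a candidate parent of another when it is strictly more frequent and appears on most of the items that carry the child tag.

```python
from collections import Counter
from itertools import combinations


def extract_hierarchy(tagged_items, threshold=0.8):
    """Return a {child: parent} mapping from a simple co-occurrence heuristic.

    Assumption (illustrative, not the paper's method): tag `p` is a candidate
    parent of tag `c` when `p` occurs on strictly more items than `c` and at
    least `threshold` of the items carrying `c` also carry `p`. Among the
    candidates, the highest co-occurrence share wins; ties go to the least
    frequent (most specific) candidate.
    """
    tag_count = Counter()   # number of items carrying each tag
    pair_count = Counter()  # number of items carrying both tags of a pair
    for tags in tagged_items:
        unique = sorted(set(tags))
        tag_count.update(unique)
        pair_count.update(combinations(unique, 2))

    parents = {}
    for child in tag_count:
        best = None
        for cand in tag_count:
            # A parent must be strictly more frequent, which rules out cycles.
            if cand == child or tag_count[cand] <= tag_count[child]:
                continue
            pair = (child, cand) if child < cand else (cand, child)
            share = pair_count[pair] / tag_count[child]
            if share < threshold:
                continue
            key = (share, -tag_count[cand], cand)
            if best is None or key > best[0]:
                best = (key, cand)
        if best is not None:
            parents[child] = best[1]
    return parents


# Hypothetical toy folksonomy: each set holds the tags attached to one item.
items = [
    {"animal", "dog"},
    {"animal", "dog", "puppy"},
    {"animal", "cat"},
    {"animal", "dog", "puppy"},
    {"animal", "cat", "kitten"},
    {"animal"},
]
hierarchy = extract_hierarchy(items)
print(hierarchy)
```

On this toy input the sketch links "puppy" under "dog", "kitten" under "cat", and both "dog" and "cat" under "animal", which is left as the root. Real folksonomies are noisier, which is why the quality measures and benchmark described in the abstract matter for comparing approaches.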

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3877228
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0084133

Publication Analysis

Top Keywords

tag hierarchy: 16
hierarchy tags: 8
organization tags: 8
hierarchy extraction: 8
computer generated: 8
hierarchy: 7
tags: 7
tag: 5
extracting tag: 4
tag hierarchies: 4

Similar Publications

The mitochondrial whole genome of Phellinus igniarius was sequenced with the objective of examining the evolutionary relationships amongst related species. The entire mitochondrial genome was assembled using Illumina sequencing technology. The structural annotation and bioinformatics analysis were performed.

Introduction: A crowd crush can lead to respiratory arrest and result in multiple mass cardiac arrests (MCAs), which are often classified as Black Tag in disaster triage. Recently, many laypersons have been commonly trained in compression-only cardiopulmonary resuscitation (CPR) without ventilation support in various communities. This study aims to describe the characteristics of bystander CPR administered and the outcomes of MCAs during the Itaewon crowd crush incident.

The increasing diversity of single-cell datasets requires systematic cell type characterization. Clustering is a critical step in single-cell analysis, heavily influencing downstream analyses. However, current unsupervised clustering algorithms rely on biologically irrelevant parameters that require manual optimization and fail to capture hierarchical relationships between clusters.

Understanding spoken language is crucial for conversational agents, with intent detection and slot filling being the primary tasks in natural language understanding (NLU). Enhancing the NLU tasks can lead to an accurate and efficient virtual assistant thereby reducing the need for human intervention and expanding their applicability in other domains. Traditionally, these tasks have been addressed individually, but recent studies have highlighted their interconnection, suggesting better results when solved together.

In the human genome, CAG 3' splice sites (3'ss) are more than twice as frequent as TAG 3'ss. The greater abundance of the former has been attributed to a higher probability of exon skipping upon cytosine-to-thymine transitions at intron position -3 (-3C > T) than thymine-to-cytosine variants (-3T > C). However, molecular mechanisms underlying this bias and its clinical impact are poorly understood.
