Numerous methods and pipelines have recently emerged for the automatic extraction of knowledge graphs from documents such as scientific publications and patents. However, adapting these methods to alternative text sources such as micro-blogging posts and news has proven challenging, because they struggle to model the open-domain entities and relations typically found in these sources. In this paper, we propose an enhanced information extraction pipeline tailored to extracting a knowledge graph of open-domain entities from micro-blogging posts on social media platforms. Our pipeline leverages dependency parsing and classifies entity relations in an unsupervised manner through hierarchical clustering over word embeddings. We provide a use case on extracting semantic triples from a corpus of 100 thousand tweets about digital transformation and publicly release the generated knowledge graph. On the same dataset, we conduct two experimental evaluations, showing that the system produces triples with precision over 95% and outperforms similar pipelines by around 5% in precision, while generating a comparatively higher number of triples.
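
The unsupervised relation classification step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors in `TOY_EMBEDDINGS`, the cosine distance, the average-linkage rule, and the `threshold` value are all assumptions chosen for illustration, and a real pipeline would use pretrained word embeddings for the predicates found by the dependency parser.

```python
import math
from itertools import combinations

# Hypothetical 3-d "embeddings" for relation verbs (illustrative values only;
# a real pipeline would look these up in a pretrained embedding model).
TOY_EMBEDDINGS = {
    "adopt": [0.9, 0.1, 0.0],
    "embrace": [0.85, 0.15, 0.05],
    "use": [0.8, 0.2, 0.1],
    "threaten": [0.0, 0.9, 0.2],
    "endanger": [0.05, 0.95, 0.1],
    "announce": [0.1, 0.1, 0.9],
    "reveal": [0.15, 0.05, 0.95],
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def average_linkage(c1, c2, emb):
    """Mean pairwise cosine distance between the members of two clusters."""
    dists = [cosine_distance(emb[a], emb[b]) for a in c1 for b in c2]
    return sum(dists) / len(dists)


def cluster_predicates(emb, threshold):
    """Agglomerative clustering: merge the closest pair of clusters until
    the smallest inter-cluster distance exceeds the threshold."""
    clusters = [[word] for word in sorted(emb)]
    while len(clusters) > 1:
        (i, j), best = min(
            (
                ((i, j), average_linkage(clusters[i], clusters[j], emb))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(clusters)


# Each resulting cluster stands for one relation type shared by its verbs.
clusters = cluster_predicates(TOY_EMBEDDINGS, threshold=0.2)
for members in clusters:
    print(members)
```

With these toy vectors, verbs with similar meanings end up in the same cluster, so triples whose predicates are "adopt", "embrace", or "use" would be assigned one shared relation label. The threshold controls how coarse the resulting relation types are.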

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11341344
DOI: http://dx.doi.org/10.1016/j.heliyon.2024.e32479
