Publications by authors named "Donghong Ji"

Emojis, utilizing visual means, mimic human facial expressions and postures to convey emotions and opinions. They are widely used in social media platforms such as Sina Weibo, and have become a crucial feature for sentiment analysis. However, existing approaches often treat emojis as special symbols or convert them into text labels, thereby neglecting the rich visual information of emojis.

View Article and Find Full Text PDF

The classic multiple instance learning (MIL) paradigm is harnessed for weakly supervised whole slide image (WSI) classification. The spatial relationship between positive tissues is crucial for this task because these tissues account for only a small percentage of the billions of pixels, yet most studies have overlooked it. Therefore, we propose a framework called TDT-MIL.

View Article and Find Full Text PDF
Article Synopsis
  • Biomedical event extraction is gaining significant traction in fields like natural language processing and bioinformatics, leading to numerous machine learning models aimed at this task.
  • Traditional models rely on an extraction-based approach that can cause cascading errors due to their sequential processing of multiple subtasks.
  • This paper introduces a new model leveraging the T5 pre-trained language framework, using a sequence-to-sequence generation method and implementing constrained decoding and curriculum learning, which shows improved performance on the Genia 2011 and Genia 2013 datasets.
View Article and Find Full Text PDF

Aspect-Based Sentiment Analysis (ABSA) represents a fine-grained approach to sentiment analysis, aiming to pinpoint and evaluate sentiments associated with specific aspects within a text. ABSA encompasses a set of sub-tasks that together facilitate a detailed understanding of the multifaceted sentiment expressions. These tasks include aspect and opinion terms extraction (ATE and OTE), classification of sentiment at the aspect level (ALSC), the coupling of aspect and opinion terms extraction (AOE and AOPE), and the challenging integration of these elements into sentiment triplets (ASTE).

View Article and Find Full Text PDF

Convolutional neural networks (CNNs), especially the many U-shaped models, have achieved great progress in retinal vessel segmentation. However, much of the global information in fundus images has not been fully explored, and the class imbalance between background and blood vessels remains serious.

View Article and Find Full Text PDF

Medical image segmentation enables doctors to observe lesion regions better and make accurate diagnostic decisions. Single-branch models such as U-Net have achieved great progress in this field. However, the complementary local and global pathological semantics of heterogeneous neural networks have not yet been fully explored.

View Article and Find Full Text PDF

The task of event extraction consists of three subtasks, namely entity recognition, trigger identification, and argument role classification. Recent work tackles these subtasks jointly through multi-task learning for better extraction performance. Despite being effective, existing attempts typically treat the labels of event subtasks as uninformative and independent one-hot vectors, ignoring the potential loss of useful label information and making it difficult for these models to incorporate interactive features at the label level.

View Article and Find Full Text PDF
Article Synopsis
  • Aspect-based sentiment triplet extraction (ASTE) focuses on identifying triplets in texts that consist of aspect terms, opinion expressions, and their sentiment polarities, but existing methods struggle with challenges like overlapping terms and long-distance relationships.
  • The authors introduce an innovative encoder-decoder framework for ASTE that treats the task as a prediction of unordered triplet sets, employing a nonautoregressive decoding method with a pointer network.
  • Their approach includes a new high-order aggregation mechanism to manage overlaps and a bipartite matching loss to enhance training, resulting in significant performance improvements over current methods in handling overlapping issues and decoding efficiency, as confirmed by experimental results.
View Article and Find Full Text PDF

Implicit sentiment analysis is a challenging task because the sentiment of a text is expressed in a connotative manner. To tackle this problem, we propose to use textual events as a knowledge source to enrich network representations. To consider task interactions, we present a novel lightweight joint learning paradigm that can pass task-related messages between tasks during training iterations.

View Article and Find Full Text PDF

Attention has been shown to be highly effective for modeling sequences, capturing the more informative parts when learning a deep representation. However, recent studies show that attention values do not always coincide with intuition in tasks such as machine translation and sentiment classification. In this study, we consider using deep reinforcement learning to automatically optimize the attention distribution during the minimization of end-task training losses.

View Article and Find Full Text PDF

Motivation: Entity relation extraction is one of the fundamental tasks in biomedical text mining and is usually solved with models from natural language processing. Compared with traditional pipeline methods, joint methods can avoid error propagation from entity to relation, giving better performance. However, existing joint models are built upon a sequential scheme and fail to detect overlapping entities and relations, which are ubiquitous in biomedical texts.

View Article and Find Full Text PDF

Chinese information extraction is traditionally performed as a pipeline of word segmentation, entity recognition, relation extraction and event detection. This pipelined approach suffers from two limitations: 1) it is prone to propagating errors from upstream tasks to subsequent applications; 2) the mutual benefits of cross-task dependencies are hard to introduce in non-overlapping models. To address these two challenges, we propose a novel transition-based model that jointly performs entity recognition, relation extraction and event detection as a single task.

View Article and Find Full Text PDF

Biomedical information extraction (BioIE) is an important task. The aim is to analyze biomedical texts and extract structured information such as named entities and semantic relations between them. In recent years, pre-trained language models have largely improved the performance of BioIE.

View Article and Find Full Text PDF
Article Synopsis
  • The paper discusses the importance of extracting keyphrases from scientific articles to enhance understanding of scientific publications.
  • It presents a neural network-based approach that utilizes a bidirectional LSTM for sentence representation and combines it with a conditional random field (CRF) to label the entire sentence.
  • The model also employs self-training to effectively use unlabeled data, demonstrating strong performance on keyphrase extraction tasks without relying on manual features or external resources, surpassing some of the previous methods.
View Article and Find Full Text PDF

Extracting knowledge from time series provides important tools for many real applications. However, many challenging problems remain open due to the stochastic nature of large amounts of time series data. Considering this scenario, new data mining and machine learning techniques have been continuously developed.

View Article and Find Full Text PDF

Background: Disease prediction based on Electronic Health Records (EHR) has become a hot research topic in the biomedical community. Existing work mainly focuses on the prediction of one target disease, and little work has been proposed for predicting multiple associated diseases. Meanwhile, an EHR usually contains two main kinds of information: the textual description and physical indicators.

View Article and Find Full Text PDF

Background: Biomedical named entity recognition (BNER) is a crucial initial step of information extraction in the biomedical domain. The task is typically modeled as a sequence labeling problem. Various machine learning algorithms, such as Conditional Random Fields (CRFs), have been successfully used for this task.

View Article and Find Full Text PDF

Motivation: Disease named entities play a central role in many areas of biomedical research, and automatic recognition and normalization of such entities have received increasing attention in biomedical research communities. Existing methods typically used pipeline models with two independent phases: (i) a disease named entity recognition (DER) system is used to find the boundaries of mentions in text and (ii) a disease named entity normalization (DEN) system is used to connect the mentions recognized to concepts in a controlled vocabulary. The main problems of such models are: (i) there is error propagation from DER to DEN and (ii) DEN is useful for DER, but pipeline models cannot utilize this.

View Article and Find Full Text PDF

Background: Extracting biomedical entities and their relations from text has important applications in biomedical research. Previous work primarily utilized feature-based pipeline models for this task. Feature-based models, however, demand considerable effort on feature engineering.

View Article and Find Full Text PDF

Background: Information extraction in clinical texts enables medical workers to identify patients' problems faster and makes intelligent diagnosis possible in the future. There has been a lot of work on disorder mention recognition in clinical narratives. However, the recognition of more complicated disorder mentions, such as overlapping ones, is still an open issue.

View Article and Find Full Text PDF

Background: Chemical compound and drug name recognition plays an important role in chemical text mining, and it is the basis for automatic relation extraction and event identification in chemical information processing. A high-performance named entity recognition system for chemical compound and drug names is therefore necessary.

Methods: We developed a CHEMDNER system based on mixed conditional random fields (CRF) with word clustering for chemical compound and drug name recognition.

View Article and Find Full Text PDF

The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents.

View Article and Find Full Text PDF
