Generative named entity recognition framework for Chinese legal domain.

PeerJ Comput Sci

School of Computer Science and Engineering, Central South University, Changsha, Hunan, China.

Published: November 2024

Named entity recognition (NER) is a crucial task in natural language processing, and it is particularly challenging in the legal domain because legal entities are often long and intricate. Existing methods often struggle to identify entity boundaries and types accurately in legal texts. To address these challenges, we propose a novel sequence-to-sequence framework designed specifically for the legal domain. The framework features an entity-type-aware module that uses contrastive learning to improve the prediction of entity types. In addition, we incorporate a decoder with a copy mechanism that identifies complex legal entities without the need for an explicit tagging schema. Extensive experiments on two legal datasets show that our framework significantly outperforms state-of-the-art methods, with notable improvements in precision, recall, and F1 score. These results demonstrate the effectiveness of our approach for entity recognition in legal texts and offer a promising direction for future research in legal NER.
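The abstract does not specify the output format of the decoder, so the following is only a minimal sketch of how a generative, copy-based formulation can recover entities without a BIO-style tagging schema. It assumes a hypothetical target format in which each entity is emitted as a run of copied source positions followed by a type label; the label set `ENTITY_TYPES`, the function `decode_entities`, and the example sentence are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Tuple, Union

# Hypothetical label set; the abstract does not list the paper's entity types.
ENTITY_TYPES = {"PERSON", "ORG", "STATUTE"}


def decode_entities(
    source_tokens: List[str],
    generated: List[Union[int, str]],
) -> List[Tuple[str, str]]:
    """Convert a generated sequence into (entity text, entity type) pairs.

    Assumed format: integers are positions copied from the source sentence,
    and a type label string closes the entity built from the preceding positions.
    """
    entities: List[Tuple[str, str]] = []
    positions: List[int] = []
    for item in generated:
        if isinstance(item, int):
            # Keep only positions that actually exist in the source.
            if 0 <= item < len(source_tokens):
                positions.append(item)
        elif item in ENTITY_TYPES:
            if positions:
                # Chinese text is tokenized per character here, so join without spaces.
                text = "".join(source_tokens[i] for i in positions)
                entities.append((text, item))
            positions = []
        else:
            # Unknown symbol: discard the partially built entity.
            positions = []
    return entities


# Example sentence (roughly: "Defendant Zhang San violated Article 264 of the Criminal Law").
source = list("被告人张三违反刑法第二百六十四条")
generated = [3, 4, "PERSON", 7, 8, 9, 10, 11, 12, 13, 14, 15, "STATUTE"]
print(decode_entities(source, generated))
```

Because every entity is assembled from copied source positions, long entities such as statute references are recovered as whole spans, and no per-token tag sequence has to be predicted or repaired afterwards.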


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11622873
DOI: http://dx.doi.org/10.7717/peerj-cs.2428


