Natural Language Processing Approaches for Automated Multilevel and Multiclass Classification of Breast Lesions on Free-Text Cytopathology Reports.

JCO Clin Cancer Inform

Department of Pathology, JSS Academy of Higher Education and Research, Mysuru, Karnataka, India.

Published: September 2022

Purpose: The rapid growth and adoption of electronic health records (EHRs) and the expanding medical literature have created major opportunities to automate the extraction of relevant clinical information for concise and effective clinical decision support. However, processing such information has traditionally depended on labor-intensive manual processes that are prone to human error arising from fatigue, oversight, and interobserver variability. Hence, this study aims to process EHRs and perform multilevel and multiclass classification by extracting the dominant characteristic features that are sufficient to detect and differentiate various types of breast lesions.

Patients and Methods: This study considers unstructured EHRs on breast lesions obtained through the fine-needle aspiration cytology technique. The raw text was normalized into structured tabular form and converted to scores by sentiment analysis, which determines the total polarity, or class label, of each EHR. Two supervised machine learning approaches, a random forest and a feed-forward neural network trained with the Levenberg-Marquardt training function, were used to classify the collected EHR data set of 2,879 records, which was split 80:20 into training and testing data sets, respectively.
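The scoring and splitting steps described above can be illustrated with a minimal sketch. It is not the study's implementation: the lexicon, its weights, the helper names `normalize`, `polarity_score`, and `split_80_20`, and the sample report are all hypothetical, chosen only to show how free text might be reduced to a signed score and how 2,879 records divide 80:20.

```python
import random
import re

# Hypothetical lexicon: positive weights lean malignant, negative lean benign.
# The terms and weights are illustrative assumptions, not taken from the study.
LEXICON = {
    "malignant": 3.0,
    "pleomorphic": 2.0,
    "atypia": 1.0,
    "benign": -2.0,
    "cohesive": -1.0,
    "bipolar": -1.0,
}


def normalize(text):
    """Lowercase the report and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def polarity_score(text):
    """Sum the lexicon weights of every token; unknown tokens count as 0."""
    return sum(LEXICON.get(token, 0.0) for token in normalize(text))


def split_80_20(records, seed=0):
    """Shuffle a copy of the records and cut it 80:20 into train and test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)  # floor of 80%
    return shuffled[:cut], shuffled[cut:]


# Hypothetical report text, written for this example only.
report = "Cohesive clusters of benign ductal cells with bipolar bare nuclei."
score = polarity_score(report)

# Record IDs stand in for the 2,879 reports.
train, test = split_80_20(range(2879))
```

Under this floor-based cut, 2,879 records yield 2,303 training and 576 testing records. In a fuller pipeline, scores like these would be the tabular features passed to a classifier; the abstract does not say how the study derived its scores.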

Results: Random forest and feed-forward neural network classifiers gave the best performance with an accuracy of 99.36%, an overall receiver operating characteristic-area under the curve of 99.2%, a correlation with ground truth of 98.3%, and a histopathologic correlation of 98.6%.
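For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute accuracy and a multiclass ROC-AUC. The abstract does not state how the overall ROC-AUC was aggregated, so the macro-averaged one-vs-rest scheme here is an assumption, and the class names and scores are toy values invented for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def binary_auc(labels, scores):
    """AUC as the chance a positive outscores a negative (ties count 0.5)."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, class_scores):
    """Average the one-vs-rest AUC over every class (assumed aggregation)."""
    aucs = []
    for cls, scores in class_scores.items():
        labels = [1 if t == cls else 0 for t in y_true]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


# Toy data: class names and per-class scores are hypothetical.
y_true = ["benign", "malignant", "benign", "atypical"]
y_pred = ["benign", "malignant", "atypical", "atypical"]
class_scores = {
    "benign": [0.9, 0.1, 0.4, 0.3],
    "malignant": [0.05, 0.8, 0.1, 0.2],
    "atypical": [0.05, 0.1, 0.5, 0.5],
}

acc = accuracy(y_true, y_pred)
auc = macro_ovr_auc(y_true, class_scores)
```

The pairwise definition of AUC used here is quadratic in the number of samples, which is acceptable for a test set of a few hundred records but would normally be replaced by a rank-based computation for larger sets.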

Conclusion: Natural language processing has considerable potential to automate the extraction of clinical features from breast lesion reports. The proposed multilevel and multiclass classification approach classifies 13 different types of breast lesions with 20 different labels into five classes, which helps a physician or oncologist decide the type of treatment a patient should receive.
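The multilevel structure (labels, lesion types, classes) can be pictured as two chained lookups. The sketch below is only an illustration: the abstract names none of the 20 labels, 13 lesion types, or five classes, so every entry in both tables is hypothetical, as is the reading that several labels may refer to the same lesion type.

```python
# Hypothetical first level: report label -> lesion type.
# None of these names or assignments come from the study.
LABEL_TO_LESION = {
    "fibroadenoma": "fibroadenoma",
    "fa": "fibroadenoma",
    "fibrocystic change": "fibrocystic disease",
    "ductal carcinoma": "invasive ductal carcinoma",
    "idc": "invasive ductal carcinoma",
}

# Hypothetical second level: lesion type -> treatment-relevant class.
LESION_TO_CLASS = {
    "fibroadenoma": "benign",
    "fibrocystic disease": "benign",
    "invasive ductal carcinoma": "malignant",
}


def classify(label):
    """Resolve a raw label to (lesion type, class), or (None, None) if unknown."""
    lesion = LABEL_TO_LESION.get(label.strip().lower())
    if lesion is None:
        return None, None
    return lesion, LESION_TO_CLASS[lesion]


result = classify(" IDC ")
```

Keeping the two levels in separate tables means a new label can be added without touching the class assignments, and a lesion type can be reassigned to another class in one place.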

Source
http://dx.doi.org/10.1200/CCI.22.00036

Publication Analysis

Top Keywords

breast lesions: 16
multilevel multiclass: 12
multiclass classification: 12
natural language: 8
language processing: 8
automate extraction: 8
types breast: 8
random forest: 8
forest feed-forward: 8
feed-forward neural: 8

Similar Publications

ManiNeg: Manifestation-guided multimodal pretraining for mammography screening.

Comput Biol Med

January 2025

School of Automation Science and Engineering, South China University of Technology, Guangzhou, China.

Breast cancer poses a significant health threat worldwide. Contrastive learning has emerged as an effective method to extract critical lesion features from mammograms, thereby offering a potent tool for breast cancer screening and analysis. A crucial aspect of contrastive learning is negative sampling, where the selection of hard negative samples is essential for driving representations to retain detailed lesion information.

Background: The incidence of margin re-excision following breast conserving surgery (BCS) is a quality measure in the National Health Service. The threshold is less than 20% of all BCS procedures. Despite three decades of studies and a wealth of literature identifying multiple factors associated with increased risk for margin involvement, an accepted threshold rate affecting one in five procedures remains high.

Background: Dermatomyositis is a chronic inflammatory condition affecting muscles and skin, often associated with an increased risk of cancer. Specific autoantibodies, including anti-TIF1 (Transcription Intermediary Factor 1), have been linked to this risk. We present a case of dermatomyositis in a male patient positive for anti-TIF1 antibodies, subsequently diagnosed with squamous cell carcinoma of the tonsil, a novel association not previously documented.

This study aimed to explore the diagnostic value of two cytology techniques, liquid-based cytology of mammary ductal lavage fluid and nipple discharge smear cytology, for intraductal lesions in patients with pathological nipple discharge (PND). This retrospective analysis included 119 patients with PND who underwent surgical treatment. Before surgery, all of them also underwent fiberoptic ductoscopy (FDS), nipple discharge smear cytology, and liquid-based cytology of ductal lavage fluid.

Myofibroblastoma is a rare mesenchymal tumour known for its benign nature but complex diagnostic pathway. A woman in her 40s presented with a painless breast mass, initially reported as a fibroadenoma on ultrasound mammography and as a benign to borderline phyllodes tumour on fine needle aspiration cytology. Contrast-enhanced CT was reported as carcinoma of the breast with Breast Imaging and Reporting Data System (BIRADS)-6.
