Objective: The study sought to develop and evaluate a knowledge-based data augmentation method to improve the performance of deep learning models for biomedical natural language processing by overcoming training data scarcity.
Materials and Methods: We extended the easy data augmentation (EDA) method for biomedical named entity recognition (NER) by incorporating knowledge from the Unified Medical Language System (UMLS); we call the resulting method UMLS-EDA. We designed experiments to systematically evaluate the effect of UMLS-EDA on popular deep learning architectures for both NER and classification. We also compared UMLS-EDA to BERT.
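The abstract does not spell out the augmentation operations, so the following is only a minimal sketch of one plausible ingredient: replacing a labeled entity mention with a synonym that shares its concept, while regenerating the BIO tags so the NER labels stay aligned. The dictionary `TOY_SYNONYMS`, the function `replace_entities`, and the label name `Problem` are hypothetical illustrations, not the authors' implementation and not a real UMLS API.

```python
import random

# Hypothetical stand-in for a UMLS lookup: lowercased mention -> synonyms of the same concept.
TOY_SYNONYMS = {
    "heart attack": ["myocardial infarction", "MI"],
    "high blood pressure": ["hypertension"],
}


def replace_entities(tokens, labels, synonyms, rng):
    """Swap each BIO-labeled entity span for a random synonym and rebuild its labels."""
    out_tokens, out_labels = [], []
    i = 0
    while i < len(tokens):
        if labels[i].startswith("B-"):
            etype = labels[i][2:]
            # Find the end of the entity span (consecutive I- tags of the same type).
            j = i + 1
            while j < len(tokens) and labels[j] == "I-" + etype:
                j += 1
            mention = " ".join(tokens[i:j]).lower()
            options = synonyms.get(mention)
            # Keep the original span when the toy dictionary has no entry for it.
            new_span = rng.choice(options).split() if options else tokens[i:j]
            out_tokens.extend(new_span)
            out_labels.extend(["B-" + etype] + ["I-" + etype] * (len(new_span) - 1))
            i = j
        else:
            out_tokens.append(tokens[i])
            out_labels.append(labels[i])
            i += 1
    return out_tokens, out_labels


if __name__ == "__main__":
    tokens = ["Patient", "had", "a", "heart", "attack", "."]
    labels = ["O", "O", "O", "B-Problem", "I-Problem", "O"]
    print(replace_entities(tokens, labels, TOY_SYNONYMS, random.Random(0)))
```

Rebuilding the tags from the new span length, rather than copying the old ones, is what keeps tokens and labels the same length when a two-word mention becomes a one-word synonym.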
Results: UMLS-EDA substantially improves NER performance over the original long short-term memory conditional random fields (LSTM-CRF) model (micro-F1 score: +5%, +17%, and +15%), helps the LSTM-CRF model (micro-F1 score: 0.66) outperform LSTM-CRF with transfer learning by BERT (0.63), and improves the performance of the state-of-the-art sentence classification model. The largest gain in micro-F1 score is 9%, from 0.75 to 0.84, which exceeds the score of classifiers with BERT pretraining (0.82).
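For readers unfamiliar with the metric reported above, micro-F1 pools true positives, false positives, and false negatives across all classes before computing precision and recall. The sketch below illustrates that definition only; the function name `micro_f1` and the per-class counts are made up for the example and are not taken from the study.

```python
def micro_f1(counts):
    """Compute micro-averaged F1 from per-class tp/fp/fn counts."""
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    if tp == 0:
        # No correct predictions: define the score as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts for two entity types.
    example = {
        "Problem": {"tp": 8, "fp": 2, "fn": 2},
        "Treatment": {"tp": 1, "fp": 1, "fn": 3},
    }
    print(round(micro_f1(example), 4))
```

Because the counts are pooled, frequent classes dominate the score, which is one reason micro-F1 is commonly reported for NER datasets with uneven entity-type frequencies.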
Conclusions: This study presents UMLS-EDA, a UMLS-based data augmentation method. It is effective at improving deep learning models for both NER and sentence classification, and it contributes insights for designing new deep learning approaches for low-resource biomedical domains.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7973470 | PMC |
| http://dx.doi.org/10.1093/jamia/ocaa309 | DOI Listing |