Information extraction and knowledge discovery for adverse drug reactions (ADRs) from large-scale clinical texts are useful and much-needed processes. Two major difficulties of this task are the shortage of domain experts to label examples and the intractability of processing unstructured clinical texts. Most previous works have addressed these issues by applying semisupervised learning for the former and a word-based approach for the latter, but they face complexity in acquiring the initial labeled data and they ignore the structured sequence of natural language. In this study, we propose automatic data labeling by distant supervision, where knowledge bases are exploited to assign a relation label to each drug-event pair in the texts, and we then use patterns to characterize the ADR relation. Multiple-instance learning with the expectation-maximization method is employed to estimate the model parameters. The method applies transductive learning to iteratively reassign the probability of each unknown drug-event pair at training time. In experiments on 50,998 discharge summaries, we evaluate our method while varying a large number of parameters, that is, pattern types, pattern-weighting models, and the initial and iterative weightings of relations for unlabeled data. Based on these evaluations, our proposed method outperforms word-based features for NB-EM (iEM), MILR, and TSVM, improving the F1 score by 11.3%, 9.3%, and 6.5%, respectively.
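The distant-supervision labeling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the knowledge base contents, the function name `label_pairs`, and the label strings are all hypothetical, chosen only to show how a knowledge base lookup might assign an initial label to each drug-event pair while leaving unmatched pairs as unknown.

```python
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]

# Hypothetical toy knowledge base of known drug-event (ADR) pairs.
# A real system would load these from curated resources; this is an assumption.
KNOWN_ADR_PAIRS: Set[Pair] = {
    ("warfarin", "bleeding"),
    ("metformin", "nausea"),
}


def label_pairs(
    pairs: Iterable[Pair],
    kb: Set[Pair] = KNOWN_ADR_PAIRS,
) -> Dict[Pair, str]:
    """Assign an initial label to each drug-event pair found in text.

    Pairs present in the knowledge base are labeled "ADR"; all others are
    left as "unknown" so that a later learning step can estimate them.
    """
    labels: Dict[Pair, str] = {}
    for drug, event in pairs:
        # Normalize case so that lookups do not depend on capitalization.
        key = (drug.lower(), event.lower())
        labels[key] = "ADR" if key in kb else "unknown"
    return labels


if __name__ == "__main__":
    extracted = [("Warfarin", "Bleeding"), ("aspirin", "headache")]
    for pair, label in label_pairs(extracted).items():
        print(pair, label)
```

In this sketch, the pairs labeled "unknown" correspond to the unlabeled drug-event pairs whose probabilities the abstract says are iteratively reassigned during training; that estimation step is not shown here.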
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5635478 | PMC |
| http://dx.doi.org/10.1155/2017/7575280 | DOI Listing |
PLoS One
January 2025
Xinjiang Institute of Technology, Aksu, China.
Facial expression recognition faces great challenges due to factors such as face similarity, image quality, and age variation. Although various existing end-to-end Convolutional Neural Network (CNN) architectures have achieved good classification results in facial expression recognition tasks, these network architectures share a common drawback that the convolutional kernel can only compute the correlation between elements of a localized region when extracting expression features from an image. This leads to difficulties for the network to explore the relationship between all the elements that make up a complete expression.
J Transl Med
January 2025
Hepatobiliary Center, The First Affiliated Hospital of Nanjing Medical University, Nanjing, 210029, China.
Background: Colorectal cancer (CRC) exhibits a high incidence globally, with the liver being the most common site of distant metastasis. At the time of diagnosis, 20-30% of CRC patients already present with liver metastases. Colorectal liver metastasis (CRLM) is a major cause of mortality among CRC patients.
J Cancer Res Ther
December 2024
Department of Urology, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, Jiangsu Province, People's Republic of China.
Background: To evaluate the association of demographic and clinicopathological characteristics with the survival of patients with testicular mixed teratoma and seminoma (TMTS).
Methods: The data of 3296 eligible patients with TMTS who underwent surgery between 2010 and 2015 were obtained from the Surveillance, Epidemiology, and End Results database. Overall survival (OS) and cancer-specific survival (CSS) were determined using the Kaplan-Meier survival curves.
Br J Surg
December 2024
Department of Surgery, Skåne University Hospital, Malmö, Sweden.
Background: Tumour deposits are a prognostic factor for overall survival and distant metastasis in lymph node-negative colorectal cancer. However, the current TNM staging system does not account for the presence of tumour deposits in lymph node-positive colorectal cancer, or for the presence of multiple deposits. This study aimed to investigate the prognostic effect of tumour deposit count in patients with colorectal cancer.
MAbs
December 2025
Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK.
In-silico prediction of protein biophysical traits is often hindered by the limited availability of experimental data and their heterogeneity. Training on limited data can lead to overfitting and poor generalizability to sequences distant from those in the training set. Additionally, inadequate use of scarce and disparate data can introduce biases during evaluation, leading to unreliable model performances being reported.