The use of software applications has grown tremendously with the onset of Industry 4.0, leading to the accumulation of data in many forms. Scientific, biological, and social media text collections demand efficient machine learning methods for data interpretability, which organizations need for decision-making. Topic models can be applied to the text mining of biomedical articles, scientific articles, Twitter data, and blog posts. This paper compares the performance of Latent Dirichlet Allocation (LDA), Dynamic Topic Model (DTM), and Embedded Topic Model (ETM) techniques. It also proposes an incremental topic model with word embedding (ITMWE), which processes large text data in an incremental environment and extracts the latent topics that best describe the document collections. Experiments in both offline and online settings on large real-world document collections, namely CORD-19, NIPS papers, and Tweet datasets, show that, while LDA and DTM are good models for discovering word-level topics, ITMWE discovers better document-level topic groups more efficiently in a dynamic environment, which is crucial in text mining applications.
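
To make the idea of topic extraction in an incremental environment concrete, the following is a minimal sketch of a toy collapsed Gibbs sampler for LDA that is fed documents batch by batch and carries its topic-word counts forward. It is not ITMWE, whose details the abstract does not give, and it uses no word embeddings. The class name `IncrementalLDA`, the method names, the hyperparameter defaults, and the tiny example batches are all assumptions made for illustration.

```python
import random


class IncrementalLDA:
    """Toy collapsed Gibbs sampler for LDA, fed documents batch by batch.

    Hypothetical illustration only; not the ITMWE model from the paper.
    """

    def __init__(self, num_topics, alpha=0.1, beta=0.01, seed=0):
        self.num_topics = num_topics
        self.alpha = alpha            # document-topic smoothing
        self.beta = beta              # topic-word smoothing
        self.rng = random.Random(seed)
        self.vocab = {}               # word -> integer id, grows with new batches
        self.topic_word = [{} for _ in range(num_topics)]  # per-topic word counts
        self.topic_total = [0] * num_topics                # tokens per topic

    def _move(self, doc_counts, word, topic, delta):
        """Add (+1) or remove (-1) one token assignment in all count tables."""
        doc_counts[topic] += delta
        self.topic_word[topic][word] = self.topic_word[topic].get(word, 0) + delta
        self.topic_total[topic] += delta

    def partial_fit(self, docs, sweeps=30):
        """Sample topics for a new batch; counts from earlier batches stay fixed."""
        ids = [[self.vocab.setdefault(w, len(self.vocab)) for w in doc]
               for doc in docs]
        vocab_size = len(self.vocab)
        doc_topic = [[0] * self.num_topics for _ in ids]
        assign = []
        for d, doc in enumerate(ids):  # random initial assignment
            zs = [self.rng.randrange(self.num_topics) for _ in doc]
            for w, z in zip(doc, zs):
                self._move(doc_topic[d], w, z, +1)
            assign.append(zs)
        topics = range(self.num_topics)
        for _ in range(sweeps):
            for d, doc in enumerate(ids):
                for i, w in enumerate(doc):
                    # Remove the token, then resample its topic from the
                    # collapsed conditional distribution.
                    self._move(doc_topic[d], w, assign[d][i], -1)
                    weights = [
                        (doc_topic[d][k] + self.alpha)
                        * (self.topic_word[k].get(w, 0) + self.beta)
                        / (self.topic_total[k] + vocab_size * self.beta)
                        for k in topics
                    ]
                    assign[d][i] = self.rng.choices(topics, weights=weights)[0]
                    self._move(doc_topic[d], w, assign[d][i], +1)
        return doc_topic  # per-document topic counts for this batch

    def top_words(self, topic, n=3):
        """Return up to n highest-count words currently assigned to a topic."""
        by_id = {i: w for w, i in self.vocab.items()}
        ranked = sorted(self.topic_word[topic].items(), key=lambda kv: -kv[1])
        return [by_id[i] for i, count in ranked[:n] if count > 0]


# Made-up batches arriving one after another.
model = IncrementalLDA(num_topics=2)
batch_1 = [["virus", "vaccine", "virus"], ["neural", "network", "neural"]]
batch_2 = [["vaccine", "trial", "virus"], ["network", "gradient"]]
first = model.partial_fit(batch_1)
second = model.partial_fit(batch_2)
print(model.top_words(0), model.top_words(1))
```

In this sketch, documents from earlier batches are never resampled; only their accumulated counts influence how new documents are assigned, which is one simple way to avoid reprocessing the whole collection when new text arrives.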


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9676895
DOI: http://dx.doi.org/10.1007/s41870-022-01123-4

Publication Analysis

Top Keywords

topic model: 12
large text: 8
text mining: 8
document collections: 8
text: 5
topic: 5
extracting inferences: 4
inferences large: 4
text corpus: 4
corpus usage: 4

