Publications by authors named "Alex Thomo"

To address privacy and ethical issues in using health data for machine learning, we evaluate the scalability of advanced synthetic data generation methods such as GANs, VAEs, CopulaGAN, and transformer models, specifically for patient service utilization data. Our study examines five models on data from a Canadian health authority, focusing on training and generation efficiency, data resemblance, and practical utility. Our findings indicate that statistical models excel in efficiency, while most models produce synthetic data that closely mirrors real data and is useful for real-world applications.

PubMed Retrieval with RAG Techniques.

Stud Health Technol Inform

August 2024

This study explores the application of Retrieval-Augmented Generation (RAG) in enhancing medical information retrieval from the PubMed database. By integrating RAG with Large Language Models (LLMs), we aim to improve the accuracy and relevance of medical information provided to healthcare professionals. Our evaluation on a labeled dataset of 1,000 queries demonstrates promising results in answer relevance, while highlighting areas for improvement in groundedness and context relevance.


Truss decomposition is a popular notion of hierarchical dense substructures in graphs. In a nutshell, a k-truss is the largest subgraph in which every edge is contained in at least k triangles. Truss decomposition aims to compute k-trusses for each possible value of k.
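
The definition above can be illustrated with a minimal peeling sketch for ordinary (deterministic) undirected graphs. It is not the method of the paper, only a simple, unoptimized illustration. It assumes the convention that every edge of a k-truss lies in at least k triangles of that subgraph (some texts use k - 2 instead), and the function name truss_decomposition is hypothetical.

```python
def truss_decomposition(edges):
    """Return {edge: trussness} for an undirected simple graph.

    Assumed convention: an edge has trussness k if k is the largest value
    such that the edge belongs to a subgraph in which every edge lies in
    at least k triangles. Nodes are assumed to be mutually comparable
    (for example, integers) so that each edge has one canonical key.
    """
    def key(u, v):
        return (u, v) if u <= v else (v, u)

    # Build adjacency sets, ignoring self-loops and duplicate edges.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Support of an edge = number of triangles that contain it.
    support = {key(u, v): len(adj[u] & adj[v])
               for u in adj for v in adj[u] if u < v}

    trussness = {}
    k = 0
    while support:
        # Peel the remaining edge with the smallest support.
        e = min(support, key=support.get)
        k = max(k, support[e])
        trussness[e] = k
        u, v = e
        # Each triangle through (u, v) disappears, so the two other
        # edges of that triangle lose one unit of support.
        for w in adj[u] & adj[v]:
            support[key(u, w)] -= 1
            support[key(v, w)] -= 1
        adj[u].discard(v)
        adj[v].discard(u)
        del support[e]
    return trussness
```

The k-truss is then the set of edges whose trussness is at least k. Selecting the minimum with a linear scan keeps the sketch short; a bucket queue would be the usual choice for large graphs.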


The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for millions of deaths around the world. To improve understanding of SARS-CoV-2 and human protein interactions and to generate new hypotheses about them, we use the information-rich Biomine probabilistic database to extend the experimentally identified SARS-CoV-2-human protein-protein interaction (PPI) network in silico. We generate an extended network by integrating information from the Biomine database, the PPI network, and other experimentally validated results.


The objective of this study is to apply data mining techniques to determine factors that are commonly associated with liver cancer incidence, using an anonymized data set of 6064 patients from the British Columbia Cancer Agency (BCCA). The association rules indicate that in BC the patient demographic factors associated with increased liver cancer incidence include the 60-69 age range, male gender, and geographic location in the Greater Vancouver area. The main factors associated with decreased survivability in BC were being male and being in the 70-79 age range.
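
As a rough illustration of how an association rule is scored, the sketch below computes the standard support, confidence, and lift of a rule "antecedent implies consequent" over a list of item sets. The function name rule_metrics is hypothetical and any records passed to it would be placeholders; nothing here reproduces the study's data, attributes, or mining procedure.

```python
def rule_metrics(records, antecedent, consequent):
    """Score the rule antecedent -> consequent over a list of item sets.

    records:    non-empty list of sets, one set of items per record
    antecedent: set of items on the left-hand side of the rule
    consequent: set of items on the right-hand side of the rule
    Returns (support, confidence, lift).
    """
    n = len(records)
    n_ante = sum(1 for r in records if antecedent <= r)
    n_cons = sum(1 for r in records if consequent <= r)
    n_both = sum(1 for r in records if (antecedent | consequent) <= r)

    # Support: fraction of records containing both sides of the rule.
    support = n_both / n
    # Confidence: among records matching the antecedent, the fraction
    # that also match the consequent.
    confidence = n_both / n_ante if n_ante else 0.0
    # Lift: confidence relative to the consequent's overall frequency;
    # values above 1 suggest a positive association.
    lift = confidence / (n_cons / n) if n_cons else 0.0
    return support, confidence, lift
```

A rule is typically kept only when its support and confidence exceed chosen thresholds, which is how factor combinations such as an age range and a gender can surface as associated with an outcome.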
