Publications by Michel Oleynik

Medical Information Extraction in the Age of Deep Learning.

Yearb Med Inform

August 2020

Objectives: We survey recent developments in medical Information Extraction (IE) as reported in the literature from the past three years. Our focus is on the fundamental methodological paradigm shift from standard Machine Learning (ML) techniques to Deep Neural Networks (DNNs). We describe applications of this new paradigm concentrating on two basic IE tasks, named entity recognition and relation extraction, for two selected semantic classes, diseases and drugs (or medications), and the relations between them.

Leveraging PubMed to Create a Specialty-Based Sense Inventory for Spanish Acronym Resolution.

Alexandra Pomares-Quimbaya Pilar López-Úbeda Michel Oleynik Stefan Schulz

Stud Health Technol Inform

June 2020

Acronyms frequently occur in clinical text, which makes their identification, disambiguation and resolution an important task in clinical natural language processing. This paper contributes to acronym resolution in Spanish through the creation of a set of sense inventories organized by clinical specialty containing acronyms, their expansions, and corpus-driven features. The new acronym resource is composed of 51 clinical specialties with 3,603 acronyms in total, from which we identified 228 language independent acronyms and 391 language dependent expansions.

Character-Level Neural Language Modelling in the Clinical Domain.

Markus Kreuzthaler Michel Oleynik Stefan Schulz

Stud Health Technol Inform

June 2020

Word embeddings have become the predominant token-level representation scheme for various clinical natural language processing (NLP) tasks. More recently, character-level neural language models based on recurrent neural networks have again received attention because they achieved similar performance on various NLP benchmarks. We investigated to what extent character-based language models can be applied to the clinical domain and whether they are able to capture reasonable lexical semantics using this maximally fine-grained representation scheme.

Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification.

Michel Oleynik Amila Kugic Zdenko Kasáč Markus Kreuzthaler

J Am Med Inform Assoc

November 2019

Objective: Automated clinical phenotyping is challenging because word-based features quickly turn it into a high-dimensional problem, in which small, privacy-restricted training datasets might lead to overfitting. Pretrained embeddings might solve this issue by reusing input representation schemes trained on a larger dataset. We sought to evaluate shallow and deep learning text classifiers and the impact of pretrained embeddings on a small clinical dataset.

Unsupervised Abbreviation Expansion in Clinical Narratives.

Michel Oleynik Markus Kreuzthaler Stefan Schulz

Stud Health Technol Inform

June 2018

Clinical narratives are typically produced under time pressure, which encourages the use of abbreviations and acronyms. Expanding such short forms correctly eases text comprehension and further semantic processing. We propose a completely unsupervised and data-driven algorithm for the resolution of non-lexicalised and potentially ambiguous abbreviations.

Automated Classification of Semi-Structured Pathology Reports into ICD-O Using SVM in Portuguese.

Michel Oleynik Diogo F C Patrão Marcelo Finger

Stud Health Technol Inform

October 2017

Pathology reports are a main source of information regarding cancer diagnosis and are commonly written following semi-structured templates that include tumour localisation and behaviour. In this work, we evaluated the efficiency of support vector machines (SVMs) to classify pathology reports written in Portuguese into the International Classification of Diseases for Oncology (ICD-O), a biaxial classification of cancer topography and morphology. A partnership program with the Brazilian hospital A.

Automated Classification of Pathology Reports.

Michel Oleynik Marcelo Finger Diogo F C Patrão

Stud Health Technol Inform

December 2016

This work develops an automated classifier of pathology reports which infers the topography and the morphology classes of a tumor using codes from the International Classification of Diseases for Oncology (ICD-O). Data from 94,980 patients of the A.C.

Recruit--An Ontology Based Information Retrieval System for Clinical Trials Recruitment.

Diogo F C Patrão Michel Oleynik Felipe Massicano Ariane Morassi Sasso

Stud Health Technol Inform

December 2016

Clinical trials are studies designed to assess whether a new intervention is better than the current alternatives. However, most of them fail to recruit participants on schedule. It is hard to use Electronic Health Record (EHR) data to find eligible patients; therefore, studies rely on manual assessment, which is time-consuming, inefficient and requires specialized training.

Performance analysis of a POS tagger applied to discharge summaries in Portuguese.

Michel Oleynik Percy Nohama Pindaro Secco Cancian Stefan Schulz

Stud Health Technol Inform

April 2011

Part-of-speech taggers need a considerable amount of data to train their models. Such data is not readily available for medical texts in Portuguese. We evaluated the accuracy of a morphological tagger against a gold standard when trained with corpora of different sizes and domains.
