Publications by authors named "Illes Solt"

Article Synopsis
  • Scientists have created a method to find new information about how certain proteins (called transcription factors) control human genes by looking through many scientific articles.
  • They discovered over 45,000 sentences that may describe these relationships, and by checking them, they found more than 300 unique interactions not listed before.
  • This new information improves our understanding of human genetics, especially in identifying genes linked to diseases, and is available for anyone to use online.

Background: Kernel-based classification is the current state of the art for extracting pairs of interacting proteins (PPIs) from free text. Various approaches have been proposed, which differ mainly in the kernel function, the type of input representation, and the feature sets. These approaches are regularly compared with each other on their overall performance across different gold-standard corpora, but little is known about their respective performance at the instance level.

Background: We report the Gene Normalization (GN) challenge in BioCreative III where participating teams were asked to return a ranked list of identifiers of the genes detected in full-text articles. For training, 32 fully and 500 partially annotated articles were prepared. A total of 507 articles were selected as the test set.

Summary: Identifying mentions of named entities, such as genes or diseases, and normalizing them to database identifiers has become an important step in many text and data mining pipelines. Despite this need, very few entity normalization systems are publicly available as source code or web services for biomedical text mining. Here we present the Gnat Java library for text retrieval, named entity recognition, and normalization of gene and protein mentions in biomedical text.

Objective: In the i2b2 Medication Extraction Challenge, medication names together with details of their administration were to be extracted from medical discharge summaries.

Design: The task of the challenge was decomposed into three pipelined components: named entity identification, context-aware filtering, and relation extraction. For named entity identification, we first investigated the rule-based (RB) method used in our solution, which ranked fifth overall at the challenge.

Objective: Automated, disease-specific classification of textual clinical discharge summaries is of great importance in the life sciences, as it helps physicians conduct medical studies by providing statistically relevant data for analysis. This can be further facilitated if, when discharge summaries are labeled, semantic labels are also extracted from the text, such as whether a given disease is present, absent, or questionable in a patient, or unmentioned in the document. The authors present a classification technique that successfully solves the semantic classification task.
