Publications by Robert Bossy

TaeC: A manually annotated text dataset for trait and phenotype extraction and entity linking in wheat breeding literature.

Claire Nédellec, Clara Sauvion, Robert Bossy, Mariya Borovikova, Louise Deléger

PLoS One

June 2024

Wheat varieties show a large diversity of traits and phenotypes. Linking them to genetic variability is essential for shorter and more efficient wheat breeding programs. A growing number of plant molecular information networks provide interlinked interoperable data to support the discovery of gene-phenotype interactions.

MilkOligoThesaurus, a dataset of mammalian milk oligosaccharide synonyms.

Mathilde Rumeau, François Fenaille, Agnès Girard, Valentin Loux, Mouhamadou Ba, Robert Bossy

Data Brief

June 2024

There is a growing interest in milk oligosaccharides (MOs) because of their numerous benefits for newborns' and long-term health. A large number of MO structures have been identified in mammalian milk. Mostly described in human milk, the oligosaccharide richness, although less broad, has also been reported for a wide range of mammalian species.

Combining different points of view on plant descriptions: mapping agricultural plant roles and biological taxa.

Florence Amardeilh, Sophie Aubin, Stephan Bernard, Sonia Bravo, Robert Bossy

Front Artif Intell

September 2023

This article describes our study on the alignment of two complementary knowledge graphs useful in agriculture: the thesaurus of cultivated plants in France named French Crop Usage (FCU) and the French national taxonomic repository TAXREF for fauna, flora, and fungi. FCU describes the usages of plants in agriculture: "" are crops used for human food, and "" are crops used for human beverage. TAXREF describes biological taxa and associated scientific names: for example, a tomato species may be "" or a grapevine species may be "".

Omnicrobe, an open-access database of microbial habitats and phenotypes using a comprehensive text mining and data fusion approach.

Sandra Dérozier, Robert Bossy, Louise Deléger, Mouhamadou Ba, Estelle Chaix

PLoS One

January 2023

The dramatic increase in the number of microbe descriptions in databases, reports, and papers presents a two-fold challenge for accessing the information: integration of heterogeneous data in a standard ontology-based representation and normalization of the textual descriptions by semantic analysis. Recent text mining methods offer powerful ways to extract textual information and generate ontology-based representation. This paper describes the design of the Omnicrobe application that gathers comprehensive information on habitats, phenotypes, and usages of microbes from scientific sources of high interest to the microbiology community.

Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical-drug relation extraction?

Anfu Tang, Louise Deléger, Robert Bossy, Pierre Zweigenbaum, Claire Nédellec

Database (Oxford)

August 2022

Collecting relations between chemicals and drugs is crucial in biomedical research. The pre-trained transformer model, e.g.

C-Norm: a neural approach to few-shot entity normalization.

Arnaud Ferré, Louise Deléger, Robert Bossy, Pierre Zweigenbaum, Claire Nédellec

BMC Bioinformatics

December 2020

Background: Entity normalization is an important information extraction task which has gained renewed attention in the last decade, particularly in the biomedical and life science domains. In these domains, and more generally in all specialized domains, this task is still challenging for the latest machine learning-based approaches, which have difficulty handling highly multi-class and few-shot learning problems. To address this issue, we propose C-Norm, a new neural approach which synergistically combines standard and weak supervision, ontological knowledge integration and distributional semantics.

WTO, an ontology for wheat traits and phenotypes in scientific publications.

Claire Nédellec, Liliana Ibanescu, Robert Bossy, Pierre Sourdille

Genomics Inform

June 2020

Phenotyping is a major issue for wheat agriculture to meet the challenges of adaptation of wheat varieties to climate change and chemical input reduction in crop. The need to improve the reuse of observations and experimental data has led to the creation of reference ontologies to standardize descriptions of phenotypes and to facilitate their comparison. The scientific literature is largely under-exploited, although extremely rich in phenotype descriptions associated with cultivars and genetic information.

Improving the CONTES method for normalizing biomedical text entities with concepts from an ontology with (almost) no training data.

Arnaud Ferré, Mouhamadou Ba, Robert Bossy

Genomics Inform

June 2019

Entity normalization, or entity linking in the general domain, is an information extraction task that aims to annotate/bind multiple words/expressions in raw text with semantic references, such as concepts of an ontology. An ontology consists minimally of a formally organized vocabulary or hierarchy of terms, which captures knowledge of a domain. Presently, machine-learning methods, often coupled with distributional representations, achieve good performance.

Text mining tools for extracting information about microbial biodiversity in food.

Estelle Chaix, Louise Deléger, Robert Bossy, Claire Nédellec

Food Microbiol

August 2019

Information on food microbial diversity is scattered across millions of scientific papers. Researchers need tools to assist their bibliographic search in such large collections. Text mining and knowledge engineering methods are useful to automatically and efficiently find relevant information in Life Science.

Text mining resources for the life sciences.

Piotr Przybyła, Matthew Shardlow, Sophie Aubin, Robert Bossy, Richard Eckart de Castilho

Database (Oxford)

November 2017

Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work.

Overview of the gene regulation network and the bacteria biotope tasks in BioNLP'13 shared task.

Robert Bossy, Wiktoria Golik, Zorana Ratkovic, Dialekti Valsamou, Philippe Bessières

BMC Bioinformatics

February 2016

Background: We present the two Bacteria Track tasks of BioNLP 2013 Shared Task (ST): Gene Regulation Network (GRN) and Bacteria Biotope (BB). These tasks were previously introduced in the 2011 BioNLP-ST Bacteria Track as Bacteria Gene Interaction (BI) and Bacteria Biotope (BB). The Bacteria Track was motivated by a need to develop specific BioNLP tools for fine-grained event extraction in bacteria biology.

BioNLP Shared Task--The Bacteria Track.

Robert Bossy, Julien Jourde, Alain-Pierre Manine, Philippe Veber, Erick Alphonse

BMC Bioinformatics

June 2012

Background: We present the BioNLP 2011 Shared Task Bacteria Track, the first Information Extraction challenge entirely dedicated to bacteria. It includes three tasks that cover different levels of biological knowledge. The Bacteria Gene Renaming supporting task is aimed at extracting gene renaming and gene name synonymy in PubMed abstracts.

Complete genome sequence of the fish pathogen Flavobacterium psychrophilum.

Eric Duchaud, Mekki Boussaha, Valentin Loux, Jean-François Bernardet, Christian Michel, Robert Bossy

Nat Biotechnol

July 2007

We report here the complete genome sequence of the virulent strain JIP02/86 (ATCC 49511) of Flavobacterium psychrophilum, a widely distributed pathogen of wild and cultured salmonid fish. The genome consists of a 2,861,988-base pair (bp) circular chromosome with 2,432 predicted protein-coding genes. Among these predicted proteins, stress response mediators, gliding motility proteins, adhesins and many putative secreted proteases are probably involved in colonization, invasion and destruction of the host tissues.

The complete genome sequence of the meat-borne lactic acid bacterium Lactobacillus sakei 23K.

Stéphane Chaillou, Marie-Christine Champomier-Vergès, Monique Cornet, Anne-Marie Crutz-Le Coq, Anne-Marie Dudez, Robert Bossy

Nat Biotechnol

December 2005

Lactobacillus sakei is a psychrotrophic lactic acid bacterium found naturally on fresh meat and fish. This microorganism is widely used in the manufacture of fermented meats and has biotechnological potential in biopreservation and food safety. We have explored the 1,884,661-base-pair (bp) circular chromosome of strain 23K encoding 1,883 predicted genes.
