Wheat varieties show a large diversity of traits and phenotypes. Linking them to genetic variability is essential for shorter and more efficient wheat breeding programs. A growing number of plant molecular information networks provide interlinked interoperable data to support the discovery of gene-phenotype interactions.
There is a growing interest in milk oligosaccharides (MOs) because of their numerous benefits for newborns' and long-term health. A large number of MO structures have been identified in mammalian milk. Mostly described in human milk, the oligosaccharide richness, although less broad, has also been reported for a wide range of mammalian species.
This article describes our study on the alignment of two complementary knowledge graphs useful in agriculture: the thesaurus of cultivated plants in France named French Crop Usage (FCU) and the French national taxonomic repository TAXREF for fauna, flora, and fungi. FCU describes the usages of plants in agriculture: "" are crops used for human food, and "" are crops used for human beverage. TAXREF describes biological taxa and associated scientific names: for example, a tomato species may be "" or a grapevine species may be "".
The dramatic increase in the number of microbe descriptions in databases, reports, and papers presents a two-fold challenge for accessing the information: integration of heterogeneous data in a standard ontology-based representation and normalization of the textual descriptions by semantic analysis. Recent text mining methods offer powerful ways to extract textual information and generate ontology-based representation. This paper describes the design of the Omnicrobe application that gathers comprehensive information on habitats, phenotypes, and usages of microbes from scientific sources of high interest to the microbiology community.
Collecting relations between chemicals and drugs is crucial in biomedical research. The pre-trained transformer model, e.g.
Background: Entity normalization is an important information extraction task which has gained renewed attention in the last decade, particularly in the biomedical and life science domains. In these domains, and more generally in all specialized domains, this task is still challenging for the latest machine learning-based approaches, which have difficulty handling highly multi-class and few-shot learning problems. To address this issue, we propose C-Norm, a new neural approach which synergistically combines standard and weak supervision, ontological knowledge integration and distributional semantics.
Phenotyping is a major issue for wheat agriculture to meet the challenges of adaptation of wheat varieties to climate change and chemical input reduction in crops. The need to improve the reuse of observations and experimental data has led to the creation of reference ontologies to standardize descriptions of phenotypes and to facilitate their comparison. The scientific literature is largely under-exploited, although extremely rich in phenotype descriptions associated with cultivars and genetic information.
Entity normalization, or entity linking in the general domain, is an information extraction task that aims to annotate/bind multiple words/expressions in raw text with semantic references, such as concepts of an ontology. An ontology consists minimally of a formally organized vocabulary or hierarchy of terms, which captures knowledge of a domain. Presently, machine-learning methods, often coupled with distributional representations, achieve good performance.
Information on food microbial diversity is scattered across millions of scientific papers. Researchers need tools to assist their bibliographic search in such large collections. Text mining and knowledge engineering methods are useful to automatically and efficiently find relevant information in Life Science.
Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work.
Background: We present the two Bacteria Track tasks of BioNLP 2013 Shared Task (ST): Gene Regulation Network (GRN) and Bacteria Biotope (BB). These tasks were previously introduced in the 2011 BioNLP-ST Bacteria Track as Bacteria Gene Interaction (BI) and Bacteria Biotope (BB). The Bacteria Track was motivated by a need to develop specific BioNLP tools for fine-grained event extraction in bacteria biology.
Background: We present the BioNLP 2011 Shared Task Bacteria Track, the first Information Extraction challenge entirely dedicated to bacteria. It includes three tasks that cover different levels of biological knowledge. The Bacteria Gene Renaming supporting task is aimed at extracting gene renaming and gene name synonymy in PubMed abstracts.
We report here the complete genome sequence of the virulent strain JIP02/86 (ATCC 49511) of Flavobacterium psychrophilum, a widely distributed pathogen of wild and cultured salmonid fish. The genome consists of a 2,861,988-base pair (bp) circular chromosome with 2,432 predicted protein-coding genes. Among these predicted proteins, stress response mediators, gliding motility proteins, adhesins and many putative secreted proteases are probably involved in colonization, invasion and destruction of the host tissues.
Lactobacillus sakei is a psychrotrophic lactic acid bacterium found naturally on fresh meat and fish. This microorganism is widely used in the manufacture of fermented meats and has biotechnological potential in biopreservation and food safety. We have explored the 1,884,661-base-pair (bp) circular chromosome of strain 23K encoding 1,883 predicted genes.