The Planteome project (https://planteome.org/) provides a suite of reference and crop-specific ontologies and an integrated knowledgebase of plant genomics data. The plant genomics data in the Planteome has been obtained through manual and automated curation and sourced from more than 40 partner databases and resources.
Over the last couple of decades, there has been a rapid growth in the number and scope of agricultural genetics, genomics and breeding databases and resources. The AgBioData Consortium (https://www.agbiodata.
Introduction: Climate change is already affecting ecosystems around the world and forcing us to adapt to meet societal needs. The speed with which climate change is progressing necessitates a massive scaling up of the number of species with understood genotype-environment-phenotype (G×E×P) dynamics in order to increase ecosystem and agriculture resilience. An important part of predicting phenotype is understanding the complex gene regulatory networks present in organisms.
The Gene Ontology (GO) knowledgebase (http://geneontology.org) is a comprehensive resource concerning the functions of genes and gene products (proteins and noncoding RNAs). GO annotations cover genes from organisms across the tree of life as well as viruses, though most gene function knowledge currently derives from experiments carried out in a relatively small number of model organisms.
Heterogeneous and multidisciplinary data generated by research on sustainable global agriculture and agrifood systems requires quality data labeling or annotation in order to be interoperable. As recommended by the FAIR principles, data, labels, and metadata must use controlled vocabularies and ontologies that are popular in the knowledge domain and commonly used by the community. Despite the existence of robust ontologies in the Life Sciences, there is currently no comprehensive full set of ontologies recommended for data annotation across agricultural research disciplines.
The Plant Ontology (PO) is a community resource consisting of standardized terms, definitions, and logical relations describing plant structures and development stages, augmented by a large database of annotations from genomic and phenomic studies. This paper describes the structure of the ontology and the design principles we used in constructing PO terms for plant development stages. It also provides details of the methodology and rationale behind our revision and expansion of the PO to cover development stages for all plants, particularly the land plants (bryophytes through angiosperms).
The future of agricultural research depends on data. The sheer volume of agricultural biological data being produced today makes excellent data management essential. Governmental agencies, publishers and science funders require data management plans for publicly funded research.
In this article, we present a joint effort of the wheat research community, along with data and ontology experts, to develop wheat data interoperability guidelines. Interoperability is the ability of two or more systems and devices to cooperate and exchange data, and interpret that shared information. Interoperability is a growing concern to the wheat scientific community, and agriculture in general, as the need to interpret the deluge of data obtained through high-throughput technologies grows.
The Planteome project (http://www.planteome.org) provides a suite of reference and species-specific ontologies for plants and annotations to genes and phenotypes.
Fully automated text mining (TM) systems promote efficient literature searching, retrieval, and review but are not sufficient to produce ready-to-consume curated documents. These systems are not meant to replace biocurators, but instead to assist them in one or more literature curation steps. To do so, the user interface is an important aspect that needs to be considered for tool adoption.
Understanding the interplay between environmental conditions and phenotypes is a fundamental goal of biology. Unfortunately, data that include observations on phenotype and environment are highly heterogeneous and thus difficult to find and integrate. One approach that is likely to improve the status quo involves the use of ontologies to standardize and link data about phenotypes and environments.
The use of controlled, structured vocabularies (ontologies) has become a critical tool for scientists in the post-genomic era of massive datasets. Adoption and integration of common vocabularies and annotation practices enables cross-species comparative analyses and increases data sharing and reusability. The Plant Ontology (PO; http://www.
Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence.
The seed oil of meadowfoam, a new crop in the Limnanthaceae family, is highly enriched in very long chain fatty acids that are desaturated at the Δ5 position. The unusual oil is desirable for cosmetics and innovative industrial applications and the seed meal remaining after oil extraction contains glucolimnanthin, a methoxylated benzylglucosinolate whose degradation products are herbicidal and anti-microbial. Here we describe EST analysis of the developing seed transcriptome that identified major genes involved in biosynthesis and assembly of the seed oil and in glucosinolate metabolic pathways.
Background: Plant phenotype datasets include many different types of data, formats, and terms from specialized vocabularies. Because these datasets were designed for different audiences, they frequently contain language and details tailored to investigators with different research objectives and backgrounds. Although phenotype comparisons across datasets have long been possible on a small scale, comprehensive queries and analyses that span a broad set of reference species, research disciplines, and knowledge domains continue to be severely limited by the absence of a common semantic framework.
Background: Large quantities of digital images are now generated for biological collections, including those developed in projects premised on the high-throughput screening of genome-phenome experiments. These images often carry annotations on taxonomy and observable features, such as anatomical structures and phenotype variations often recorded in response to the environmental factors under which the organisms were sampled. At present, most of these annotations are described in free text, may involve limited use of non-standard vocabularies, and rarely specify precise coordinates of features on the image plane such that a computer vision algorithm could identify, extract and annotate them.
Despite a large and multifaceted effort to understand the vast landscape of phenotypic data, their current form inhibits productive data analysis. The lack of a community-wide, consensus-based, human- and machine-interpretable language for describing phenotypes and their genomic and environmental contexts is perhaps the most pressing scientific bottleneck to integration across many key fields in biology, including genomics, systems biology, development, medicine, evolution, ecology, and systematics. Here we survey the current phenomics landscape, including data resources and handling, and the progress that has been made to accurately capture relevant data descriptions for phenotypes.
In many databases, biocuration primarily involves literature curation, which usually involves retrieving relevant articles, extracting information that will translate into annotations and identifying new incoming literature. As the volume of biological literature increases, the use of text mining to assist in biocuration becomes increasingly relevant. A number of groups have developed tools for text mining from a computer science/linguistics perspective, and there are many initiatives to curate some aspect of biology from the literature.
The Plant Ontology (PO; http://www.plantontology.org/) is a publicly available, collaborative effort to develop and maintain a controlled, structured vocabulary ('ontology') of terms to describe plant anatomy, morphology and the stages of plant development.
WormBase, dictyBase and The Arabidopsis Information Resource (TAIR) are model organism databases containing information about Caenorhabditis elegans and other nematodes, the social amoeba Dictyostelium discoideum and related Dictyostelids and the flowering plant Arabidopsis thaliana, respectively. Each database curates multiple data types from the primary research literature. In this article, we describe the curation workflow at WormBase, with particular emphasis on our use of text-mining tools (BioCreative 2012, Workshop Track II).
Premise of the Study: Bio-ontologies are essential tools for accessing and analyzing the rapidly growing pool of plant genomic and phenomic data. Ontologies provide structured vocabularies to support consistent aggregation of data and a semantic framework for automated analyses and reasoning. They are a key component of the semantic web.
The expression level of bar, which encodes phosphinothricin acetyltransferase (PAT), was correlated with the inviability of barley hybrids between 20 Golden Promise-derived transgenic lines (Ds-bar lines) and a specialized genetic marker stock, Oregon Wolfe Barley Dominant (OWBD). Each Ds-bar line was homozygous for a modified maize Ds element that encoded bar and that had been delivered via transposition to a unique location. All Ds-bar lines were viable and morphologically similar.
Transposable elements have certain advantages over other approaches for identifying and determining gene function in large genome cereals. Different strategies have been used to exploit the maize Activator/dissociation (Ac/Ds) transposon system for functional genomics in heterologous species. Either large numbers of independent Ds insertion lines or transposants (TNPs) are generated and screened phenotypically, or smaller numbers of TNPs are produced, Ds locations mapped and remobilized for localized gene targeting.