We present a novel approach to genome-wide association studies (GWAS) by leveraging unstructured, spoken phenotypic descriptions to identify genomic regions associated with maize traits. Utilizing the Wisconsin Diversity panel, we collected spoken descriptions of Zea mays ssp. mays traits, converting these qualitative observations into quantitative data amenable to GWAS analysis.
Objectives: Phenotyping plants in a field environment can involve a variety of methods including the use of automated instruments and labor-intensive manual measurement and scoring. Researchers also collect language-based phenotypic descriptions and use controlled vocabularies and structures such as ontologies to enable computation on descriptive phenotype data, including methods to determine phenotypic similarities. In this study, spoken descriptions of plants were collected and observers were instructed to use their own vocabulary to describe plant features that were present and visible.
Objectives: We annotated the latest published sequences of the 26 Zea mays Nested Association Mapping (NAM) founder lines using GOMAP, the Gene Ontology Meta Annotator for Plants. The maize NAM panel enables researchers to understand and identify the genetic basis of complex traits. Annotations of predicted functions for genes can help researchers investigate gene-phenotype associations, prioritize candidate genes for phenotypes of interest, and formulate testable hypotheses about gene function/phenotype associations.
Background: Genome-wide gene function annotations are useful for hypothesis generation and for prioritizing candidate genes potentially responsible for phenotypes of interest. We functionally annotated the genes of 18 crop plant genomes across 14 species using the GOMAP pipeline.
Results: Compared with existing GO annotation datasets, GOMAP-generated datasets cover more genes, contain more GO terms, and are similar in quality (based on precision and recall metrics, using existing gold standards as the basis for comparison).
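To illustrate what precision and recall mean when comparing an annotation dataset against a gold standard, here is a minimal Python sketch. It assumes exact matching of GO term identifiers per gene and ignores the GO hierarchy, so it is not necessarily the evaluation procedure used in the study. The function name `precision_recall`, the gene names, and the term assignments are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple


def precision_recall(
    predicted: Dict[str, Set[str]],
    gold: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Micro-averaged precision and recall over (gene, GO term) pairs.

    Assumption: a predicted term counts as correct only if the identical
    term is present for the same gene in the gold standard.
    """
    true_pos = false_pos = false_neg = 0
    for gene in set(predicted) | set(gold):
        pred_terms = predicted.get(gene, set())
        gold_terms = gold.get(gene, set())
        true_pos += len(pred_terms & gold_terms)   # predicted and in gold
        false_pos += len(pred_terms - gold_terms)  # predicted, not in gold
        false_neg += len(gold_terms - pred_terms)  # in gold, not predicted
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Hypothetical toy data: two genes with a few GO terms each.
predicted = {
    "gene1": {"GO:0008150", "GO:0003674"},
    "gene2": {"GO:0005575"},
}
gold = {
    "gene1": {"GO:0008150"},
    "gene2": {"GO:0005575", "GO:0003674"},
}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case, two of the three predicted pairs are in the gold standard and two of the three gold pairs are recovered, so both metrics come out to two thirds. Hierarchy-aware evaluations, which credit predictions of ancestor or descendant terms, would give different values.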
Many newly observed phenotypes are first described, then experimentally manipulated. These language-based descriptions appear in both the literature and in community datastores. To standardize phenotypic descriptions and enable simple data aggregation and analysis, controlled vocabularies and specific data architectures have been developed.