Publications by authors named "Cassandria Tay Fernandez"

Summary: Genome-wide association studies (GWAS) excel at harnessing dense genomic variant datasets to identify candidate regions responsible for producing a given phenotype. However, GWAS and traditional fine-mapping methods do not provide insight into the complex local landscape of linkage that contains and has been shaped by the causal variant(s). Here, we present crosshap, an R package that performs robust density-based clustering of variants based on their linkage profiles to capture haplotype structures in a local genomic region of interest.

Many genome annotations include false-positive gene models, leading to errors in phylogenetic and comparative studies. Here, we propose a method to support gene model prediction based on evolutionary conservation and use it to identify potentially erroneous annotations. Using this method, we developed a set of 15,345 representative gene models from 12 legume assemblies that can be used to support genome annotations for other legumes.

Pod dehiscence is a major source of yield loss in legumes, which is exacerbated by aridity. Disruptive mutations in "Pod indehiscent 1" (PDH1), a pod sclerenchyma-specific lignin biosynthesis gene, have been linked to significant reductions in dehiscence in several legume species. We compared syntenic PDH1 regions across 12 legumes and two outgroups to uncover key historical evolutionary trends at this important locus.

Rye (Secale cereale) is a climate-resilient cereal grown extensively as a grain or forage crop in Northern and Eastern Europe. In addition to being an important crop, it has been used to improve wheat through introgression of genomic regions for improved yield and disease resistance. Understanding the genomic diversity of rye will both assist the improvement of this crop and facilitate the introgression of more valuable traits into wheat.

During crop domestication and breeding, wild plant species have been shaped into modern high-yield crops and adapted to the main agro-ecological regions. However, climate change will impact crop productivity in these regions, and agriculture needs to adapt to support future food production. On a global scale, crop wild relatives grow in more diverse environments than crop species, and so may host genes that could support the adaptation of crops to new and variable environments.

Presence-absence variants (PAVs) are genomic regions present in some individuals of a species, but not others. PAVs have been shown to contribute to genomic diversity, especially in bacteria and plants. These structural variations have been linked to traits and can be used to track a species' evolutionary history.
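
As an illustration of the definition above, the following minimal Python sketch separates genes that are present in every individual from genes that are absent in at least one. The matrix, the gene names, and the `classify_genes` helper are hypothetical and are not taken from the article.

```python
# Hypothetical presence-absence matrix: one row per gene,
# one call per individual (1 = present, 0 = absent).
pav = {
    "geneA": [1, 1, 1, 1],
    "geneB": [1, 0, 1, 1],
    "geneC": [0, 0, 1, 0],
}


def classify_genes(pav_matrix):
    """Split genes into core (present in all individuals) and variable (absent in at least one)."""
    core, variable = [], []
    for gene, calls in pav_matrix.items():
        if all(calls):
            core.append(gene)
        else:
            variable.append(gene)
    return core, variable


core, variable = classify_genes(pav)
print("core:", core)
print("variable:", variable)
```

In this toy matrix only `geneA` is present in all four individuals; `geneB` and `geneC` are the presence-absence variants.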

Gene models are regions of the genome that can be transcribed into RNA and translated to proteins, or belong to a class of non-coding RNA genes. The prediction of gene models is a complex process that can be unreliable, leading to false-positive annotations. To help support the calling of confident conserved gene models and minimize false positives arising during gene model prediction, we have developed Truegene, a machine learning approach to classify potential low-confidence gene models using 14 gene-based and 41 protein-based characteristics.

Pangenomes are a rich resource to examine the genomic variation observed within a species or genus, supporting population genetics studies, with applications for the improvement of crop traits. Major crop species such as maize (Zea mays), rice (Oryza sativa), Brassica (Brassica spp.), and soybean (Glycine max) have had pangenomes constructed and released, and this has led to the discovery of valuable genes associated with disease resistance and yield components.

Pangenomes aim to represent the complete repertoire of the genome diversity present within a species or cohort of species, capturing the genomic structural variance between individuals. This genomic information, coupled with phenotypic data, can be applied to identify genes and alleles involved with abiotic stress tolerance, disease resistance, and other desirable traits. The characterisation of novel structural variants from pangenomes can support genome editing approaches such as Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated protein (CRISPR-Cas), providing functional information on gene sequences and new target sites in variant-specific genes with increased efficiency.

Pangenomes have replaced single reference genomes as genetic references, as they contain a better scope of the diversity found in a single species. This protocol outlines the iterative mapping approach to constructing a pangenome, including how to check the raw data, align the data to a reference, assemble the data, and remove potential contaminants from the final assembly.

Molecular markers provide researchers with a powerful tool for variation analysis between plant genomes. They are heritable and widely distributed across the genome and for this reason have many applications in plant taxonomy and genotyping. Over the last decade, molecular marker technology has developed rapidly and is now a crucial component for genetic linkage analysis, trait mapping, diversity analysis, and association studies.

With the assembly of increasing numbers of plant genomes, it is becoming accepted that a single reference assembly does not reflect the gene diversity of a species. The production of pangenomes, which reflect the structural variation and polymorphisms in genomes, enables in-depth comparisons of variation within species or higher taxonomic groups. In this review, we discuss the current and emerging approaches for pangenome assembly, analysis and visualisation.

Genome-wide association studies (GWAS) are a valuable approach to identify single nucleotide polymorphisms (SNPs) associated with a phenotype of interest. There are now a variety of R packages and command-line tools available to perform GWAS. Here, we provide an example of downloading and filtering SNP data, followed by GWAS analysis using the R package rMVP.
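
The SNP filtering step mentioned above can be sketched in a few lines of Python. This is not the rMVP interface and not the article's protocol. The thresholds (`min_maf=0.05`, `max_missing=0.1`), the function names, and the genotype encoding (0, 1, or 2 copies of the alternate allele, `None` for a missing call) are assumptions chosen for illustration.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency from 0/1/2 alternate-allele counts (None = missing)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))  # diploid: two alleles per individual
    return min(alt_freq, 1 - alt_freq)


def missing_rate(genotypes):
    """Return the fraction of individuals with no genotype call."""
    if not genotypes:
        return 1.0
    return sum(g is None for g in genotypes) / len(genotypes)


def filter_snps(snps, min_maf=0.05, max_missing=0.1):
    """Keep SNPs that are polymorphic enough and have few missing calls (assumed thresholds)."""
    return {
        name: calls
        for name, calls in snps.items()
        if minor_allele_frequency(calls) >= min_maf and missing_rate(calls) <= max_missing
    }


# Hypothetical genotype calls for three SNPs across ten individuals.
snps = {
    "snp1": [0, 1, 2, 1, 0, 1, 2, 0, 1, 1],        # common variant, no missing calls
    "snp2": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],        # monomorphic
    "snp3": [0, 1, None, None, 2, 1, 0, 1, 2, 0],  # 20% missing calls
}

kept = filter_snps(snps)
print(sorted(kept))
```

Here `snp2` is dropped because it is monomorphic and `snp3` because its missing rate exceeds the assumed cutoff, leaving only `snp1` for the association test.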
