Transposable elements (TEs) and tandem repeat arrays are ubiquitous components of genomes across all domains of life. Many types of repetitive DNA do not appear to encode functional proteins, and those that do typically encode only enzymes involved in their own replication. Nevertheless, repetitive DNA sequences can significantly alter genome structure and can have a profound impact on an organism's biology at both the molecular and organismal levels. Advances in long-read sequencing technology have enabled the resolution of previously collapsed, repeat-rich contigs and scaffolds, which has made the accurate annotation of TEs and other repetitive sequences a crucial early step in genome analysis. Here, we provide a detailed tutorial for streamlined annotation of TEs and repeats in the genome of the model plant maize. Maize is ideally suited to illustrate these procedures because of its repeat-rich genome and the volume of publicly available, high-quality genomic resources. We outline four possible approaches for TE and repeat annotation, each aimed at accommodating a different set of scientific interests. Additionally, we demonstrate how to evaluate annotation quality and provide scripts to help graphically depict TE and repeat landscapes. Although the protocol is tailored for maize, we offer pointers throughout for researchers working on other systems, and we expect that these procedures will be broadly applicable to any eukaryotic genome.
DOI: http://dx.doi.org/10.1101/pdb.prot108578