Objectives: Identifying orthology relationships among sequences is essential for understanding evolution, the diversity of life, and ancestry among organisms. To build alignments of orthologous sequences, phylogenomic pipelines often start with all-vs-all similarity searches, followed by a clustering step. For the resulting protein clusters (orthogroups) to be as accurate as possible, high-quality proteomes are needed. Here, our objective is to assemble a data set especially suited to the phylogenomic study of algae and formerly photosynthetic eukaryotes. This requires properly integrating organellar data, so that several copies of one gene (paralogs) can be distinguished, taking their cellular compartment into account where necessary.
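The search-then-cluster step mentioned above can be illustrated with a minimal sketch. It groups sequences by single linkage over pairwise hits, which is a deliberately simplified stand-in and not the clustering method used by OrthoFinder or any other specific pipeline. The hit tuples, the sequence identifiers, and the `min_score` threshold are all hypothetical.

```python
from collections import defaultdict


def cluster_hits(hits, min_score=50.0):
    """Group sequence IDs by single linkage over pairwise similarity hits.

    `hits` is an iterable of (query_id, subject_id, score) tuples, e.g. as
    parsed from an all-vs-all search. `min_score` is an illustrative cutoff.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for query, subject, score in hits:
        # Register both sequences so weakly supported ones remain as singletons.
        find(query)
        find(subject)
        if score >= min_score:
            union(query, subject)

    clusters = defaultdict(set)
    for seq_id in list(parent):
        clusters[find(seq_id)].add(seq_id)
    # Sort members and clusters for deterministic output.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical hits: A1-B1-C1 are linked, while A2 and B2 score too low.
example_hits = [("A1", "B1", 120.0), ("B1", "C1", 80.0), ("A2", "B2", 30.0)]
example_clusters = cluster_hits(example_hits)
```

Single linkage tends to merge distinct families through a single spurious hit, which is one reason real pipelines use more robust graph clustering and why input proteome quality matters.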

Data Description: We submitted 73 top-quality and taxonomically diverse proteomes to OrthoFinder. We obtained 47,266 orthogroups and identified 11,775 orthogroups containing at least two algae. Whenever possible, sequences were functionally annotated with eggNOG and tagged according to their genomic and target compartment(s). We then aligned the orthogroups and computed their phylogenetic trees with IQ-TREE. Finally, these trees were further processed by identifying and pruning the subtrees composed exclusively of plastid-bearing organisms, yielding a set of 31,784 clans suitable for studying the genome evolution of photosynthetic organisms.
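Two of the steps described above, selecting orthogroups with at least two algae and extracting subtrees made up only of plastid-bearing organisms, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the `Organism|protein` identifier format, the organism names, and both function names are hypothetical; "at least two algae" is read here as two distinct algal organisms; and trees are simplified to rooted nested tuples, whereas clans are properly defined on unrooted trees.

```python
def orthogroups_with_algae(orthogroups, algae, min_algae=2):
    """Keep orthogroups that contain at least `min_algae` distinct algal organisms.

    `orthogroups` maps an orthogroup ID to a list of 'Organism|protein' IDs
    (a hypothetical naming scheme); `algae` is a set of organism names.
    """
    kept = {}
    for og_id, seq_ids in orthogroups.items():
        organisms = {seq_id.split("|", 1)[0] for seq_id in seq_ids}
        if len(organisms & algae) >= min_algae:
            kept[og_id] = seq_ids
    return kept


def plastid_only_subtrees(tree, plastid_bearing):
    """Return the maximal subtrees whose leaves all come from plastid-bearing organisms.

    `tree` is a rooted nested tuple with 'Organism|protein' strings as leaves.
    Each result is a sorted list of leaf IDs.
    """
    found = []

    def visit(node):
        # Returns the leaf list if the subtree is purely plastid-bearing, else None.
        if isinstance(node, str):
            organism = node.split("|", 1)[0]
            return [node] if organism in plastid_bearing else None
        results = [visit(child) for child in node]
        if all(r is not None for r in results):
            return [leaf for r in results for leaf in r]
        # Mixed node: every pure child subtree is maximal, so record it.
        for r in results:
            if r is not None:
                found.append(sorted(r))
        return None

    whole = visit(tree)
    if whole is not None:
        found.append(sorted(whole))
    return found


# Hypothetical example data.
algae = {"Chlamy", "Phaeo"}
groups = {
    "OG1": ["Chlamy|a", "Phaeo|d", "Homo|c"],
    "OG2": ["Chlamy|x", "Homo|y"],
}
selected = orthogroups_with_algae(groups, algae)

tree = ((("Chlamy|a", "Arabid|b"), "Homo|c"), ("Phaeo|d", ("Sacch|e", "Homo|f")))
subtrees = plastid_only_subtrees(tree, {"Chlamy", "Arabid", "Phaeo"})
```

In practice, the same logic would be applied to trees parsed from Newick files with a phylogenetics library, and every bipartition of the unrooted tree would need to be considered rather than only the clades of one arbitrary rooting.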

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8052839
DOI: http://dx.doi.org/10.1186/s13104-021-05553-4