Publications by authors named "Nadia El-Mabrouk"

Reconciling a non-binary gene tree with a binary species tree can be done efficiently in the absence of horizontal gene transfers, but becomes NP-hard in the presence of gene transfers. Here, we focus on the special case of endosymbiotic gene transfers (EGT), i.e.

A large variety of pairwise measures of similarity or dissimilarity have been developed for comparing phylogenetic trees, for example, species trees or gene trees. Due to its intuitive definition in terms of tree clades and bipartitions and its computational efficiency, the Robinson-Foulds (RF) distance is the most widely used for trees with unweighted edges and labels restricted to leaves (representing the genetic elements being compared). However, in the case of gene trees, an important piece of information revealing the nature of the homologous relation between gene pairs (orthologs, paralogs, and xenologs) is the type of event associated with each internal node of the tree, typically speciations or duplications, but other types of events may also be considered, such as horizontal gene transfers.
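
As an illustration of the clade-based definition mentioned above, here is a minimal sketch of the classical RF distance for rooted trees on the same leaf set. It assumes a hypothetical encoding of trees as nested tuples with string leaves, and it does not model the event labels (speciation, duplication, transfer) that the abstract goes on to discuss.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of leaf labels.

    Assumed encoding (illustrative only): a leaf is a str, an internal
    node is a tuple of subtrees.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    root = leaves(tree)
    found.discard(root)  # the root clade is shared by all trees on the same leaves
    return found


def rf_distance(t1, t2):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(t1) ^ clades(t2))


t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
print(rf_distance(t1, t2))  # → 2 (clade {a, b} vs. clade {a, c})
```

This set-based version is written for clarity; it is not the linear-time algorithm usually cited for RF.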

Article Synopsis
  • All mitochondria are believed to have originated from an ancient endosymbiotic event where an α-proteobacterium merged with a eukaryotic ancestor, leading to significant gene transfers from mitochondria to the nucleus and a reduction in mitochondrial genome size over time.
  • In some species, like land plants, there is ongoing Endosymbiotic Gene Transfer (EGT) which showcases a varied gene distribution between nuclear and mitochondrial genomes, providing insights into eukaryotic evolution.
  • The study introduces an algorithm, EndoRex, for tracing the evolution of gene families through duplication, loss, and EGT events, with implementations available on GitHub for further research.

Background: The Robinson-Foulds (RF) distance is a well-established measure between phylogenetic trees. Despite a lack of biological justification, it has the advantages of being a proper metric and being computable in linear time. For phylogenetic applications involving genes, however, a crucial aspect of the trees ignored by the RF metric is the type of the branching event (e.

Article Synopsis
  • The classical gene and species tree reconciliation typically assumes that gene families evolve independently, which is valid for distant genes but not for those in syntenic blocks that may evolve together.
  • The study introduces a challenge of deducing the history of segmental duplication and loss for neighboring genes, expanding the traditional Duplication-Loss reconciliation to accommodate multiple gene trees.
  • The research demonstrates that reconstructing a most parsimonious Super-Reconciliation is NP-hard, offers an exponential-time algorithm to solve it, and presents a polynomial-time algorithm for a rearrangement-inclusive model focused on minimizing duplication and loss events.
Article Synopsis
  • The study explores changes in the mitochondrial genetic code of green plants, focusing on the evolution of codon assignments, particularly in chlorophyte algae.
  • Researchers analyzed mtDNAs from 51 species, confirming known stop-to-sense reassignments and identifying new sense-to-sense reassignments, such as AGG codons being decoded as alanine instead of arginine.
  • Using the CoreTracker framework, the study also examines tRNA evolution to understand how these genetic code modifications occur through gene duplication and structural changes.
Article Synopsis
  • Several methods exist for gene tree reconstruction, but they often lose fit to sequence data during reconciliation with species trees.
  • GATC is a new flexible algorithm that reconstructs binary gene trees with branch lengths, combining tree likelihood and reconciliation scores to improve accuracy.
  • The algorithm has been tested on simulated and empirical datasets, demonstrating its effectiveness and efficiency compared to other leading methods, and is accessible on GitHub.
Article Synopsis
  • This text discusses using the supertree method to reconstruct gene trees from multiple input gene trees, focusing on partial data.
  • The authors develop algorithms to create and correct gene trees, emphasizing reconciliation costs and the preservation of gene relationships such as orthology and paralogy.
  • They demonstrate the effectiveness of these algorithms on Ensembl gene trees, especially in correcting duplication nodes that lack strong support, and provide the C++ source code for public access.

Motivation: Codon reassignments have been reported across all domains of life. With the increasing number of sequenced genomes, the development of systematic approaches for genetic code detection is essential for accurate downstream analyses. Three automated prediction tools exist so far: FACIL, GenDecoder and Bagheera; the last two respectively restricted to metazoan mitochondrial genomes and CUG reassignments in yeast nuclear genomes.

Background: Given a gene family, the relations between genes (orthology/paralogy) are represented by a relation graph, where edges connect pairs of orthologous genes and "missing" edges represent paralogs. While a gene tree directly induces a relation graph, the converse is not always true. Indeed, a relation graph is not necessarily "satisfiable", i.

Article Synopsis
  • Alkaloid accumulation in plants increases under stress, and specific types vary significantly among different plant groups, with Rauvolfioideae being notable for producing a variety of valuable monoterpenoid indole alkaloids (MIAs).
  • Researchers sequenced the genome of Rhazya stricta (a plant in the Apocynaceae family) to explore MIA pathway evolution and compare it with the genome of coffee to understand structural differences.
  • The findings revealed significant gene rearrangements in Rhazya and provided insights into gene family evolution, opening up potential new avenues for enhancing MIA production and discovering additional useful compounds.
Article Synopsis
  • Gene trees often struggle with weak support, but these issues can be improved by integrating information from species trees and genomic context.
  • The new method called ProfileNJ enhances existing gene trees by correcting weak branches using species tree data and distance matrices, making it efficient enough to analyze large databases like Ensembl Compara.
  • A web tool called RefineTree, which includes ProfileNJ and other gene tree correction methods, is now available for researchers to access and utilize for genomic analyses.

Background: While tree-oriented methods for inferring orthology and paralogy relations between genes are based on reconciling a gene tree with a species tree, many tree-free methods are also available (usually based on sequence similarity). Recently, the link between orthology relations and gene trees has been formally considered from the perspective of reconstructing phylogenies from orthology relations. In this paper, we consider this link from a correction point of view.

Combining a set of trees on partial datasets into a single tree is a classical method for inferring large phylogenetic trees. Ideally, the combined tree should display each input partial tree, which is only possible if input trees do not contain contradictory phylogenetic information. The simplest version of the supertree problem is thus to state whether a set of trees is compatible, and if so, construct a tree displaying them all.
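
To make the notion of "displaying" concrete, the following minimal sketch tests whether a rooted candidate supertree displays a rooted partial tree, by restricting the supertree's clades to the partial tree's leaves. The nested-tuple encoding and the function names are assumptions made for illustration; the sketch only checks a given candidate and does not construct a supertree.

```python
def all_clades(tree):
    """Return every clade (leaf set) of a rooted tree, including leaves and root.

    Assumed encoding (illustrative only): a leaf is a str, an internal
    node is a tuple of subtrees.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            clade = frozenset([node])
        else:
            clade = frozenset().union(*(walk(child) for child in node))
        found.add(clade)
        return clade

    walk(tree)
    return found


def displays(supertree, partial):
    """True if every clade of `partial` survives in `supertree` restricted to its leaves."""
    partial_clades = {c for c in all_clades(partial) if len(c) >= 2}
    keep = frozenset().union(*all_clades(partial))  # leaf set of the partial tree
    restricted = {c & keep for c in all_clades(supertree)}
    return partial_clades <= restricted


supertree = ((("a", "b"), "c"), ("d", "e"))
print(displays(supertree, (("a", "b"), "d")))  # → True
print(displays(supertree, (("a", "d"), "b")))  # → False
```

A set of input trees is compatible exactly when some supertree passes this check for all of them; finding such a tree is the construction problem the abstract refers to.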

Article Synopsis
  • OrthoAlign is a new algorithm designed to tackle the gene order alignment problem, specifically for orthologs, by considering key evolutionary events like duplications and rearrangements.
  • When applied to the tRNA gene repertoires of 50 bacteria from the Bacillus genus, it revealed that gene duplications and losses are more common than rearrangements.
  • The study found that the rates of gene duplications and losses in Bacillus are significantly lower than in E. coli, indicating strong selective pressures on the tRNA genes in this genus.

Background: A variety of methods based on sequence similarity, reconciliation, synteny or functional characteristics, can be used to infer orthology and paralogy relations between genes of a given gene family G. But is a given set C of orthology/paralogy constraints possible, i.e.

Article Synopsis
  • The text discusses a method for comparing gene orders as an alignment problem, integrating both rearrangement and modification of gene content in its evolutionary model.
  • A new heuristic using dynamic programming is introduced for determining the median of three genomes within a broader phylogenetic context.
  • The tool called multiOrthoAlign demonstrates high accuracy with both simulated and real data sets, while also improving the efficiency of DupLoCut, an almost exact algorithm that utilizes linear programming for similar problems.

Motivation: Large-scale methods for inferring gene trees are error-prone. Correcting gene trees for weakly supported features often results in non-binary trees, i.e.

Article Synopsis
  • The study highlights how traditional gene trees can sometimes conflict with evolutionary information like conserved synteny, which can indicate gene relationships.
  • The researchers developed polynomial-time algorithms to adjust these gene trees based on external orthology constraints, aiming to create a modified tree that remains as similar as possible to the original.
  • The validity of these modifications was confirmed through likelihood ratio calculations, showing that the adjusted trees often have comparable statistical validity to the original trees.

We present a comparative genomics approach for inferring ancestral genome organization and evolutionary scenarios, based on present-day genomes represented as ordered gene sequences with duplicates. We develop our methodology for a model of evolution restricted to duplication and loss, and then show how to extend it to other content-modifying operations, and to inversions. From a combinatorial point of view, the main consequence of ignoring rearrangements is the possibility of formulating the problem as an alignment problem.

Background: The "small phylogeny" problem consists in inferring ancestral genomes associated with each internal node of a phylogenetic tree of a set of extant species. Existing methods can be grouped into two main categories: the distance-based methods aiming at minimizing a total branch length, and the synteny-based (or mapping) methods that first predict a collection of relations between ancestral markers in terms of "synteny", and then assemble this collection into a set of Contiguous Ancestral Regions (CARs). The predicted CARs are likely to be more reliable as they are more directly deduced from observed conservations in extant species.

Article Synopsis
  • Reconciliation is a method used to understand how gene duplications and losses have occurred by comparing gene trees to species trees, but its effectiveness heavily depends on the accuracy of both trees.
  • This study reveals that even accurate gene trees can lead to incorrect duplication-loss histories due to underlying assumptions about the relationship between gene function and sequence constraints.
  • The authors propose a cautious theoretical framework for inferring these histories and demonstrate its application using the mammalian interleukin-1 gene tree as a case study.
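
The reconciliation idea summarized above can be sketched with the standard LCA mapping: each gene tree node is mapped to the smallest species tree clade containing the species of its leaves, and a node is called a duplication when it maps to the same clade as one of its children. This is a generic textbook sketch, not the cautious framework proposed by the authors; the nested-tuple encoding, the `species_of` dictionary, and the function names are assumptions, and the gene tree is assumed to be binary.

```python
def species_clades(tree):
    """List every clade (leaf set) of the species tree, leaves and root included."""
    out = []

    def walk(node):
        if isinstance(node, str):
            clade = frozenset([node])
        else:
            clade = frozenset().union(*(walk(child) for child in node))
        out.append(clade)
        return clade

    walk(tree)
    return out


def lca(clade_list, species):
    """Smallest species tree clade containing every species in `species`."""
    return min((c for c in clade_list if species <= c), key=len)


def label_events(gene_tree, species_tree, species_of):
    """Return (mapped clade, event) for each internal gene node, in post-order."""
    clade_list = species_clades(species_tree)
    events = []

    def walk(node):
        if isinstance(node, str):
            return lca(clade_list, frozenset([species_of[node]]))
        child_maps = [walk(child) for child in node]
        here = lca(clade_list, frozenset().union(*child_maps))
        # Same mapping as a child means both copies existed in one ancestral species.
        kind = "duplication" if here in child_maps else "speciation"
        events.append((here, kind))
        return here

    walk(gene_tree)
    return events


species_tree = (("A", "B"), "C")
gene_tree = ((("a1", "b1"), ("a2", "b2")), "c1")
species_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
kinds = [kind for _, kind in label_events(gene_tree, species_tree, species_of)]
print(kinds)  # → ['speciation', 'speciation', 'duplication', 'speciation']
```

As the synopsis stresses, the labels produced this way are only as reliable as the gene tree and species tree given as input.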
Article Synopsis
  • Reconciliation is a method used to infer the evolutionary history of gene families by mapping gene trees onto a known species tree, but it relies heavily on the accuracy of the gene trees involved.
  • The paper introduces a way to preprocess gene trees by identifying "non-apparent duplication" (NAD) vertices, which are indications of misplaced leaves affecting the reconciliation results.
  • The authors propose a polynomial-time heuristic for both species tree inference and reconciliation, aimed at minimizing the removal of species or leaves to ensure accurate gene tree analysis without NAD vertices.