58 results match your criteria: "Institute for Theoretical Informatics[Affiliation]"

Eleven grand challenges in single-cell data science.

Genome Biol

February 2020

Life Sciences and Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands.

Article Synopsis
  • Advances in microfluidics and low sequencing costs have revolutionized single-cell sequencing technology, allowing for the analysis of thousands to millions of cells in one experiment.
  • This rapid data generation presents unique challenges in data science, which the text identifies as central to the future of single-cell biology.
  • The article provides an overview of eleven key challenges, including motivating research questions and open problems, making it relevant for both experienced researchers and newcomers to the field.
Article Synopsis
  • Genesis is a library designed for handling phylogenetic data, while gappa is a command-line tool for performing common analyses on that data.
  • Both tools support various data types like phylogenetic trees, sequences, and taxonomies, providing ease of use and customization.
  • They are developed in C++11 and available for free under GPLv3, with additional resources accessible online.
Article Synopsis
  • Researchers created a faster and more memory-efficient version of the transfer bootstrap expectation (TBE) method for phylogenetic analysis, addressing limitations of the original, resource-heavy tool.
  • Their new implementation can be up to 480 times quicker and uses significantly less memory, making it better for large datasets.
  • This optimized TBE method has been integrated into existing tools and is available for public use under an open-source license.

Automated, phylogeny-based genotype delimitation of the Hepatitis Viruses HBV and HCV.

PeerJ

October 2019

Centre for Life's Origins and Evolution, Department of Genetics Evolution and Environment, University College London, University of London, London, United Kingdom.

Article Synopsis
  • The classification of hepatitis viruses is currently based on subjective criteria, making it difficult to adapt as new strains are discovered; this study explores using computational methods for better classification.
  • By applying molecular species delimitation techniques from biodiversity studies to hepatitis virus datasets (HBV and HCV), researchers found that the mPTP tool effectively identified taxonomic clusters aligned with known genotypes and subtypes.
  • The ABGD tool provided variable results based on different parameters but also showed potential for accurately classifying viruses, suggesting that automated methods can enhance hepatitis virus taxonomy.
Article Synopsis
  • The rise of molecular sequencing technology has drastically lowered costs, resulting in vast amounts of data needing scalable analysis methods, particularly in phylogenetic placement for metagenomic studies.
  • The authors present new scalable techniques for analyzing metagenomic samples, which include visualizing sample differences, clustering similar samples, and identifying influential phylogenetic factors.
  • In tests using nearly 10,000 samples and 168 million sequences, the new methods showed their effectiveness in uncovering significant biological insights.
Article Synopsis
  • Researchers developed a new model for amino acid sequence evolution that incorporates protein structure, which is often overlooked despite its importance.
  • This "structurally aware" model uses an expanded alphabet to describe amino acids along with their side-chain configurations, taking into account geometric patterns and dihedral angles.
  • The new model outperforms traditional models in estimating evolutionary divergence and reconstructing ancestral states, highlighting the significance of side-chain geometry for understanding protein folding and function in evolutionary biology.

Positive effects of aesthetically appreciated designs have been studied and confirmed since the 19th century: such designs are more enjoyable, their glitches are more readily forgiven, and they can increase users' performance. In the field of information visualization, studies of aesthetics are still a niche approach. In the current study, we aim to specifically understand which parameters in a visualization of node-link diagrams make them aesthetically pleasing, an important extension to already existing research on usability and readability aspects.

Article Synopsis
  • The study discusses challenges in metagenomic sequencing, particularly in analyzing evolutionary relationships of sequences, highlighting difficulties like manual reference selection and high computational costs.
  • The authors propose new algorithms to automate the construction of reference sequences, enable large-scale phylogenetic placements, and manage extensive sequence datasets effectively.
  • Their methods have been tested on real data, demonstrating significant improvements in processing speed and accuracy, and the tools are openly available for public use.

Two C++ libraries for counting trees on a phylogenetic terrace.

Bioinformatics

October 2018

Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany.

Motivation: The presence of terraces in phylogenetic tree space, i.e. a potentially large number of distinct tree topologies that have exactly the same analytical likelihood score, was first described by Sanderson et al.


Multicomponent reactions provide key molecules for secret communication.

Nat Commun

April 2018

Laboratory of Applied Chemistry, Institute of Organic Chemistry (IOC), Karlsruhe Institute of Technology (KIT), Straße am Forum 7, Karlsruhe, 76131, Germany.

A convenient and inherently more secure communication channel for encoding messages via specifically designed molecular keys is introduced by combining advanced encryption standard cryptography with molecular steganography. The necessary molecular keys require large structural diversity, thus suggesting the application of multicomponent reactions. Herein, the Ugi four-component reaction of perfluorinated acids is utilized to establish an exemplary database consisting of 130 commercially available components.

Article Synopsis
  • Creating a comprehensive and sustainable plant tree of life is becoming possible but faces challenges due to issues with current data integration and accessibility for non-experts.
  • Existing phylogenetic trees are often static and quickly outdated, highlighting the need for a collaborative and adaptable framework for integrating DNA data and conducting phylogenetic analyses.
  • The scientific community should focus on developing user-friendly interfaces for data access, regular updates of phylogenetic trees, and enhancing data quality through user feedback to achieve effective global phylogenetic synthesis.

Resolving complex phylogeographic patterns in the Balkan Peninsula using closely related wall-lizard species as a model system.

Mol Phylogenet Evol

August 2018

Natural History Museum of Crete, School of Sciences and Engineering, University of Crete, Knosos Avenue, Irakleio GR71409, Greece; Department of Biology, School of Sciences and Engineering, University of Crete, Vassilika Vouton, Irakleio GR70013, Greece.

Article Synopsis
  • Researchers studied wall-lizard species from the region to understand their evolutionary processes using various genetic analysis methods.
  • Their findings revealed distinct evolutionary lineages and showed that significant climatic and tectonic events influenced species divergence and genetic diversity.

MPBoot: fast phylogenetic maximum parsimony tree inference and bootstrap approximation.

BMC Evol Biol

February 2018

Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University Vienna, Campus Vienna Biocenter 5, A-1030, Vienna, Austria.

Background: The nonparametric bootstrap is widely used to measure the branch support of phylogenetic trees. However, bootstrapping is computationally expensive and remains a bottleneck in phylogenetic analyses. Recently, an ultrafast bootstrap approximation (UFBoot) approach was proposed for maximum likelihood analyses.

Article Synopsis
  • Research reveals that protist communities in Neotropical rainforests are exceptionally diverse and mainly consist of parasitic Apicomplexa, which could be influencing animal population control.
  • The study indicates that while high protist diversity exists, the limited presence of Oomycota does not significantly affect tree diversity in the forests.
  • Findings suggest that protists may actually be more diverse than arthropods in these ecosystems, highlighting their important role in tropical ecosystems traditionally thought to be dominated by larger organisms.

Background: sgnesR (Stochastic Gene Network Expression Simulator in R) is an R package that provides an interface to simulate gene expression data from a given gene network using the stochastic simulation algorithm (SSA). The package offers various options for delay parameters, which can easily be included in reactions for promoter delay, RNA delay and protein delay. A user can tune these parameters to model various types of reactions within a cell.


To characterize HIV-1 transmission dynamics in regions where the burden of HIV-1 is greatest, the 'Phylogenetics and Networks for Generalised HIV Epidemics in Africa' consortium (PANGEA-HIV) is sequencing full-genome viral isolates from across sub-Saharan Africa. We report the first 3,985 PANGEA-HIV consensus sequences from four cohort sites (Rakai Community Cohort Study, n=2,833; MRC/UVRI Uganda, n=701; Mochudi Prevention Project, n=359; Africa Health Research Institute Resistance Cohort, n=92). Next-generation sequencing success rates varied: more than 80% of the viral genome from the gag to the nef genes could be determined for all sequences from South Africa, 75% of sequences from Mochudi, 60% of sequences from MRC/UVRI Uganda, and 22% of sequences from Rakai.

Article Synopsis
  • Phylogenetic trees are used to visualize the evolutionary relationships of species, but incorrect interpretations can arise from how branch support values are represented.
  • The Newick tree format often stores these values as node labels instead of branch labels, leading to errors during tree rerooting.
  • A review of 20 tree visualization tools found that 70% do not allow users to select how node label semantics are interpreted, which can result in misleading results; improvements have been made to some tools as a result.

UniEuk: Time to Speak a Common Language in Protistology!

J Eukaryot Microbiol

May 2017

Sorbonne Universités UPMC Université Paris 06 & CNRS, UMR7144, Station Biologique de Roscoff, Place Georges Teissier, Roscoff, 29680, France.

Universal taxonomic frameworks have been critical tools to structure the fields of botany, zoology, mycology, and bacteriology as well as their large research communities. Animals, plants, and fungi have relatively solid, stable morpho-taxonomies built over the last three centuries, while bacteria have been classified for the last three decades under a coherent molecular taxonomic framework. By contrast, no such common language exists for microbial eukaryotes, even though environmental '-omics' surveys suggest that protists make up most of the organismal and genetic complexity of our planet's ecosystems! With the current deluge of eukaryotic meta-omics data, we urgently need to build up a universal eukaryotic taxonomy bridging the protist -omics age to the fragile, centuries-old body of classical knowledge that has effectively linked protist taxa to morphological, physiological, and ecological information.


Efficient Detection of Repeating Sites to Accelerate Phylogenetic Likelihood Calculations.

Syst Biol

March 2017

Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Schloss-Wolfsbrunnenweg 35, 69118 Heidelberg, Germany.

The phylogenetic likelihood function (PLF) is the major computational bottleneck in several applications of evolutionary biology such as phylogenetic inference, species delimitation, model selection, and divergence times estimation. Given the alignment, a tree and the evolutionary model parameters, the likelihood function computes the conditional likelihood vectors for every node of the tree. Vector entries for which all input data are identical result in redundant likelihood operations which, in turn, yield identical conditional values.
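The redundancy described above can be illustrated with a minimal Python sketch. It is not the algorithm from the article, only a naive illustration under stated assumptions: the alignment is a plain dictionary from taxon name to sequence, and the hypothetical function `site_repeat_classes` groups sites whose characters are identical across all taxa below a given node. Sites in the same group would yield the same conditional likelihood entries at that node, so only one of them needs to be computed.

```python
def site_repeat_classes(alignment, subtree_taxa):
    """Assign a class id to every alignment site.

    Two sites get the same id when their characters are identical for
    all taxa in subtree_taxa (the leaves below one inner node).
    alignment is a hypothetical dict: taxon name -> aligned sequence.
    """
    rows = [alignment[taxon] for taxon in subtree_taxa]
    class_of_pattern = {}  # column pattern -> class id
    classes = []
    for column in zip(*rows):
        # A new pattern receives the next unused id; a repeat reuses its id.
        classes.append(class_of_pattern.setdefault(column, len(class_of_pattern)))
    return classes


alignment = {"A": "ACAC", "B": "GTGT", "C": "AAAT"}
print(site_repeat_classes(alignment, ["A", "B"]))
print(site_repeat_classes(alignment, ["A", "B", "C"]))
```

Note that the number of repeats depends on the subtree: the more taxa a node covers, the fewer sites tend to share a pattern.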


VSEARCH: a versatile open source tool for metagenomics.

PeerJ

October 2016

Department of Ecology, University of Kaiserslautern, Kaiserslautern, Germany; UMR LSTM, CIRAD, Montpellier, France.

Background: VSEARCH is an open source and free of charge multithreaded 64-bit tool for processing and preparing metagenomics, genomics and population genomics nucleotide sequence data. It is designed as an alternative to the widely used USEARCH tool (Edgar, 2010) for which the source code is not publicly available, algorithm details are only rudimentarily described, and only a memory-confined 32-bit version is freely available for academic use.

Methods: When searching nucleotide sequences, VSEARCH uses a fast heuristic based on words shared by the query and target sequences in order to quickly identify similar sequences; a similar strategy is probably used in USEARCH.
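The idea of a shared-word heuristic can be sketched in a few lines of Python. This is only an illustration of the general technique, not VSEARCH's implementation: the function names `kmers` and `rank_targets`, the word length `k`, and the scoring by number of distinct shared words are all assumptions made for the example.

```python
from collections import Counter


def kmers(seq, k=8):
    """Return the set of distinct words of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_targets(query, targets, k=8, top=3):
    """Rank targets by the number of distinct words shared with the query.

    targets is a hypothetical dict: sequence name -> nucleotide string.
    Only the best-ranked candidates would then be passed on to a slower,
    exact alignment step.
    """
    query_words = kmers(query, k)
    shared = Counter()
    for name, seq in targets.items():
        shared[name] = len(query_words & kmers(seq, k))
    return shared.most_common(top)


targets = {"a": "ACGTACGTTGCA", "b": "TTTTTTTTTTTT"}
print(rank_targets("ACGTACGTTGCA", targets, k=4))
```

Counting shared words is much cheaper than aligning every pair, which is why such a filter is typically applied before any full alignment.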


The Trichoptera barcode initiative: a strategy for generating a species-level Tree of Life.

Philos Trans R Soc Lond B Biol Sci

September 2016

Department of Entomology and Nematology, University of California-Davis, 1282 Academic Surge, Davis, CA 95616, USA.

DNA barcoding was intended as a means to provide species-level identifications through associating DNA sequences from unknown specimens to those from curated reference specimens. Although barcodes were not designed for phylogenetics, they can be beneficial to the completion of the Tree of Life. The barcode database for Trichoptera is relatively comprehensive, with data from every family, approximately two-thirds of the genera, and one-third of the described species.


Phylogeny-aware identification and correction of taxonomically mislabeled sequences.

Nucleic Acids Res

June 2016

The Exelixis Lab, Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Schloss-Wolfsbrunnenweg 35, 69118 Heidelberg, Germany; Karlsruhe Institute of Technology, Institute for Theoretical Informatics, Postfach 6980, 76128 Karlsruhe, Germany.

Article Synopsis
  • Public databases often contain molecular sequences annotated by the original authors, leading to potential mislabeling and errors that are difficult to detect, which can negatively impact metagenetic studies.
  • The research introduces SATIVA, a method that uses phylogenetic analysis to automatically identify and correct taxonomically mislabeled sequences, achieving high accuracy rates in both identification and correction.
  • Analysis of popular microbial reference databases reveals a significant presence of mislabels, ranging from 0.2% to 2.5%, and SATIVA provides a tool for exploring better taxonomic classifications, specifically for Cyanobacteria.

Background: It is generally acknowledged that a functional understanding of a biological system can only be obtained by understanding the collective of molecular interactions in the form of biological networks. Protein networks are one particular network type of special importance, because proteins form the functional base units of every biological cell. On a mesoscopic level of protein networks, modules are of significant importance because these building blocks may be the next elementary functional level above individual proteins, offering insight into fundamental organizational principles of biological cells.


Computing the Internode Certainty and Related Measures from Partial Gene Trees.

Mol Biol Evol

June 2016

Heidelberg Institute for Theoretical Studies, Heidelberg, Germany; Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Postfach 6980, Karlsruhe, 76128, Germany.

We present, implement, and evaluate an approach to calculate the internode certainty (IC) and tree certainty (TC) on a given reference tree from a collection of partial gene trees. Previously, the calculation of these values was only possible from a collection of gene trees with exactly the same taxon set as the reference tree. An application to sets of partial gene trees requires mathematical corrections in the IC and TC calculations.
