Clustal Omega is a version, completely rewritten and revised in 2011, of the widely used Clustal series of programs for multiple sequence alignment. It can deal with very large numbers (many tens of thousands) of DNA/RNA or protein sequences due to its use of the mBed algorithm for calculating guide-trees. This algorithm allows very large alignment problems to be tackled very quickly, even on personal computers.
Motivation: Secondary structure prediction accuracy (SSPA) in the QuanTest benchmark can be used to measure the accuracy of a multiple sequence alignment. SSPA correlates well with the sum-of-pairs score if the results are averaged over many alignments, but not on an alignment-by-alignment basis. This is due to a sub-optimal selection of reference and non-reference sequences in QuanTest.
Clustal Omega is a widely used package for carrying out multiple sequence alignment. Here, we describe some recent additions to the package and benchmark some alternative ways of making alignments. These benchmarks are based on protein structure comparisons or predictions and include a recently described method based on secondary structure prediction.
We present a simple all-solid-state laser source emitting 2.4 W of single-frequency light at 671 nm for laser cooling of lithium atoms. It is based on a diode-pumped solid-state laser, which is frequency doubled in a ppZnO:LN ridge waveguide with an internal doubling efficiency of 54%.
Motivation: Multiple sequence alignment (MSA) is commonly used to analyze sets of homologous protein or DNA sequences. This has led to the development of many methods and packages for MSA over the past 30 years. Being able to compare different methods has been problematic and has relied on gold standard benchmark datasets of 'true' alignments or on MSA simulations.
Motivation: Multiple sequence alignments (MSAs) with large numbers of sequences are now commonplace. However, current multiple alignment benchmarks are ill-suited for testing these types of alignments, as test cases either contain a very small number of sequences or are based purely on simulation rather than empirical data.
Results: We take advantage of recent developments in protein structure prediction methods to create a benchmark (ContTest) for protein MSAs containing many thousands of sequences in each test case and which is based on empirical biological data.
Background: Progressive alignment is the standard approach used to align large numbers of sequences. As with all heuristics, this involves a tradeoff between alignment accuracy and computation time.
Results: We examine this tradeoff and find that, because of a loss of information in the early steps of the approach, the alignments generated by the most common multiple sequence alignment programs are inherently unstable, and simply reversing the order of the sequences in the input file will cause a different alignment to be generated.
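The input-order test described above can be set up by writing a reversed copy of a FASTA file and aligning both versions with the same program. The following is a minimal Python sketch, assuming plain FASTA input; `read_fasta` and `reverse_fasta_order` are hypothetical helper names for illustration and are not part of any alignment package.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def reverse_fasta_order(text):
    """Return FASTA text holding the same records in reversed order."""
    records = read_fasta(text)
    return "".join(f">{h}\n{s}\n" for h, s in reversed(records))
```

Aligning the original and the reversed file and comparing the two outputs then shows whether a given program is sensitive to input order.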
In this clinical trial, we investigated the blood glucose (BG)-lowering effects of 30, 60 and 90 mg dextromethorphan (DXM) as well as 100 mg sitagliptin alone versus combinations of DXM and sitagliptin during an oral glucose tolerance test (OGTT) in 20 men with T2DM. The combination of 60 mg DXM plus 100 mg sitagliptin was observed to have the strongest effect in the OGTT. It lowered maximum BG concentrations and increased the baseline-adjusted area under the curve for serum insulin concentrations in the first 30 min of the OGTT (mean ± standard deviation 240 ± 47 mg/dl and 8.
Background: Multiple sequence alignments (MSA) are widely used in sequence analysis for a variety of tasks. Outlier sequences can make downstream analyses unreliable or make the alignments less accurate while they are being constructed. This paper describes a simple method for automatically detecting outliers and accompanying software called OD-seq.
Clustal Omega is a package for making multiple sequence alignments of amino acid or nucleotide sequences, quickly and accurately. It is a complete upgrade and rewrite of earlier Clustal programs. This unit describes how to run Clustal Omega interactively from a command line, although it can also be run online from several sites.
BMC Bioinformatics
October 2014
Background: Guide-trees are used as part of an essential heuristic to enable the calculation of multiple sequence alignments. They have been the focus of much method development, but there has been little effort at determining systematically which guide-trees, if any, give the best alignments. Some guide-tree construction schemes are based on pair-wise distances amongst unaligned sequences.
Proc Natl Acad Sci U S A
July 2014
Guide trees are used to decide the order of sequence alignment in the progressive multiple sequence alignment heuristic. These guide trees are often the limiting factor in making large alignments, and considerable effort has been expended over the years in making these quickly or accurately. In this article we show that, at least for protein families with large numbers of sequences that can be benchmarked with known structures, simple chained guide trees give the most accurate alignments.
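A chained guide tree is fully imbalanced: the first two sequences are joined, and every later sequence is added one at a time to the growing group. The following is a minimal Python sketch of that shape written out as a Newick string. The function name `chained_guide_tree` is hypothetical, and the sketch omits branch lengths; it illustrates the tree topology only and is not the procedure used in the article.

```python
def chained_guide_tree(names):
    """Build a fully imbalanced (chained) guide tree as a Newick string."""
    if not names:
        raise ValueError("need at least one sequence name")
    tree = names[0]
    # Each further sequence is joined to everything accumulated so far.
    for name in names[1:]:
        tree = f"({tree},{name})"
    return tree + ";"


print(chained_guide_tree(["A", "B", "C", "D"]))  # (((A,B),C),D);
```

In a progressive alignment driven by such a tree, each step adds a single sequence to the alignment built so far, so the order of `names` fixes the order of alignment.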
Clustal Omega is a completely rewritten and revised version of the widely used Clustal series of programs for multiple sequence alignment. It can deal with very large numbers (many tens of thousands) of DNA/RNA or protein sequences due to its use of the mBED algorithm for calculating guide trees. This algorithm allows very large alignment problems to be tackled very quickly, even on personal computers.
We present an all-solid-state laser source emitting up to 2.1 W of single-frequency light at 671 nm developed for laser cooling of lithium atoms. It is based on a diode-pumped, neodymium-doped orthovanadate (Nd:YVO₄) ring laser operating at 1342 nm.
Motivation: Recent developments in sequence alignment software have made possible multiple sequence alignments (MSAs) of >100 000 sequences in reasonable times. At present, there are no systematic analyses concerning the scalability of the alignment quality as the number of aligned sequences is increased.
Results: We benchmarked a wide range of widely used MSA packages using a selection of protein families with some known structures and found that the accuracy of such alignments decreases markedly as the number of sequences grows.
Multiple sequence alignments are fundamental to many sequence analysis methods. Most alignments are computed using the progressive alignment heuristic. These methods are starting to become a bottleneck in some analysis pipelines when faced with data sets of the size of many thousands of sequences.
Background: More and more nucleotide sequences of type A influenza virus are available in public databases. Although these sequences have been the focus of many molecular epidemiological and phylogenetic analyses, most studies only deal with a few representative sequences. In this paper, we present a complete analysis of all Haemagglutinin (HA) and Neuraminidase (NA) gene sequences available to allow large scale analyses of the evolution and epidemiology of type A influenza.
Background: The most widely used multiple sequence alignment methods require sequences to be clustered as an initial step. Most sequence clustering methods require a full distance matrix to be computed between all pairs of sequences. This requires memory and time proportional to N² for N sequences.
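To make the quadratic growth concrete, the following minimal Python sketch counts the distinct sequence pairs and estimates the memory for a full square distance matrix. The function names are hypothetical, and the 8 bytes per stored distance is an assumption (one double-precision value), not a figure from the paper.

```python
def pairwise_distance_count(n):
    """Number of distinct unordered pairs among n sequences: n(n-1)/2."""
    return n * (n - 1) // 2


def full_matrix_bytes(n, bytes_per_value=8):
    """Memory for a full n x n matrix; bytes_per_value is an assumed size."""
    return n * n * bytes_per_value


for n in (1_000, 10_000, 100_000):
    pairs = pairwise_distance_count(n)
    gigabytes = full_matrix_bytes(n) / 1e9
    print(f"N={n}: {pairs} pairs, about {gigabytes:g} GB")
```

Under these assumptions, a tenfold increase in N multiplies both the pair count and the matrix size by roughly one hundred.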
BMC Bioinformatics
December 2004
Background: Increasingly researchers are turning to the use of haplotype analysis as a tool in population studies, the investigation of linkage disequilibrium, and candidate gene analysis. When the phase of the data is unknown, computational methods, in particular those employing the Expectation-Maximisation (EM) algorithm, are frequently used for estimating the phase and frequency of the underlying haplotypes. These methods have proved very successful, predicting the phase-known frequencies from data for which the phase is unknown with a high degree of accuracy.
Between 1981 and 1984, 1063 patients (average age 32 years) were examined because of childlessness in marriage. The following parameters were evaluated in a first and second investigation: amount of ejaculate (ml), number of spermatozoa (mill./ml), motility (percentage of highly motile, motile, and non-motile spermatozoa immediately, 30, and 240 minutes after ejaculation), morphology (percentage of normally shaped spermatozoa), pH value, and fructose concentration (micrograms/ml), with special interest in the abstinence period.