Sequence similarity search is a fundamental way of analyzing nucleotide sequences. Despite decades of research, this is not a solved problem because there exist many similarities that are not found by current methods. Search methods are typically based on a seed-and-extend approach, which has many variants (e.g. spaced seeds, transition seeds), and it remains unclear how to optimize this approach. This study designs and tests seeding methods for inter-mammal and inter-insect genome comparison. By considering substitution patterns of real genomes, we design sets of multiple complementary transition seeds, which have better performance (sensitivity per run time) than previous seeding strategies. Often the best seed patterns have more transition positions than those used previously. We also point out that recent computer memory sizes (e.g. 60 GB) make it feasible to use multiple (e.g. eight) seeds for whole mammal genomes. Interestingly, the most sensitive settings achieve diminishing returns for human-dog and melanogaster-pseudoobscura comparisons, but not for human-mouse, which suggests that we still miss many human-mouse alignments. Our optimized heuristics find ∼20,000 new human-mouse alignments that are missing from the standard UCSC alignments. We tabulate seed patterns and parameters that work well so they can be used in future research.
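As a rough illustration of what a transition seed is, the sketch below tests whether two equal-length windows satisfy a seed pattern, then scans two short sequences for hits that a seed-and-extend search would go on to extend. The pattern notation (`1` = identical bases, `T` = identical or a transition, `0` = any pair), the function names, and the naive all-against-all scan are assumptions made for this example; they are not taken from the study, which a real aligner would replace with an index over the genome.

```python
# Purines (A, G) and pyrimidines (C, T); a transition swaps bases within a class.
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def is_transition(x: str, y: str) -> bool:
    """True if x and y differ but are both purines or both pyrimidines."""
    pair = {x, y}
    return x != y and (pair <= PURINES or pair <= PYRIMIDINES)


def seed_matches(a: str, b: str, pattern: str) -> bool:
    """Check whether equal-length windows a and b satisfy a seed pattern.

    Pattern symbols (an assumed notation for this sketch):
      '1' = bases must be identical
      'T' = bases must be identical or a transition
      '0' = any pair of bases is accepted
    """
    if not (len(a) == len(b) == len(pattern)):
        raise ValueError("windows and pattern must have equal length")
    for x, y, symbol in zip(a, b, pattern):
        if symbol not in "1T0":
            raise ValueError(f"unknown pattern symbol: {symbol!r}")
        if symbol == "1" and x != y:
            return False
        if symbol == "T" and x != y and not is_transition(x, y):
            return False
    return True


def find_seed_hits(query: str, target: str, pattern: str) -> list[tuple[int, int]]:
    """Return (query_start, target_start) pairs where the pattern matches.

    Naive O(len(query) * len(target)) scan, for illustration only.
    """
    k = len(pattern)
    return [
        (i, j)
        for i in range(len(query) - k + 1)
        for j in range(len(target) - k + 1)
        if seed_matches(query[i:i + k], target[j:j + k], pattern)
    ]
```

Using several complementary patterns amounts to taking the union of the hits found by each one, which is why more seeds cost more memory and run time but can recover similarities that any single pattern misses.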
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3985675 | PMC |
http://dx.doi.org/10.1093/nar/gku104 | DOI Listing |
Constraints
November 2024
Polytechnique Montréal, Montreal, Canada.
Constraint programming is known for being an efficient approach to solving combinatorial problems. Important design choices in a solver are the heuristics, designed to lead the search to the best solutions in a minimum amount of time. However, developing these heuristics is a time-consuming process that requires problem-specific expertise.
Sensors (Basel)
December 2024
School of Mechanical Electrical and Information Engineering, Shandong University, Weihai 264209, China.
Feature selection (FS) is a key process in many pattern-recognition tasks; it reduces dimensionality by eliminating redundant or irrelevant features. However, for complex high-dimensional problems, traditional FS methods cannot find the ideal feature combination. To overcome this disadvantage, this paper presents a multispiral whale optimization algorithm (MSWOA) for feature selection.
Medicina (Kaunas)
December 2024
Cooperative Studies Program Coordinating Center, VA Boston, Lafayette City Center, 2 Avenue de Lafayette, Boston, MA 02111, USA.
This work represents a significant contribution to understanding the importance of appropriately rounding numbers with minimal error, that is, reducing inexact rounding and data truncation error while eliminating unintentionally misleading findings in epidemiological studies. The rounding of numbers represents a compromise that attempts to balance the loss of information from reporting too few significant digits against retaining more digits than necessary.
The task of RNA design given a target structure aims to find a sequence that can fold into that structure. It is a computationally hard problem, some versions of which have been proven to be NP-hard. As a result, heuristic methods such as local search have been popular for this task, but they explore only a fixed number of candidates.
Phys Rev E
November 2024
Department of Mathematics, University of Utah, Salt Lake City, Utah 84112, USA.
The imperfect narrow escape problem considers the mean first passage time (MFPT) of a Brownian particle through one of several small, partially reactive traps on an otherwise reflecting boundary within a confining domain. Mathematically, this problem is equivalent to Poisson's equation with mixed Neumann-Robin boundary conditions. Here, we obtain this MFPT in general three-dimensional domains by using strong localized perturbation theory in the small trap limit.