Motivation: RNA-Seq experiments have revealed a multitude of novel ncRNAs. The gold standard for their analysis, based on simultaneous alignment and folding, suffers from an extreme time complexity of [Formula: see text]. Subsequently, numerous faster 'Sankoff-style' approaches have been suggested. Commonly, the performance of such methods relies on sequence-based heuristics that restrict the search space to optimal or near-optimal sequence alignments; however, the accuracy of sequence-based methods breaks down for RNAs with sequence identities below 60%. Alignment approaches like LocARNA that do not require sequence-based heuristics have been limited to high complexity ([Formula: see text] quartic time).
Results: Breaking this barrier, we introduce the novel Sankoff-style algorithm 'sparsified prediction and alignment of RNAs based on their structure ensembles (SPARSE)', which runs in quadratic time without sequence-based heuristics. To achieve this low complexity, on par with sequence alignment algorithms, SPARSE features strong sparsification based on structural properties of the RNA ensembles. Following PMcomp, SPARSE gains further speed-up from lightweight energy computation. Whereas all existing lightweight Sankoff-style methods restrict Sankoff's original model by disallowing loop deletions and insertions, SPARSE is the first to transfer the Sankoff algorithm completely to the lightweight energy model. Compared with LocARNA, SPARSE achieves similar alignment quality and better folding quality in significantly less time (speed-up: 3.7). At similar run-time, it aligns low sequence identity instances substantially more accurately than RAF, which uses sequence-based heuristics.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4514930 | PMC |
| http://dx.doi.org/10.1093/bioinformatics/btv185 | DOI Listing |