Background: Single-cell RNA sequencing (scRNA-seq) is now essential for cellular-level gene expression studies and deciphering complex gene regulatory mechanisms. Deep learning methods, when combined with scRNA-seq technology, transform gene regulation research into graph link prediction tasks. However, these methods struggle to mitigate the impact of noisy data in gene regulatory networks (GRNs) and address the significant imbalance between positive and negative links.
Motivation: Circular RNAs (circRNAs) have been identified as key players in the progression of several diseases; however, their roles have not yet been determined because of the high financial burden of biological studies. This highlights the urgent need to develop efficient computational models that can predict circRNA-disease associations, offering an alternative approach to overcome the limitations of expensive experimental studies. Although multi-view learning methods have been widely adopted, most approaches fail to fully exploit the latent information across views and overlook the fact that different views differ in importance.
IEEE J Biomed Health Inform
February 2025
Intrinsically disordered regions (IDRs) of proteins are crucial for a wide range of biological functions, with molecular recognition features (MoRFs) being of particular significance in protein interactions and cellular regulation. However, the identification of MoRFs has been a significant challenge in computational biology owing to their disorder-to-order transition properties. Currently, only a limited number of experimentally validated MoRFs are known, which has prompted the development of computational methods for predicting MoRFs from protein chains.
IEEE Trans Neural Netw Learn Syst
December 2024
In this work, we propose MEDICO, a multiview deep generative model for molecule generation, structural optimization, and SARS-CoV-2 inhibitor discovery. To the best of our knowledge, MEDICO is the first-of-its-kind graph generative model that can generate molecular graphs similar to the structure of targeted molecules, with a multiview representation learning framework to sufficiently and adaptively learn comprehensive structural semantics from targeted molecular topology and geometry. We show that MEDICO significantly outperforms the state-of-the-art methods in generating valid, novel, and unique molecules under benchmarking comparisons, particularly achieving ~85% improvement compared with the state-of-the-art methods in terms of validity.
Prediction of drug-target interactions (DTIs) is one of the crucial steps for drug repositioning. Identifying DTIs through biological experiments is expensive and time-consuming. Recently, deep learning-based approaches have shown promising advancements in DTI prediction, but they face two notable challenges: (i) how to explicitly capture local interactions between drug-target pairs and learn their higher-order substructure embeddings; (ii) how to filter out redundant information to obtain effective embeddings for drugs and targets.
IEEE Trans Med Imaging
November 2024
Medical question answering aims to enhance diagnostic support, improve patient education, and assist in clinical decision-making by automatically answering medical-related queries, which is an important foundation for realizing intelligent healthcare. Existing methods predominantly focus on extracting key information from a single data source, e.g.
Brief Funct Genomics
January 2025
Deep learning models have made significant progress in the biomedical field, particularly in the prediction of drug-drug interactions (DDIs). DDIs are pharmacodynamic reactions between two or more drugs in the body, which may lead to adverse effects and are of great significance for drug development and clinical research. However, predicting DDI through traditional clinical trials and experiments is not only costly but also time-consuming.
Pairwise sequence alignment (PSA) serves as the cornerstone in computational bioinformatics, facilitating multiple sequence alignment and phylogenetic analysis. This paper introduces the FORAlign algorithm, which leverages the Four Russians algorithm, with the same upper-bound time and space complexity as the Hirschberg divide-and-conquer PSA algorithm, to accelerate the Hirschberg PSA algorithm in parallel. Particularly notable is its capability to achieve up to 16.
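As a point of reference for the Hirschberg approach mentioned above, the following is a minimal sketch of its basic building block: computing the last row of a global alignment score matrix while keeping only two rows in memory. It is not the FORAlign implementation and does not use the Four Russians technique; the function name `nw_last_row`, the linear gap model, and the scoring values are assumptions chosen for illustration.

```python
def nw_last_row(a, b, match=1, mismatch=-1, gap=-1):
    """Return the last row of the global alignment score matrix for a vs b.

    Only two rows are kept in memory, which is the linear-space trick
    that Hirschberg's divide-and-conquer alignment builds on.
    Scoring values are illustrative assumptions, not taken from the paper.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of `b`.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr[j] = max(diag, up, left)
        prev = curr
    return prev


if __name__ == "__main__":
    # The final entry is the optimal global alignment score.
    print(nw_last_row("GATTACA", "GATTACA")[-1])
```

Hirschberg's method calls a routine like this on the first half of one sequence and on the reversed second half to find where the optimal path crosses the middle row, then recurses on the two sub-problems.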
Motivation: Cancer affects millions globally, and as research advances, our understanding and treatment of cancer evolve. Compared to conventional treatments with significant side effects, anticancer peptides (ACPs) have gained considerable attention. Validating ACPs through wet-lab experiments is time-consuming and costly.
Accurate identification of bitter peptides is essential for research. Although models using sequence information have evolved in the context of bitter peptides, there is still room for improvement in their predictive performance. In the present study, we introduced a novel predictive tool, iBitter-GRE, designed to improve the accuracy of bitter peptide identification.
The release of the first draft of the human pangenome has revolutionized genomic research by enabling access to complex regions like centromeres, composed of extra-long tandem repeats (ETRs). However, a significant gap remains as current methodologies are inadequate for producing sequence alignments that effectively capture genetic events within ETRs, highlighting a pressing need for improved alignment tools. Inspired by UniAligner, we develop Rare Match Aligner (RaMA), which uses rare matches as anchors and a 2-piece affine gap cost to generate complete pairwise alignments that better capture genetic evolution.
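To make the 2-piece affine gap cost mentioned above concrete, here is a minimal sketch of how such a cost is commonly defined: the gap is charged the cheaper of two affine functions, so short gaps pay a low opening penalty while long gaps pay a low per-base extension penalty. The function name and all parameter values are illustrative assumptions and are not taken from RaMA; conventions also differ on whether the first gap position is charged an extension.

```python
def two_piece_affine_gap_cost(length, open1=4, ext1=2, open2=24, ext2=1):
    """Cost of a gap of `length` positions under a 2-piece affine model.

    Assumed convention: cost = min(open1 + ext1 * length,
                                   open2 + ext2 * length).
    Parameter values are illustrative, not RaMA's settings.
    """
    if length <= 0:
        return 0
    short_piece = open1 + ext1 * length  # cheap to open, costly to extend
    long_piece = open2 + ext2 * length   # costly to open, cheap to extend
    return min(short_piece, long_piece)


if __name__ == "__main__":
    for k in (1, 10, 20, 30, 100):
        print(k, two_piece_affine_gap_cost(k))
```

With these assumed values the two pieces cross at a gap length of 20, so longer insertions and deletions, which are frequent in tandem repeat regions, are penalized less steeply than under a single affine function.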
Subcellular localization is crucial for understanding the functions and regulatory mechanisms of biomolecules. Long non-coding RNAs (lncRNAs) have diverse roles in cellular processes, and their localization within specific subcellular compartments provides insights into their biological functions and implications in health and disease. The nucleolus and nucleoplasm are key hubs for RNA metabolism and cellular regulation.
Microorganisms
December 2024
Zokors are a group of subterranean rodents that are adapted to underground life and feed on plant roots. Here, we investigated the intestinal microbes of five zokor species using 16S amplicon technology combined with bioinformatics. Microbial composition analysis showed similar intestinal microbes but different proportions among the five zokor species, and their dominant bacteria corresponded to those of herbivores.
Brief Bioinform
November 2024
The identification of potential effective drug candidates is a fundamental step in new drug discovery, with profound implications for pharmaceutical research and the healthcare sector. While many computational methods have been developed for such predictions and have yielded promising results, two challenges persist: (i) the cold-start problem of new drugs, which increases the difficulty of prediction due to the lack of historical data or prior knowledge; and (ii) the vastness of the compound search space for potential drug candidates.
Clathrin proteins, key elements of the vesicle coat, play a crucial role in various cellular processes, including neural function, signal transduction, and endocytosis. Disruptions in clathrin protein functions have been associated with a wide range of diseases, such as Alzheimer's, neurodegeneration, viral infection, and cancer. Therefore, correctly identifying clathrin protein functions is critical to unraveling the mechanisms of these fatal diseases and designing drug targets.
The emergence of the "Protein Corona" is a pivotal concept in bioinformatics and nanotechnology, crucial for understanding nanomedicine delivery and nanoparticle-biological entity interactions. After entering a biological fluid, such as blood, nanoparticles (NPs, such as nanomedical carriers) are quickly coated with proteins, forming a protein interface layer called the protein corona. An in-depth investigation into the protein corona is essential for elucidating the biological ramifications of NPs and their prospective applications within the medical field and beyond.
Motivation: Accurately predicting the degradation capabilities of proteolysis-targeting chimeras (PROTACs) for given target proteins and E3 ligases is important for PROTAC design. The distinctive ternary structure of PROTACs presents a challenge to traditional drug-target interaction prediction methods, necessitating more innovative approaches. While current state-of-the-art (SOTA) methods using graph neural networks (GNNs) can discern the molecular structure of PROTACs and proteins, thus enabling the efficient prediction of PROTACs' degradation capabilities, they rely heavily on limited crystal structure data of the POI-PROTAC-E3 ternary complex.
Brief Bioinform
November 2024
The burgeoning accumulation of large-scale biomedical data in oncology, alongside significant strides in deep learning (DL) technologies, has established multimodal DL (MDL) as a cornerstone of precision oncology. This review provides an overview of MDL applications in this field, based on an extensive literature survey. In total, 651 articles published before September 2024 are included.
The study of plant genomics has significantly deepened our understanding of plant evolution and adaptation from a microscopic perspective [...
Advancements in spatial transcriptomics (ST) technology have enabled the analysis of gene expression while preserving cellular spatial information, greatly enhancing our understanding of cellular interactions within tissues. Accurate identification of spatial domains is crucial for comprehending tissue organization. However, the effective integration of spatial location and gene expression still faces significant challenges.
Ensuring accurate multiple sequence alignment (MSA) is essential for comprehensive biological sequence analysis. However, the complexity of evolutionary relationships often results in variations that generic alignment tools may not adequately address. Realignment is crucial to remedy this issue.
Single-cell multi-omics refers to the various types of biological data collected at the single-cell level. These data have provided high-resolution insight into cellular phenotypes, biological processes, and developmental stages. Current advances hold high potential for breakthroughs by integrating multiple different omics layers.
Inflammation is a biological response to harmful stimuli, playing a crucial role in facilitating tissue repair by eradicating pathogenic microorganisms. However, when inflammation becomes chronic, it leads to numerous serious disorders, particularly in autoimmune diseases. Anti-inflammatory peptides (AIPs) have emerged as promising therapeutic agents due to their high specificity, potency, and low toxicity.