Bioinformatics
December 2024
Motivation: Isoforms spliced from the same gene may carry distinct biological functions. Therefore, annotating functions at the isoform level provides valuable insights into the functional diversity of genomes. Since experimental approaches for determining isoform functions are time- and cost-demanding, computational methods have been proposed.
Alzheimer's disease (AD) is a complex disease whose genetic etiology is not fully understood. Gene network-based methods have proven promising in predicting AD genes. However, existing approaches are limited in their ability to model the nonlinear relationship between networks and disease genes, because (i) any data can be theoretically decomposed into the sum of a linear part and a nonlinear part, (ii) the linear part can be best modeled by a linear model since a nonlinear model is biased and can be easily overfit, and (iii) existing methods do not separate the linear part from the nonlinear part when building the disease gene prediction model.
Single-cell RNA sequencing (scRNA-seq) technologies have become essential tools for characterizing cellular landscapes within complex tissues. Large-scale single-cell transcriptomics holds great potential for identifying rare cell types critical to the pathogenesis of diseases and biological processes. Existing methods for identifying rare cell types often rely on one-time clustering using partial or global gene expression.
Motivation: Identifying cancer genes remains a significant challenge in cancer genomics research. Annotated gene sets encode functional associations among multiple genes, and cancer genes have been shown to cluster in hallmark signaling pathways and biological processes. The knowledge of annotated gene sets is critical for discovering cancer genes but remains to be fully exploited.
Objectives: Idiopathic inflammatory myopathies (IIMs) are a group of heterogeneous autoimmune diseases. Intron retention (IR) serves as an important post-transcriptional and translational regulatory mechanism. This study aims to identify changes in IR profiles in IIM subtypes, investigating their influence on proteins and their correlations with clinical features.
Curr Opin Struct Biol
February 2024
Drug response prediction is essential for drug development and disease treatment. One key question in predicting drug response is the representation of molecules, which has been greatly advanced by artificial intelligence (AI) techniques in recent years. In this review, we first describe different types of representation methods, pinpointing their key principles and discussing their limitations.
IEEE/ACM Trans Comput Biol Bioinform
February 2024
Graph learning methods have achieved noteworthy performance in disease diagnosis due to their ability to represent unstructured information such as inter-subject relationships. While it has been shown that imaging, genetic and clinical data are crucial for degenerative disease diagnosis, existing methods rarely consider how best to use their relationships. How best to utilize information from imaging, genetic and clinical data remains a challenging problem.
Single-cell clustering is a critical step in biological downstream analysis. The clustering performance could be effectively improved by extracting cell-type-specific genes. The state-of-the-art feature selection methods usually calculate the importance of a single gene without considering the information contained in the gene expression distribution.
Motivation: A single gene may yield several isoforms with different functions through alternative splicing. Continuous efforts are devoted to developing machine-learning methods to predict isoform functions. However, existing methods do not consider the relevance of each feature to specific functions and ignore the noise caused by the irrelevant features.
The main purpose of this study was to reveal the nutritional value and antioxidant activity of 34 edible flowers that grew in Yunnan Province, China, through a comprehensive assessment of their nutritional composition and antioxidant indices. The results showed that sample A3 of Asteraceae flowers had the highest total flavonoid content, with a value of 8.53%, and the maximum contents of vitamin C and reducing sugars were from Rosaceae sample R1 and Gentianaceae sample G3, with values of 143.
Motivation: Single-cell RNA sequencing (scRNA-seq) offers a powerful tool to dissect the complexity of biological tissues through cell sub-population identification in combination with clustering approaches. Feature selection is a critical step for improving the accuracy and interpretability of single-cell clustering. Existing feature selection methods underutilize the discriminatory potential of genes across distinct cell types.
Alternative splicing (AS) is a key transcriptional regulation pathway. Recent studies have shown that AS events are associated with the occurrence of complex diseases. Various computational approaches have been developed for the detection of disease-associated AS events.
IEEE/ACM Trans Comput Biol Bioinform
December 2023
Identifying disease-associated genes is important for studying the pathogenic mechanisms of complex diseases. Recent models for disease gene prediction are predominantly based on molecular expression data and networks, including gene expression, protein expression, co-expression networks, protein-protein interaction networks, etc. One limitation of these methods is that they do not consider the knowledge of annotated gene sets representing known pathways or functionally related sets of genes.
Bioinformatics
October 2022
Motivation: Gene-centric bioinformatics studies frequently involve the calculation or the extraction of various features of genes such as splice sites, promoters, independent introns and untranslated regions (UTRs) through manipulation of gene models. Gene models are often annotated in gene transfer format (GTF) files. The features are essential for subsequent analysis such as intron retention detection, DNA-binding site identification and computing splicing strength of splice sites.
IEEE/ACM Trans Comput Biol Bioinform
April 2023
Topologically associating domains (TADs) are local chromatin interaction domains, which have been shown to play an important role in gene expression regulation. TADs were originally discovered in the investigation of 3D genome organization based on High-throughput Chromosome Conformation Capture (Hi-C) data. Considerable and continuing effort has been dedicated to developing methods for detecting TADs from Hi-C data.
IEEE/ACM Trans Comput Biol Bioinform
April 2023
Single-cell RNA sequencing (scRNA-seq) provides a powerful approach for profiling transcriptomes at single-cell resolution. An essential application of scRNA-seq is the discovery of cell types with the aid of clustering analysis. Existing single-cell clustering methods are exclusively based on gene-level expression data, without considering alternative splicing information.
Motivation: Alzheimer's disease (AD) is a complex brain disorder whose risk genes are incompletely identified. The candidate genes are predominantly obtained by computational approaches. To gain biological insights into candidate genes or to screen genes for experimental testing, it is essential to assess their relevance to AD.
Alzheimer's disease (AD) has a strong genetic predisposition. However, its risk genes remain incompletely identified. We developed an Alzheimer's brain gene network-based approach to predict AD-associated genes by leveraging the functional pattern of known AD-associated genes.
Mapping gene interactions within tissues/cell types plays a crucial role in understanding the genetic basis of human physiology and disease. Tissue functional gene networks (FGNs) are essential models for mapping complex gene interactions. We present TissueNexus, a database of 49 human tissue/cell line FGNs constructed by integrating heterogeneous genomic data.
In recent years, single-cell RNA-seq (scRNA-seq) has become increasingly popular in fields such as biology and medical research. Analysis of scRNA-seq data can reveal complex cell populations and infer single-cell trajectories in cell development. Clustering is one of the most important methods for analyzing scRNA-seq data.
In single-cell RNA-seq (scRNA-seq) data analysis, a fundamental problem is to determine the number of cell clusters based on the gene expression profiles. However, the performance of current methods is still far from satisfactory, presumably due to their limitations in capturing the expression variability among cell clusters. Batch effects represent the undesired variability between data measured in different batches.
Introduction: Passive immunotherapies targeting Aβ continue to be evaluated as Alzheimer's disease (AD) therapeutics, but there remains debate over the mechanisms by which these immunotherapies work. Besides the amount of preexisting Aβ deposition and the type of deposit (compact or diffuse), there is little data concerning what factors, independent of those intrinsic to the antibody, might influence efficacy. Here we (i) explored how constitutive priming of the underlying innate activation states by Il10 and Il6 might influence passive Aβ immunotherapy and (ii) evaluated transcriptomic data generated in the AMP-AD initiative to inform how these two cytokines and their receptors' mRNA levels are altered in human AD and an APP mouse model.
Intron retention (IR) has been implicated in the pathogenesis of complex diseases such as cancers; its association with Alzheimer's disease (AD) remains unexplored. We performed genome-wide analysis of IR through integrating genetic, transcriptomic, and proteomic data of AD subjects and mouse models from the Accelerating Medicines Partnership-Alzheimer's Disease project. We identified 4535 and 4086 IR events in 2173 human and 1736 mouse genes, respectively.
Clustering is an efficient way to analyze single-cell RNA sequencing data. It is commonly used to identify cell types, which can help in understanding cell differentiation processes. However, different single-cell clustering methods can produce different and sometimes conflicting results, and biologists often struggle to obtain correct clustering results and interpret their biological significance.
Alternative splicing (AS) leads to transcriptome diversity in eukaryotic cells and is one of the key regulators driving cellular differentiation. Although AS is of crucial importance for normal hematopoiesis and hematopoietic malignancies, its role in early hematopoietic development is still largely unknown. Here, by using high-throughput transcriptomic analyses, we show that pervasive and dynamic AS takes place during hematopoietic development of human pluripotent stem cells (hPSCs).