Brief Bioinform
November 2022
Cell-type composition of intact bulk tissues can vary across samples. Deciphering cell-type composition and its changes during disease progression is an important step toward understanding disease pathogenesis. To infer cell-type composition, existing cell-type deconvolution methods for bulk RNA sequencing (RNA-seq) data often require matched single-cell RNA-seq (scRNA-seq) data, generated from samples with similar clinical conditions, as reference.
Vitamin A (VA) deficiency and diarrheal diseases are both serious public health issues worldwide. VA deficiency is associated with impaired intestinal barrier function and increased risk of mucosal infection-related mortality. The bioactive form of VA, retinoic acid, is a well-known regulator of mucosal integrity.
Age-related macular degeneration (AMD) is a blinding eye disease with no unifying theme for its etiology. We used single-cell RNA sequencing to analyze the transcriptomes of ~93,000 cells from the macula and peripheral retina from two adult human donors and bulk RNA sequencing from fifteen adult human donors with and without AMD. Analysis of our single-cell data identified 267 cell-type-specific genes.
Vitamin A (VA) deficiency remains prevalent in resource-limited areas. Using Citrobacter rodentium infection in mice as a model for diarrheal diseases, previous reports showed reduced pathogen clearance and survival due to vitamin A-deficient (VAD) status. To characterize the impact of preexisting VA deficiency on gene expression patterns in the intestines, and to discover novel target genes in VA-related biological pathways, VA deficiency in mice was induced by diet.
Clustering and cell type classification are important steps in single-cell RNA-seq (scRNA-seq) analysis. As more scRNA-seq data become available, supervised cell type classification methods that use external, well-annotated source data are gaining popularity over unsupervised clustering algorithms. However, the performance of existing supervised methods is highly dependent on source data quality, and they often have limited accuracy in classifying cell types that are missing from the source data.
Single-cell RNA sequencing (scRNA-seq) can characterize cell types and states through unsupervised clustering, but the ever-increasing number of cells and batch effects impose computational challenges. We present DESC, an unsupervised deep embedding algorithm that clusters scRNA-seq data by iteratively optimizing a clustering objective function. Through iterative self-learning, DESC gradually removes batch effects, as long as technical differences across batches are smaller than true biological variations.
Co-expression network analysis provides useful information for studying gene regulation in biological processes. Examining condition-specific patterns of co-expression can provide insights into the underlying cellular processes activated in a particular condition. One challenge in this type of analysis is that the sample sizes in each condition are usually small, making the statistical inference of co-expression patterns highly underpowered.
The identification of reproducible signals from the results of replicate high-throughput experiments is an important part of modern biological research. Often little is known about the dependence structure and the marginal distribution of the data, motivating the development of a nonparametric approach to assess reproducibility. The procedure, which we call the maximum rank reproducibility (MaRR) procedure, uses a maximum rank statistic to parse reproducible signals from noise without making assumptions about the distribution of reproducible signals.
BMC Bioinformatics
January 2016
Background: Determining differentially expressed genes (DEGs) between biological samples is key to understanding how genotype gives rise to phenotype. RNA-seq and microarray are the two main technologies for profiling gene expression levels. However, considerable discrepancy has been found between DEGs detected using the two technologies.
Plastic responses of correlated traits to environmental alteration, an important mechanism for adaptation to heterogeneous environments, may also be genetically correlated, but little is known about the underlying genetic basis. We describe a statistical model for mapping specific quantitative trait loci (QTLs) that control the interrelationship of phenotypic plasticity between different traits. The model is constructed in a bivariate mixture setting and implemented with the EM algorithm to estimate the genetic effects of QTLs on correlative plastic response.
As a group of important plant species in agriculture and biology, polyploids have been increasingly studied in terms of their genome structure and organization. There are two types of polyploids, allopolyploids and autopolyploids, each resulting from a different genetic origin and each undergoing meiotic divisions of a distinct complexity. A set of statistical models has been developed for linkage analysis of each type, taking into account their unique meiotic behavior, i.
Linkage mapping of polysomic autotetraploids, a group of economically important species that includes potato, sugarcane and rose, is difficult to conduct because of their unique meiotic property of double reduction, which allows sister chromatids to enter the same gamete. We describe and assess a statistical model for mapping quantitative trait loci (QTLs) in polysomic autotetraploids. The model incorporates double reduction, is built within a mixture-model framework, and is implemented with the expectation-maximization algorithm.