Cancer tissue-of-origin specific biomarkers are needed for effective diagnosis, monitoring, and treatment of cancers. In this study, we analyzed transcriptomics data from 37 cancer types provided by The Cancer Genome Atlas (TCGA) to identify cancer tissue-of-origin specific gene expression signatures. We developed a deep neural network model to classify cancers based on gene expression data.
Background: While blood transfusion is an essential cornerstone of hematological care, patients requiring repeated transfusions remain at persistent risk of alloimmunization due to the diversity of human blood group polymorphisms. Despite the promise of next-generation sequencing for blood typing, user-friendly methods to accurately identify blood types from such data are currently lacking. To address this unmet need, we have developed RBCeq, a novel genetic blood typing algorithm to accurately identify 36 blood group systems.
Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-seq) is a powerful genomic technology used for the global mapping and analysis of open chromatin regions. However, to process and analyze such data, users must either combine a number of complicated bioinformatic tools or rely on the currently available ATAC-seq analysis software, which is not very user-friendly and lacks visualization of the ATAC-seq results. As a result, biologists with minimal bioinformatics background who wish to process and analyze their own ATAC-seq data will find these tasks difficult and will ultimately need to seek help from bioinformatics experts.
The identification and functional characterization of novel biomarkers in cancer require survival analysis and gene expression analysis of both patient samples and cell line models. To facilitate this process, we have developed KM-Express. KM-Express holds an extensive, manually curated transcriptomic profile of 45 different datasets for prostate and breast cancer, with phenotype and pathoclinical information, spanning from clinical samples to cell lines.
Andrographis paniculata is an important medicinal plant containing various bioactive terpenoids and flavonoids. Despite its importance in herbal medicine, no ready-to-use transcript sequence information for this plant is available in public databases. This study deals mainly with the sequencing of RNA from A. paniculata leaf using the Illumina HiSeq™ 2000 platform, followed by de novo transcriptome assembly.
Background: Developing drought-tolerant rice varieties with higher yield under water-stressed conditions provides a viable solution to the serious yield-reduction impact of drought. Understanding the molecular regulation of this polygenic trait is crucial for the eventual success of rice molecular breeding programmes. MicroRNAs have received tremendous attention recently due to their importance in negative regulation.