Introduction: Japan is rapidly becoming a full-fledged aged society, and the physician shortage is a significant concern. The Japanese government has increased the number of medical school enrollments since 2008, but some researchers warn that this increase could lead to a physician surplus in the future. It is unknown how many physicians will be required to accommodate future healthcare needs.
Purpose: To identify stage I lung adenocarcinoma patients with a poor prognosis who will benefit from adjuvant therapy.
Patients And Methods: Whole gene expression profiles were obtained at 19 time points over a 48-hour time course from human primary lung epithelial cells that were stimulated with epidermal growth factor (EGF) in the presence or absence of a clinically used EGF receptor tyrosine kinase (RTK)-specific inhibitor, gefitinib. The data were subjected to a mathematical simulation using the State Space Model (SSM).
Background: The oncogenic single nucleotide polymorphism rs6983267, located on 8q24.21, may affect copy number aberrations and/or expression profiles in colorectal cancer (CRC). We investigated the role of this single nucleotide polymorphism in the clinical outcome of CRC.
A number of neurological diseases are caused by mutations of RNA metabolism-related genes. A complicating issue is whether under- or overfunction of such genes is responsible for the phenotype. Polyglutamine tract binding protein-1, a causative gene for X-linked mental retardation, is also involved in RNA metabolism, and both mutation and duplication of the gene were reported in human patients.
J Bioinform Comput Biol
October 2012
Feature selection algorithms play a crucial role in identifying and discovering important genes for cancer classification. Feature selection algorithms can be broadly categorized into two main groups: filter-based methods and wrapper-based methods. Filter-based methods have been quite popular in the literature due to their many advantages, including computational efficiency, simplistic architecture, and an intuitively simple means of discovering biological and clinical aspects.
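To make the filter-based idea concrete, here is a minimal sketch that scores each gene independently of any classifier and keeps the top-ranked ones. It is not the method of the article above: the choice of Welch's t-statistic as the score, the function names `welch_t` and `filter_rank`, and the toy expression values are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic between two groups of expression values."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def filter_rank(expr, labels, top_k):
    """Rank genes by |t| between the two classes; return the top_k gene names.

    expr:   dict mapping gene name -> list of expression values (one per sample)
    labels: list of 0/1 class labels aligned with the samples
    """
    scores = {}
    for gene, values in expr.items():
        g0 = [v for v, y in zip(values, labels) if y == 0]
        g1 = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = abs(welch_t(g0, g1))
    # Filter step: each gene is scored on its own, with no classifier involved.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Hypothetical toy data: three genes, six samples (three per class).
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "geneA": [1.0, 1.2, 0.9, 3.1, 2.9, 3.3],
    "geneB": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "geneC": [0.5, 0.7, 0.6, 1.0, 1.4, 0.9],
}
selected = filter_rank(expr, labels, top_k=2)
print(selected)
```

Because every gene is scored separately, the cost grows linearly with the number of genes, which is the computational advantage the abstract attributes to filter methods. The sketch assumes each class has at least two samples and nonzero variance for every gene.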
Background: Epidemiological studies have suggested that the encounter with commensal microorganisms during the neonatal period is essential for normal development of the host immune system. Basic research involving gnotobiotic mice has demonstrated that colonization at the age of 5 weeks is too late to reconstitute normal immune function. In this study, we examined the transcriptome profiles of the large intestine (LI), small intestine (SI), liver (LIV), and spleen (SPL) of 3 bacterial colonization models, namely specific pathogen-free mice (SPF), ex-germ-free mice with bacterial reconstitution at the time of delivery (0WexGF), and ex-germ-free mice with bacterial reconstitution at 5 weeks of age (5WexGF), and compared them with those of germ-free (GF) mice.
Motivation: In cancer genomes, chromosomal regions harboring cancer genes are often subjected to genomic aberrations like copy number alteration and loss of heterozygosity. Given this, finding recurrent genomic aberrations is considered an apt approach for screening cancer genes. Although several permutation-based tests have been proposed for this purpose, none of them are designed to find recurrent aberrations from the genomic dataset without paired normal sample controls.
Background: Our understanding of the molecular pathways that underlie melanoma remains incomplete. Although several published microarray studies of clinical melanomas have provided valuable information, we found only limited concordance between these studies. Therefore, we took an in vitro functional genomics approach to understand melanoma molecular pathways.
Background: Colorectal cancer (CRC) oncogenesis was considered to be determined by interactions between genetic and environmental factors. Specific interacting factors that influence CRC morbidity have yet to be fully investigated.
Methods: A multi-institutional collaborative study with 1511 CRC patients and 2098 control subjects was used to compare the odds ratios for the occurrence of polymorphisms at 11 known single nucleotide polymorphisms (SNPs).
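For readers unfamiliar with the statistic, the following sketch shows how an odds ratio and an approximate 95% confidence interval can be computed from a 2x2 case-control table for a single SNP. The function name `odds_ratio`, the Wald (Woolf) interval, and the carrier counts are illustrative assumptions; they are not taken from the study.

```python
from math import exp, log, sqrt

def odds_ratio(case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers):
    """Odds ratio with a 95% Wald (Woolf) confidence interval for a 2x2 table."""
    or_ = (case_carriers * ctrl_noncarriers) / (case_noncarriers * ctrl_carriers)
    # Standard error of log(OR) under the Woolf approximation.
    se = sqrt(1 / case_carriers + 1 / case_noncarriers
              + 1 / ctrl_carriers + 1 / ctrl_noncarriers)
    lower = exp(log(or_) - 1.96 * se)
    upper = exp(log(or_) + 1.96 * se)
    return or_, lower, upper

# Hypothetical counts: 100 cases and 100 controls, not data from the study.
or_, lower, upper = odds_ratio(60, 40, 45, 55)
print(f"OR = {or_:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that excludes 1 would suggest an association between the risk allele and disease status; the sketch assumes all four cell counts are nonzero.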
IEEE/ACM Trans Comput Biol Bioinform
October 2012
We propose a statistical method for uncovering gene pathways that characterize cancer heterogeneity. To incorporate knowledge of the pathways into the model, we define a set of activities of pathways from microarray gene expression data based on the Sparse Probabilistic Principal Component Analysis (SPPCA). A pathway activity logistic regression model is then formulated for cancer phenotype.
Background: In the analysis of the effects of cell treatments such as drug dosing, identifying changes in gene network structures between normal and treated cells is a key task. A possible way of identifying the changes is to compare the structures of networks estimated separately from data on normal and treated cells. However, this approach usually fails to estimate accurate gene networks due to the limited length of time series data and measurement noise.
Our research group is currently developing software for estimating large-scale gene networks from gene expression data. The software, called SiGN, is specifically designed for the Japanese flagship supercomputer "K computer", which is planned to achieve 10 petaflops in 2012, and for other high performance computing environments including the Human Genome Center (HGC) supercomputer system. SiGN is a collection of gene network estimation software with three different sub-programs: SiGN-BN, SiGN-SSM and SiGN-L1.
Gene regulatory networks inferred from RNA abundance data have generated significant interest, but despite this, gene network approaches are used infrequently and often require input from bioinformaticians. We have assembled a suite of tools for analysing regulatory networks, and we illustrate their use with microarray datasets generated in human endothelial cells. We infer a range of regulatory networks, and based on this analysis discuss the strengths and limitations of network inference from RNA abundance data.
IEEE/ACM Trans Comput Biol Bioinform
September 2012
Most of the conventional feature selection algorithms have a drawback whereby a weakly ranked gene that could perform well in terms of classification accuracy with an appropriate subset of genes will be left out of the selection. Considering this shortcoming, we propose a feature selection algorithm in gene expression data analysis of sample classifications. The proposed algorithm first divides genes into subsets, the sizes of which are relatively small (roughly of size h), then selects informative smaller subsets of genes (of size r < h) from a subset and merges the chosen genes with another gene subset (of size r) to update the gene subset.
Non-small cell lung cancer (NSCLC) is a major subtype of lung cancer and is the most common and fatal cancer worldwide. Specific tyrosine kinase inhibitors for epidermal growth factor receptor (EGFR), such as gefitinib, have been effective in some NSCLC patients and are being used in the clinical setting as pioneer molecularly targeted cancer drugs. However, many patients have not responded to these drugs, and have acquired resistance after long-term treatment.
The functional impact of recently discovered long noncoding RNAs (ncRNAs) in human cancer remains to be clarified. One long ncRNA which has attracted attention is the Hox transcript antisense intergenic RNA termed HOTAIR, a long ncRNA expressed from the developmental HOXC locus located on chromosome 12q13.13.
Many statistical methods have been proposed to estimate causal models in classical situations with fewer variables than observations. However, modern datasets, including gene expression data, increase the need for high-dimensional causal modeling in challenging situations with orders of magnitude more variables than observations. In this paper, we propose a method to find exogenous variables in a linear non-Gaussian causal model, which requires much smaller sample sizes than conventional methods and works even under orders of magnitude more variables than observations.
Patient-specific analysis of molecular networks is a promising strategy for making individual risk predictions and treatment decisions in cancer therapy. Although systems biology allows the gene network of a cell to be reconstructed from clinical gene expression data, traditional methods, such as Bayesian networks, only provide an averaged network for all samples. Therefore, these methods cannot reveal patient-specific differences in molecular networks during cancer progression.
SiGN-SSM is an open-source gene network estimation software able to run in parallel on PCs and massively parallel supercomputers. The software estimates a state space model (SSM), which is a statistical dynamic model suitable for analyzing short time and/or replicated time series gene expression profiles. SiGN-SSM implements a novel parameter constraint that is effective for stabilizing the estimated models.
Motivation: A number of unsupervised gene set screening methods have recently been developed to search for putative functional gene sets based on their expression profiles. Most of the methods statistically evaluate whether the expression profiles of each gene set fit assumed models: e.g.
IEEE/ACM Trans Comput Biol Bioinform
August 2011
We present a novel algorithm to estimate genome-wide gene networks consisting of more than 20,000 genes from gene expression data using nonparametric Bayesian networks. Due to the difficulty of learning Bayesian network structures, existing algorithms cannot be applied to more than a few thousand genes. Our algorithm overcomes this limitation by repeatedly estimating subnetworks in parallel for genes selected by neighbor node sampling.
One of the open problems in systems biology is to infer dynamic gene networks describing the underlying biological process with mathematical, statistical and computational methods. The first-order difference equation-based models such as dynamic Bayesian networks and vector autoregressive models were used to infer time-lagged relationships between genes from time-series microarray data. However, two primary problems greatly reduce the effectiveness of current approaches.
We propose a statistical model that simultaneously estimates a gene regulatory network and identifies gene modules from time series gene expression data obtained in microarray experiments. Under the assumption that genes in the same module are densely connected, the proposed method detects gene modules based on the variational Bayesian technique. The model can also incorporate existing biological prior knowledge such as protein subcellular localization.
Several technologies are currently used for gene expression profiling, such as Real Time RT-PCR, microarray and CAGE (Cap Analysis of Gene Expression). CAGE is a recently developed method for constructing transcriptome maps and it has been successfully applied to analyzing gene expression in diverse biological studies. The principle of CAGE has been developed to address specific issues such as determination of transcriptional starting sites, the study of promoter regions and identification of new transcripts.
Decoding transcriptional programs governing transcriptomic diversity across multiple human tissues is a major challenge in bioinformatics. To address this problem, a number of computational methods have focused on cis-regulatory codes driving overexpression or underexpression in a single tissue as compared to others. On the other hand, we recently proposed a different approach to mine cis-regulatory codes: starting from gene sets sharing common cis-regulatory motifs, the method screens for expression modules based on expression coherence.