Publications by authors named "Chikatoshi Kai"

Background: Although transcription in mammalian genomes can initiate from various genomic positions (e.g., 3'UTR, coding exons, etc.

Combinatorial interactions among transcription factors are critical to directing tissue-specific gene expression. To build a global atlas of these combinations, we have screened for physical interactions among the majority of human and mouse DNA-binding transcription factors (TFs). The complete networks contain 762 human and 877 mouse interactions.

Background: Wheat is an allopolyploid plant that harbors a huge, complex genome. Therefore, accumulation of expressed sequence tags (ESTs) for wheat is becoming particularly important for functional genomics and molecular breeding. We prepared a comprehensive collection of ESTs from the various tissues that develop during the wheat life cycle and from tissues subjected to stress.

Using deep sequencing (deepCAGE), the FANTOM4 study measured the genome-wide dynamics of transcription-start-site usage in the human monocytic cell line THP-1 throughout a time course of growth arrest and differentiation. Modeling the expression dynamics in terms of predicted cis-regulatory sites, we identified the key transcription regulators, their time-dependent activities and target genes. Systematic siRNA knockdown of 52 transcription factors confirmed the roles of individual factors in the regulatory network.

Background: Small RNA attracts increasing interest based on the discovery of RNA silencing and the rapid progress of our understanding of these phenomena. Although recent studies suggest the possible existence of yet undiscovered types of small RNAs in higher organisms, many studies to profile small RNA have focused on miRNA and/or siRNA rather than on the exploration of additional classes of RNAs.

Results: Here, we explored human small RNAs by unbiased sequencing of RNAs with sizes of 19-40 nt.

Background: The nucleus is a complex cellular organelle and accurately defining its protein content is essential before any systematic characterization can be considered.

Results: We report direct evidence for 2,568 mammalian proteins within the nuclear proteome: the nuclear subcellular localization of 1,529 proteins based on a high-throughput subcellular localization protocol of full-length proteins and an additional 1,039 proteins for which clear experimental evidence is documented in published literature. This is direct evidence that the nuclear proteome consists of at least 14% of the entire proteome.

Many genes are arranged in complex overlapping and interlaced patterns in eukaryotic genomes. It is unclear whether or how such genes can avoid interference from each other's RNA processing signals and retain distinct identities. This puzzle applies particularly to 3' end formation sites, which inherently terminate the transcript, and thus act as boundaries between adjacent genes.

Article Synopsis
  • The study reports on experiments analyzing a targeted 1% of the human genome during the ENCODE Project's pilot phase, providing crucial insights into human genome function.
  • Findings reveal that the human genome is largely transcribed, with evidence showing that most genomic bases contribute to various types of transcripts, including those that do not code for proteins.
  • Enhanced understanding of transcription regulation, chromatin structure, and evolutionary insights from comparisons between species help define the functional landscape of the human genome, guiding future research in genome characterization.

The survival of motor neuron (SMN) protein, responsible for the neurodegenerative disease spinal muscular atrophy (SMA), oligomerizes and forms a stable complex with seven other major components, the Gemin proteins. Besides the SMN protein, Gemin2 is a core protein that is essential for the formation of the SMN complex, although the mechanism by which it drives formation is unclear. We have found a novel interaction, a Gemin2 self-association, using the mammalian two-hybrid system and in vitro pull-down assays.

Background: Mammalian antimicrobial peptides (AMPs) are effectors of the innate immune response. A multitude of signals coming from pathways of mammalian pathogen/pattern recognition receptors and other proteins affect the expression of AMP-coding genes (AMPcgs). For many AMPcgs the promoter elements and transcription factors that control their tissue- and cell-specific expression have yet to be fully identified and characterized.

Background: Mammalian promoters do not initiate transcription at single, well-defined base pairs, but rather at multiple, alternative start sites spread across a region. We previously characterized the static structures of transcription start site usage within promoters at the base pair level, based on large-scale sequencing of transcript 5' ends.

Results: In the present study we begin to explore the internal dynamics of mammalian promoters, and demonstrate that start site selection within many mouse core promoters varies among tissues.

Several recent studies indicate that mammals and other organisms produce large numbers of RNA transcripts that do not correspond to known genes. It has been suggested that these transcripts do not encode proteins, but may instead function as RNAs. However, discrimination of coding and non-coding transcripts is not straightforward, and different laboratories have used different methods, whose ability to perform this discrimination is unclear.

Hair cells express a complement of ion channels, representing shared and distinct channels that confer distinct electrophysiological signatures for each cell. This diversity is generated by the use of alternative splicing in the alpha subunit, formation of heterotetrameric channels, and combinatorial association with beta subunits. These channels are thought to play a role in the tonotopic gradient observed in the mammalian cochlea.

Background: The TATA box, one of the best-studied core promoter elements, is associated with induced, context-specific expression. The lack of precise transcription start site (TSS) locations linked with expression information has impeded genome-wide characterization of the interaction between TATA and the pre-initiation complex.

Results: Using a comprehensive set of 5.

Alignments of homologous genomic sequences are widely used to identify functional genetic elements and study their evolution. Most studies tacitly equate homology of functional elements with sequence homology. This assumption is violated by the phenomenon of turnover, in which functionally equivalent elements reside at locations that are nonorthologous at the sequence level.

The international FANTOM consortium aims to produce a comprehensive picture of the mammalian transcriptome, based upon an extensive cDNA collection and functional annotation of full-length enriched cDNAs. The previous dataset, FANTOM2, comprised 60,770 full-length enriched cDNAs. Functional annotation revealed that this cDNA dataset contained only about half of the estimated number of mouse protein-coding genes, indicating that a number of cDNAs still remained to be collected and identified.

Using the two largest collections of Mus musculus and Homo sapiens transcription start sites (TSSs) determined based on CAGE tags, ditags, full-length cDNAs, and other transcript data, we describe the compositional landscape surrounding TSSs with the aim of gaining better insight into the properties of mammalian promoters. We classified TSSs into four types based on compositional properties of regions immediately surrounding them. These properties highlighted distinctive features in the extended core promoters that helped us delineate boundaries of the transcription initiation domain space for both species.

Short proteins play key roles in cell signalling and other processes, but their abundance in the mammalian proteome is unknown. Current catalogues of mammalian proteins exhibit an artefactual discontinuity at a length of 100 aa, so that protein abundance peaks just above this length and falls off sharply below it. To clarify the abundance of short proteins, we identify proteins in the FANTOM collection of mouse cDNAs by analysing synonymous and non-synonymous substitutions with the computer program CRITICA.

Mammalian genomes harbor a larger than expected number of complex loci, in which multiple genes are coupled by shared transcribed regions in antisense orientation and/or by bidirectional core promoters. To determine the incidence, functional significance, and evolutionary context of mammalian complex loci, we identified and characterized 5,248 cis-antisense pairs, 1,638 bidirectional promoters, and 1,153 chains of multiple cis-antisense and/or bidirectionally promoted pairs from 36,606 mouse transcriptional units (TUs), along with 6,141 cis-antisense pairs, 2,113 bidirectional promoters, and 1,480 chains from 42,887 human TUs. In both human and mouse, 25% of TUs resided in cis-antisense pairs, only 17% of which were conserved between the two organisms, indicating frequent species specificity of antisense gene arrangements.

Membrane organization describes the orientation of a protein with respect to the membrane and can be determined by the presence, or absence, and organization within the protein sequence of two features: endoplasmic reticulum signal peptides and alpha-helical transmembrane domains. These features allow protein sequences to be classified into one of five membrane organization categories: soluble intracellular proteins, soluble secreted proteins, type I membrane proteins, type II membrane proteins, and multi-spanning membrane proteins. Generation of protein isoforms with variable membrane organizations can change a protein's subcellular localization or association with the membrane.

Among the most common splice variations are small exon length variations caused by the use of alternative donor or acceptor splice sites that are in very close proximity on the pre-mRNA. Of these, three-nucleotide variations at so-called NAGNAG tandem acceptor sites have recently attracted considerable attention, and it has been suggested that these variations are regulated and serve to fine-tune protein forms by the addition or removal of a single amino acid. In this paper we first show that in-frame exon length variations are generally overrepresented and that this overrepresentation can be quantitatively explained by the effect of nonsense-mediated decay.

With the advancement of genome research, it is becoming clear that genes are not distributed on the genome in random order. Clusters of genes distributed at localized genome positions have been reported in several eukaryotes. Various correlations have been observed in the expression of genes in adjacent or nearby positions along the chromosomes, depending on tissue type and developmental stage.

Non-protein-coding RNAs (ncRNAs) are increasingly being recognized as having important regulatory roles. Although much recent attention has focused on tiny 22- to 25-nucleotide microRNAs, several functional ncRNAs are orders of magnitude larger in size. Examples of such macro ncRNAs include Xist and Air, which in mouse are 18 and 108 kilobases (kb), respectively.

We have surveyed the evolutionary trends of mammalian promoters and upstream sequences, utilising large sets of experimentally supported transcription start sites (TSSs). With 30,969 well-defined TSSs from mouse and 26,341 from human, there are sufficient numbers to draw statistically meaningful conclusions and to consider differences between promoter types. Unlike previous smaller studies, we have considered the effects of insertions, deletions, and transposable elements as well as nucleotide substitutions.

The mammalian transcriptome harbours shadowy entities that resist classification and analysis. In analogy with pseudogenes, we define pseudo-messenger RNA to be RNA molecules that resemble protein-coding mRNA, but cannot encode full-length proteins owing to disruptions of the reading frame. Using a rigorous computational pipeline, which rules out sequencing errors, we identify 10,679 pseudo-messenger RNAs (approximately half of which are transposon-associated) among the 102,801 FANTOM3 mouse cDNAs: just over 10% of the FANTOM3 transcriptome.
