Publications by authors named "Richard Bruskiewich"

Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research.

Motivation: Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of KGs is lacking.

Results: Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of KGs. Features include a simple, modular extract-transform-load pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects.
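As a rough illustration of the extract-transform-load pattern described above, the following is a minimal sketch only, not KG-Hub's actual code. It assumes a hypothetical source table of gene-disease rows and writes tab-separated node and edge tables labelled with Biolink Model categories. The column layout, the `EX:` identifiers, the function names, and the choice of `biolink:related_to` as predicate are all illustrative assumptions.

```python
import csv
import io

# Assumed (illustrative) column layouts for the node and edge tables.
NODE_COLUMNS = ["id", "category", "name"]
EDGE_COLUMNS = ["subject", "predicate", "object"]


def transform(rows):
    """Turn hypothetical source rows into Biolink-labelled nodes and edges."""
    nodes = {}  # keyed by id so that repeated entities are emitted once
    edges = []
    for row in rows:
        nodes[row["gene_id"]] = {
            "id": row["gene_id"],
            "category": "biolink:Gene",
            "name": row["gene_name"],
        }
        nodes[row["disease_id"]] = {
            "id": row["disease_id"],
            "category": "biolink:Disease",
            "name": row["disease_name"],
        }
        edges.append({
            "subject": row["gene_id"],
            "predicate": "biolink:related_to",  # generic predicate, chosen for illustration
            "object": row["disease_id"],
        })
    return list(nodes.values()), edges


def to_tsv(records, columns):
    """Serialize a list of dicts as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Calling `transform` on the extracted rows and passing each result to `to_tsv` yields one node file and one edge file per source, which is what keeps each transform modular and its output reusable as a subgraph.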

Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these "knowledge graphs" (KGs) has remained difficult.

The API and associated software are open source and currently available at https://github.com/NCATS-Tangerine/translator-knowledge-beacon.

High-throughput experimental methods such as medical sequencing and genome-wide association studies (GWAS) identify increasingly large numbers of potential relations between genetic variants and diseases. Both biological complexity (millions of potential gene-disease associations) and the accelerating rate of data production necessitate computational approaches to prioritize and rationalize potential gene-disease relations. Here, we use concept profile technology to expose from the biomedical literature both explicitly stated gene-disease relations (the explicitome) and a much larger set of implied gene-disease associations (the implicitome).

Next-generation sequencing plays a central role in the characterization and quantification of transcriptomes. Although numerous metrics are purported to quantify the quality of RNA, there have been no large-scale empirical evaluations of the major determinants of sequencing success. We used a combination of existing and newly developed methods to isolate total RNA from 1115 samples from 695 plant species in 324 families, which represents >900 million years of phylogenetic diversity from green algae through flowering plants, including many plants of economic importance.

The molecular mechanisms governing PEPC expression in maize remain to be fully defined. Differential methylation of a region in the PEPC promoter has been shown to correlate with transcript accumulation; however, to date, investigations into the role of DNA methylation in maize PEPC expression have relied on methylation-sensitive restriction enzymes. Bisulphite sequencing was used here to provide a single-base resolution methylation map of the maize PEPC promoter.

Background: Several tools are available to identify miRNAs from deep-sequencing data; however, only a few of them, such as miRDeep, can identify novel miRNAs and are also available as standalone applications. Given the differences between plant and animal miRNAs, particularly in the distribution of hairpin length and the nature of complementarity with the duplex partner (or miRNA star), the underlying statistical features of miRDeep and of other tools that use similar features are likely to be affected.

Results: The potential effects on features, such as minimum free energy, stability of secondary structures, excision length, etc.

Panicle exsertion, an essential physiological process for obtaining high grain yield in rice, is mainly driven by peduncle (uppermost internode) elongation. Drought at heading/panicle emergence prevented peduncle elongation from reaching its maximum length even after re-watering. This inhibitory effect of drought resulted in delayed heading and in spikelets being trapped lower down the panicle inside the flag-leaf sheath, thus increasing sterility in the lower un-exserted spikelets and also among the upper superior spikelets whose exsertion was delayed.

Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands efficient systems that do not need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers in emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008.

Background: The transcriptional regulatory network involved in low temperature response leading to acclimation has been established in Arabidopsis. In japonica rice, which can only withstand transient exposure to milder cold stress (10 °C), an oxidative-mediated network has been proposed to play a key role in configuring early responses and short-term defenses. The components, hierarchical organization and physiological consequences of this network were further dissected by a systems-level approach.

Background And Aims: Agricultural crop databases maintained in gene banks of the Consultative Group on International Agricultural Research (CGIAR) are valuable sources of information for breeders. These databases provide comparative phenotypic and genotypic information that can help elucidate functional aspects of plant and agricultural biology. To facilitate data sharing within and between these databases and the retrieval of information, the crop ontology (CO) database was designed to provide controlled vocabulary sets for several economically important plant species.

Rice, the primary source of dietary calories for half of humanity, is the first crop plant for which a high-quality reference genome sequence from a single variety was produced. We used resequencing microarrays to interrogate 100 Mb of the unique fraction of the reference genome for 20 diverse varieties and landraces that capture the impressive genotypic and phenotypic diversity of domesticated rice. Here, we report the distribution of 160,000 nonredundant SNPs.

RiceGeneThresher is a public online resource for mining genes underlying genome regions of interest or quantitative trait loci (QTL) in the rice genome. It is a compendium of rice genomic resources consisting of genetic markers, genome annotation, expressed sequence tags (ESTs), protein domains, gene ontology, plant stress-responsive genes, metabolic pathways and predictions of protein-protein interactions. The RiceGeneThresher system integrates these diverse data sources and provides powerful web-based applications and flexible tools for delivering customized sets of biological data on rice.

This brief meeting review summarizes the recommendations of NSF- and NPGI-funded bioinformaticians concerning the future requirements for plant bioinformatics systems and databases.

The Generation Challenge Programme (GCP) is a global crop research consortium directed toward crop improvement through the application of comparative biology and genetic resources characterization to plant breeding. A key consortium research activity is the development of a GCP crop bioinformatics platform to support GCP research. This platform includes the following: (i) shared, public platform-independent domain models, ontology, and data formats to enable interoperability of data and analysis flows within the platform; (ii) web service and registry technologies to identify, share, and integrate information across diverse, globally dispersed data sources, as well as to access high-performance computational (HPC) facilities for computationally intensive, high-throughput analyses of project data; (iii) platform-specific middleware reference implementations of the domain model integrating a suite of public (largely open-access/-source) databases and software tools into a workbench to facilitate biodiversity analysis, comparative analysis of crop genomic data, and plant breeding decision making.

Background: Information on more than 35 000 full-length Oryza sativa cDNAs, together with associated microarray gene expression data collected under various treatment conditions, has made it feasible to identify motifs that are conserved in gene promoters and may act as cis-regulatory elements with key roles under the various conditions.

Results: We have developed a novel tool that searches for cis-element candidates in the upstream, downstream, or coding regions of differentially regulated genes. The tool first lists cis-element candidates by motif searching based on the supposition that if there are cis-elements playing important roles in the regulation of a given set of genes, they will be statistically overrepresented and will be conserved.
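The overrepresentation idea can be sketched as follows. This is a minimal illustration only, not the tool's actual method: the abstract does not say which statistical test is used, so a one-sided hypergeometric test is assumed here, and conservation is not modelled. The function names and the promoter dictionary are hypothetical.

```python
from math import comb


def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def motif_overrepresentation(motif, promoters, target_genes):
    """Test whether `motif` occurs in more target promoters than expected.

    promoters:    dict mapping gene id -> promoter sequence (all genes)
    target_genes: set of gene ids that are differentially regulated
    Returns (hits_in_targets, hits_in_all, p_value).
    """
    carriers = {gene for gene, seq in promoters.items() if motif in seq}
    hits_in_targets = len(carriers & target_genes)
    p_value = hypergeom_tail(
        hits_in_targets, len(promoters), len(carriers), len(target_genes)
    )
    return hits_in_targets, len(carriers), p_value
```

A small p-value indicates that the motif appears in the target set more often than random sampling from all promoters would predict. A real analysis would also correct for testing many candidate motifs.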

Passport and phenotypic data on germplasm and breeding lines are available from worldwide sources in various electronic formats. These data can be collated into a single database format to enable strategic interrogation to make the best use of data for effective germplasm use and enhancement. The International Crop Information System (http://www.

The BioMoby project was initiated in 2001 from within the model organism database community. It aimed to standardize methodologies to facilitate information exchange and access to analytical resources, using a consensus driven approach. Six years later, the BioMoby development community is pleased to announce the release of the 1.

The Rice Annotation Project Database (RAP-DB) was created to provide the genome sequence assembly of the International Rice Genome Sequencing Project (IRGSP), manually curated annotation of the sequence, and other genomics information that could be useful for a comprehensive understanding of rice biology. Since the last publication of the RAP-DB, the IRGSP genome has been revised and reassembled. In addition, a large number of rice expressed sequence tags have been released, and functional genomics resources have been produced worldwide.

The Generation Challenge Programme (GCP; www.generationcp.org) has developed an online resource documenting stress-responsive genes comparatively across plant species.

We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare.

The Generation Challenge Programme (GCP) is an international research consortium striving to apply molecular biological advances to crop improvement for developing countries. Central to its activities is the creation of a next generation global crop information platform and network to share genetic resources, genomics, and crop improvement information. This system is being designed based on a comprehensive scientific domain object model and associated shared ontology.

Ambiguous germplasm identification; difficulty in tracing pedigree information; and lack of integration between genetic resources, characterization, breeding, evaluation, and utilization data are constraints in developing knowledge-intensive crop improvement programs. To address these constraints, the International Crop Information System (www.icis.
