Publications by authors named "Andrew Chatr-Aryamontri"

The coronavirus disease 2019 (COVID-19) pandemic has compelled biomedical researchers to communicate data in real time to establish more effective medical treatments and public health policies. Nontraditional sources such as preprint publications, i.e.

The BioGRID (Biological General Repository for Interaction Datasets, thebiogrid.org) is an open-access database resource that houses manually curated protein and genetic interactions from multiple species including yeast, worm, fly, mouse, and human. The ~1.

The Precision Medicine Initiative is a multicenter effort that aims to formulate personalized treatments by leveraging individual patient data (clinical, genome sequence and functional genomic data) together with the information in large knowledge bases (KBs) that integrate genome annotation, disease association studies, electronic health records and other data types. The biomedical literature provides a rich foundation for populating these KBs, reporting genetic and molecular interactions that provide the scaffold for the cellular regulatory systems and detailing the influence of genetic variants on these interactions. The BioCreative VI Precision Medicine Track aimed to extract this particular type of information and was organized into two tasks: (i) a document triage task, focused on identifying scientific literature containing experimentally verified protein-protein interactions (PPIs) affected by genetic mutations, and (ii) a relation extraction task, focused on extracting the affected interactions (protein pairs).

The Biological General Repository for Interaction Datasets (BioGRID: https://thebiogrid.org) is an open access database dedicated to the curation and archival storage of protein, genetic and chemical interactions for all major model organism species and humans. As of September 2018 (build 3.

The ubiquitin-proteasome system controls the stability of myriad protein substrates via short sequence motifs called degrons. Studies by Koren et al. (2018) and Lin et al.

To interrogate genes essential for cell growth, proliferation and survival in human cells, we carried out a genome-wide clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9 screen in a B-cell lymphoma line using a custom extended-knockout (EKO) library of 278,754 single-guide RNAs (sgRNAs) that targeted 19,084 RefSeq genes, 20,852 alternatively spliced exons, and 3,872 hypothetical genes. A new statistical analysis tool called robust analytics and normalization for knockout screens (RANKS) identified 2,280 essential genes, 234 of which were unique. Individual essential genes were validated experimentally and linked to ribosome biogenesis and stress responses.

A great deal of information on the molecular genetics and biochemistry of model organisms has been reported in the scientific literature. However, this data is typically described in free text form and is not readily amenable to computational analyses. To this end, the BioGRID database systematically curates the biomedical literature for genetic and protein interaction data.

The Biological General Repository for Interaction Datasets (BioGRID: https://thebiogrid.org) is an open access database dedicated to the annotation and archival of protein, genetic and chemical interactions for all major model organism species and humans. As of September 2016 (build 3.

Article Synopsis
  • BioC is an XML format created for biomedical text processing to enhance interoperability, which gained attention during BioCreative IV and was further developed for collaborative tasks in BioCreative V.
  • The BioCreative V BioC track involved eight subtasks aimed at improving bio-curation, including named entity recognition and interaction identification, with participation from nine global teams contributing new methods and tools.
  • Feedback from BioGRID curators on the resulting biocurator assistant system was largely positive, emphasizing its user-friendly design and practicality for gene/protein curation through text mining.

Fully automated text mining (TM) systems promote efficient literature searching, retrieval, and review but are not sufficient to produce ready-to-consume curated documents. These systems are not meant to replace biocurators, but instead to assist them in one or more literature curation steps. To do so, the user interface is an important aspect that needs to be considered for tool adoption.

The Biological General Repository for Interaction Datasets (BioGRID) is a freely available public database that provides the biological and biomedical research communities with curated protein and genetic interaction data. Structured experimental evidence codes, an intuitive search interface, and visualization tools enable the discovery of individual gene, protein, or biological network function. BioGRID houses interaction data for the major model organism species--including yeast, nematode, fly, zebrafish, mouse, and human--with particular emphasis on the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe as pioneer eukaryotic models for network biology.

The BioGRID database is an extensive repository of curated genetic and protein interactions for the budding yeast Saccharomyces cerevisiae, the fission yeast Schizosaccharomyces pombe, and the yeast Candida albicans SC5314, as well as for several other model organisms and humans. This protocol describes how to use the BioGRID website to query genetic or protein interactions for any gene of interest, how to visualize the associated interactions using an embedded interactive network viewer, and how to download data files for either selected interactions or the entire BioGRID interaction data set.

The Biological General Repository for Interaction Datasets (BioGRID: http://thebiogrid.org) is an open access database that houses genetic and protein interactions curated from the primary biomedical literature for all major model organism species and humans. As of September 2014, the BioGRID contains 749,912 interactions as drawn from 43,149 publications that represent 30 model organisms.

The time-consuming nature of manual curation and the rapid growth of biomedical literature severely limit the number of articles that database curators can scrutinize and annotate. Hence, semi-automatic tools can provide valuable support for increasing annotation throughput. Although a handful of curation assistant tools are already available, to date, little has been done to formally evaluate their benefit to biocuration.

In many databases, biocuration consists primarily of literature curation, which usually involves retrieving relevant articles, extracting information that will translate into annotations, and identifying new incoming literature. As the volume of biological literature increases, the use of text mining to assist in biocuration becomes increasingly relevant. A number of groups have developed tools for text mining from a computer science/linguistics perspective, and there are many initiatives to curate some aspect of biology from the literature.

The Biological General Repository for Interaction Datasets (BioGRID: http://thebiogrid.org) is an open access archive of genetic and protein interactions that are curated from the primary biomedical literature for all major model organism species. As of September 2012, BioGRID houses more than 500 000 manually annotated interactions from more than 30 model organisms.

Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining.

The International Molecular Exchange (IMEx) consortium is an international collaboration between major public interaction data providers to share literature-curation efforts and make a nonredundant set of protein interactions available in a single search interface on a common website (http://www.imexconsortium.org/).

There is an increasing interest in developing ontologies and controlled vocabularies to improve the efficiency and consistency of manual literature curation, to enable more formal biocuration workflow results and ultimately to improve analysis of biological data. Two ontologies that have been successfully used for this purpose are the Gene Ontology (GO) for annotating aspects of gene products and the Molecular Interaction ontology (PSI-MI) used by databases that archive protein-protein interactions. The examination of protein interactions has proven to be extremely promising for the understanding of cellular processes.

Background: The BioCreative challenge evaluation is a community-wide effort for evaluating text mining and information extraction systems applied to the biological domain. The biocurator community, as an active user of biomedical literature, provides a diverse and engaged end user group for text mining tools. Earlier BioCreative challenges involved many text mining teams in developing basic capabilities relevant to biological curation, but they did not address the issues of system usage, insertion into the workflow and adoption by curators.

Background: Determining usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations, trying to address aspects including how the end user would oversee the generated output, for instance by providing ranked results, textual evidence for human interpretation or measuring time savings by using automated systems. Detecting articles describing complex biological events like PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts.

Background: The vast amount of data published in the primary biomedical literature represents a challenge for the automated extraction and codification of individual data elements. Biological databases that rely solely on manual extraction by expert curators are unable to comprehensively annotate the information dispersed across the entire biomedical literature. The development of efficient tools based on natural language processing (NLP) systems is essential for the selection of relevant publications, identification of data attributes and partially automated annotation.

Linear motifs are short, evolutionarily plastic components of regulatory proteins and provide low-affinity interaction interfaces. These compact modules play central roles in mediating every aspect of the regulatory functionality of the cell. They are particularly prominent in mediating cell signaling, controlling protein turnover and directing protein localization.

Orchestration of signaling, photoreceptor structural integrity, and maintenance needed for mammalian vision remain enigmatic. By integrating three proteomic data sets, literature mining, computational analyses, and structural information, we have generated a multiscale signal transduction network linked to the visual G protein-coupled receptor (GPCR) rhodopsin, the major protein component of rod outer segments. This network was complemented by domain decomposition of protein-protein interactions and then qualified for mutually exclusive or mutually compatible interactions and ternary complex formation using structural data.
