Systematic curation of protein and genetic interaction data for computable biology.

BMC Biol

Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA.

Published: April 2013

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3626917
http://dx.doi.org/10.1186/1741-7007-11-43

Similar Publications

Efficient computational screenings are integral to materials discovery in highly sought-after gas adsorption and storage applications, such as CO₂ capture. Preprocessing techniques have been developed to render experimental crystal structures suitable for molecular simulations by mimicking experimental activation protocols, particularly residual solvent removal. Current accounts examining these preprocessed materials databases indicate the presence of assorted structural errors introduced by solvent removal and preprocessing, including improper elimination of charge-balancing ions and ligands.

Background: Traditional rule-based natural language processing approaches in electronic health record systems are effective but are often time-consuming and prone to errors when handling unstructured data. This is primarily due to the substantial manual effort required to parse and extract information from diverse types of documentation. Recent advancements in large language model (LLM) technology have made it possible to automatically interpret medical context and support pathologic staging.

Purpose: Breast cancer relapses are rarely collected by cancer registries because of logistical and financial constraints. Hence, we investigated natural language processing (NLP), enhanced with state-of-the-art deep learning transformer tools and large language models, to automate relapse identification in the text of computed tomography (CT) reports.

Methods: We analyzed follow-up CT reports from patients diagnosed with breast cancer between January 1, 2005, and December 31, 2014.

GRAViTy-V2: a grounded viral taxonomy application.

NAR Genom Bioinform

December 2024

Peter Medawar Building for Pathogen Research, Nuffield Department of Medicine, University of Oxford, 3 South Parks Road, OX1 3SY Oxfordshire, UK.

Taxonomic classification of viruses is essential for understanding their evolution. Genomic classification of viruses at higher taxonomic ranks, such as order or phylum, is typically based on alignment and comparison of amino acid sequence motifs in conserved genes. Classification at lower taxonomic ranks, such as genus or species, is usually based on nucleotide sequence identities between genomic sequences.

From text to insight: large language models for chemical data extraction.

Chem Soc Rev

December 2024

Laboratory of Organic and Macromolecular Chemistry (IOMC), Friedrich Schiller University Jena, Humboldtstrasse 10, 07743 Jena, Germany.

The vast majority of chemical knowledge exists in unstructured natural language, yet structured data is crucial for innovative and systematic materials design. Traditionally, the field has relied on manual curation and partial automation for data extraction for specific use cases. The advent of large language models (LLMs) represents a significant shift, potentially enabling non-experts to extract structured, actionable data from unstructured text efficiently.
