PubChem synonym filtering process using crowdsourcing.

J Cheminform

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA.

Published: June 2024

PubChem ( https://pubchem.ncbi.nlm.nih.gov ) is a public chemical information resource containing more than 100 million unique chemical structures. One of the most requested tasks in PubChem and other chemical databases is to search chemicals by name (also commonly called a "chemical synonym"). PubChem performs this task by looking up chemical synonym-structure associations provided by individual depositors to PubChem. In addition, these synonyms are used for many purposes, including creating links between chemicals and PubMed articles (using Medical Subject Headings (MeSH) terms). However, these depositor-provided name-structure associations are subject to substantial discrepancies within and between depositors, making it difficult to unambiguously map a chemical name to a specific chemical structure. The present paper describes PubChem's crowdsourcing-based synonym filtering strategy, which resolves inter- and intra-depositor discrepancies in synonym-structure associations as well as in the chemical-MeSH associations. The PubChem synonym filtering process was developed based on the analysis of four crowd-voting strategies, which differ in the consistency threshold value employed (60% vs. 70%) and in how intra-depositor discrepancies are resolved (a single vote vs. multiple votes per depositor) prior to inter-depositor crowd-voting. The agreement of voting was determined at six levels of chemical equivalency, which consider varying isotopic composition, stereochemistry, and connectivity of chemical structures and their primary components. While all four strategies showed comparable results, Strategy I (one vote per depositor with a 60% consistency threshold) resulted in the most synonyms assigned to a single chemical structure as well as the most synonym-structure associations disambiguated across the six chemical equivalency contexts.
Based on the results of this study, Strategy I was implemented in PubChem's filtering process that cleans up synonym-structure associations as well as chemical-MeSH associations. This consistency-based filtering process is designed to look for a consensus in name-structure associations but cannot attest to their correctness. As a result, it can fail to recognize correct name-structure associations (or incorrect ones), for example, when a synonym is provided by only one depositor or when many contributors are incorrect. However, this filtering process is an important starting point for quality control in name-structure associations in large chemical databases like PubChem.
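
The consensus idea behind Strategy I (one vote per depositor, 60% consistency threshold) can be illustrated with a small sketch. This is not PubChem's implementation: the function name `filter_synonyms`, the input format, the tie handling within a depositor, and the use of "greater than or equal to" for the threshold are all assumptions made for illustration, and the sketch ignores the six chemical equivalency levels entirely.

```python
from collections import Counter, defaultdict


def filter_synonyms(associations, threshold=0.60):
    """Return {synonym: structure} for synonyms that reach consensus.

    `associations` is an iterable of (depositor, synonym, structure)
    tuples. All names and rules here are hypothetical illustrations.
    """
    # Count how often each depositor links each synonym to each structure.
    per_depositor = defaultdict(Counter)
    for depositor, synonym, structure in associations:
        per_depositor[(synonym, depositor)][structure] += 1

    # Intra-depositor step: collapse each depositor to a single vote
    # per synonym (assumed rule: the depositor's most frequent structure).
    votes = defaultdict(Counter)
    for (synonym, _depositor), counts in per_depositor.items():
        ranked = counts.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            continue  # assumed rule: a tied depositor casts no vote
        votes[synonym][ranked[0][0]] += 1

    # Inter-depositor step: keep a synonym only if the leading structure
    # holds at least `threshold` of the depositor votes.
    accepted = {}
    for synonym, tally in votes.items():
        structure, top = tally.most_common(1)[0]
        if top / sum(tally.values()) >= threshold:
            accepted[synonym] = structure
    return accepted


# Hypothetical depositors (D1..D4), synonyms, and structure identifiers.
example = [
    ("D1", "name-a", "S1"), ("D2", "name-a", "S1"),
    ("D3", "name-a", "S1"), ("D4", "name-a", "S2"),  # 3 of 4 votes agree
    ("D1", "name-b", "S3"), ("D2", "name-b", "S4"),  # 1 of 2 votes: no consensus
    ("D1", "name-c", "S5"),                          # single depositor
]
print(filter_synonyms(example))
```

In this toy input, `name-c` passes the filter even though only one depositor supplied it, which mirrors the limitation noted above: consensus among depositors is measured, not correctness.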

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11181558
DOI: http://dx.doi.org/10.1186/s13321-024-00868-3

Publication Analysis

Top Keywords

filtering process: 20
synonym-structure associations: 16
name-structure associations: 16
synonym filtering: 12
chemical: 11
associations: 10
pubchem synonym: 8
chemical structures: 8
chemical databases: 8
chemical structure: 8

Similar Publications

EEG mid-frontal theta-band activity (MFT; 4-8 Hz) is of considerable interest as a possible biomarker in translational research on cognitive control. However, because most of the MFT literature has focused on experimental within-subjects effects, the impact of particular data processing choices on individual difference analyses is not well understood. This study aimed to reduce that gap by examining the psychometric properties of different pipelines for measuring individual differences in MFT power and latency.

Denoising is one of the most important processes in digital image processing to recover visual quality and structural integrity in images. Traditional methods often suffer from limitations like computational complexity, over-smoothing, and the inability to preserve critical details, particularly edges. This paper introduces a hybrid denoising algorithm combining Adaptive Median Filter (AMF) and Modified Decision-Based Median Filter (MDBMF) to address these challenges.

Metabolite Fingerprinting by Infrared Matrix-Assisted Laser Desorption Electrospray Ionization Mass Spectrometry.

SLAS Technol

March 2025

Discovery Research, AbbVie Inc., 1 North Waukegan Road, North Chicago, Illinois 60064, United States.

The adoption of mass spectrometry for high-throughput screening in drug discovery has become increasingly prevalent and has enabled label-free screening against diverse targets. Cellular assays for phenotypic screening, however, are primarily conducted by microscopy as there remain many challenges associated with conducting phenotypic screens via ultra-high throughput mass spectrometry. Following a simple on-plate extraction, infrared matrix-assisted laser desorption electrospray ionization (IR-MALDESI) was employed to directly sample the cell lysate at a speed of one sample per second with high mass resolution.

The Same Source of Microbes has a Divergent Assembly Trajectory Along a Hot Spring Flowing Path.

Mol Ecol

March 2025

CAS Key Laboratory for Environmental Biotechnology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences (CAS), Beijing, China.

Hot spring microbial mats represent intricate biofilms that establish self-sustaining ecosystems, hosting diverse microbial communities which facilitate a range of biochemical processes and contribute to the structural and functional complexity of these systems. While community structuring across mat depth has received substantial attention, mechanisms shaping horizontal spatial composition and functional structure of these communities remain understudied. Here, we explored the contributions of species source, local environment and species interaction to microbial community assembly processes in six microbial mat regions following a flow direction with a temperature decreasing from 73.

A motor unit action potential-based method for surface electromyography decomposition.

J Neuroeng Rehabil

March 2025

State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China.

Objective: Surface electromyography (EMG) decomposition is crucial for identifying motor neuron activities by analyzing muscle-generated electrical signals. This study aims to develop and validate a novel motor unit action potential (MUAP)-based method for surface EMG decomposition, addressing the limitations of traditional blind source separation (BSS)-based techniques in computation complexity and motor unit (MU) tracking.

Methods: Within the framework of the convolution kernel compensation algorithm, we developed a MUAP-based decomposition algorithm by reconstructing the MU filters from MUAPs and evaluated its performance using both simulated and experimental datasets.
