A growing body of work is devoted to the extraction of protein or gene interaction information from the scientific literature. Yet, the basis for most extraction algorithms, i.e. the specific and sensitive recognition of protein and gene names and their numerous synonyms, has not been adequately addressed. Here we describe the construction of a comprehensive general purpose name dictionary and an accompanying automatic curation procedure based on a simple token model of protein names. We designed an efficient search algorithm to analyze all abstracts in MEDLINE in a reasonable amount of time on standard computers. The parameters of our method are optimized using machine learning techniques. Used in conjunction, these ingredients lead to good search performance. A supplementary web page is available at http://cartan.gmd.de/ProMiner/.
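The dictionary-plus-token-model idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenization rule, the first-token index, the longest-match strategy, and all function names and dictionary entries below are assumptions chosen for illustration only.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into runs of letters or digits.

    Assumed rule: "IL-2", "IL2" and "il 2" all become ["il", "2"], so
    spelling variants of a name map to the same token sequence.
    """
    return re.findall(r"[a-z]+|\d+", text.lower())


def build_index(dictionary):
    """Index every synonym by its first token for fast candidate lookup.

    `dictionary` maps a (hypothetical) entry id to a list of synonyms.
    """
    index = defaultdict(list)
    for entry_id, synonyms in dictionary.items():
        for name in synonyms:
            tokens = tuple(tokenize(name))
            if tokens:
                index[tokens[0]].append((tokens, entry_id))
    # Try longer names first so "IL-2 receptor" wins over "IL-2".
    for candidates in index.values():
        candidates.sort(key=lambda c: len(c[0]), reverse=True)
    return index


def find_names(text, index):
    """Scan the text once and return (entry id, matched tokens) pairs."""
    tokens = tokenize(text)
    hits = []
    i = 0
    while i < len(tokens):
        for name_tokens, entry_id in index.get(tokens[i], ()):
            if tuple(tokens[i:i + len(name_tokens)]) == name_tokens:
                hits.append((entry_id, " ".join(name_tokens)))
                i += len(name_tokens)  # skip past the matched name
                break
        else:
            i += 1  # no dictionary name starts at this token
    return hits


# Illustrative entries only; the ids and synonyms are made up.
example_dictionary = {
    "P1": ["IL-2", "interleukin 2"],
    "P2": ["IL-2 receptor"],
}
example_index = build_index(example_dictionary)
example_hits = find_names("The IL2 receptor binds interleukin-2.", example_index)
```

Indexing by first token keeps the scan close to linear in the length of the text, which is one plausible way to make a search over a large abstract collection tractable; the paper's own search algorithm and learned parameters are not reproduced here.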
Acc Chem Res
January 2025
The Department of Chemistry, State University of New York at Binghamton, Binghamton, New York 13902, United States.
Conspectus: In the search for efficient and selective electrocatalysts capable of converting greenhouse gases to value-added products, enzymes found in naturally existing bacteria provide the basis for most approaches toward electrocatalyst design. Ni,Fe-carbon monoxide dehydrogenase (Ni,Fe-CODH) is one such enzyme, with a nickel-iron-sulfur cluster named the C-cluster, where CO2 binds and is converted to CO at high rates near the thermodynamic potential. In this Account, we divide the enzyme's catalytic contributions into three categories based on location and function.
J Hered
January 2025
The State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences; Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies; Institute of Ecology, Peking University, Beijing 100871, China.
In the fall of 2003, a two-year-old tiger named Ming, weighing some four hundred pounds, was discovered living in an apartment in Harlem, New York. Ming's rescue by NYPD was witnessed, recalled, and venerated by scores of neighbors. The tiger's history and ancestry stimulated considerable media interest, investigative sleuthing, and forensic genomic analyses.
20-carbon fatty acid-derived eicosanoids are versatile signaling oxylipins in mammals. In particular, a group of eicosanoids termed prostanoids are involved in multiple physiological processes, such as reproduction and immune responses. Although some eicosanoids such as prostaglandin E2 (PGE2) have been detected in some insect species, molecular mechanisms of eicosanoid synthesis and signal transduction in insects have been poorly investigated.
Curr Res Struct Biol
June 2025
The College of Health Humanities, Jinzhou Medical University, Jinzhou, 121001, China.
A change in the three-dimensional (3D) structure of a protein can affect its own function or its interaction with other protein(s), which may lead to disease(s). Gene mutations, especially missense mutations, are the main cause of changes in protein structure. Due to the lack of protein crystal structure data, about three-quarters of human mutant proteins cannot be predicted, or cannot be predicted accurately, and the pathogenicity of missense mutations can only be evaluated indirectly by evolutionary conservation.
Comput Struct Biotechnol J
December 2024
School of Bioengineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong 250300, China.
Protein circular permutations are crucial for understanding protein evolution and functionality. Traditional detection methods face challenges: sequence-based approaches struggle with detecting distant homologs, while structure-based approaches are limited by the need for structure generation and often treat proteins as rigid bodies. Protein Language Model-based alignment tools have shown advantages in utilizing sequence information to overcome the challenges of detecting distant homologs without requiring structural input.