Publications by Farrokh Mehryary

The STRING database in 2025: protein networks with directionality of regulation.

Damian Szklarczyk Katerina Nastou Mikaela Koutrouli Rebecca Kirsch Farrokh Mehryary

Nucleic Acids Res

November 2024

Article Synopsis

Proteins interact in complex ways to perform vital cellular functions, making it crucial to understand these interactions for a comprehensive view of cellular processes.
The STRING database aggregates and scores data on protein-protein associations from various sources, aiming to provide a global network of physical and functional interactions, with tools for network clustering and pathway enrichment.
The latest update, STRING 12.5, introduces a regulatory network feature that analyzes interaction types and directions, along with enhanced pathway enrichment detection and improved visualizations, making the platform more user-friendly for diverse research applications.

STRING-ing together protein complexes: corpus and methods for extracting physical protein interactions from the biomedical literature.

Farrokh Mehryary Katerina Nastou Tomoko Ohta Lars Juhl Jensen Sampo Pyysalo

Bioinformatics

September 2024

Motivation: Understanding biological processes relies heavily on curated knowledge of physical interactions between proteins. Yet, a notable gap remains between the information stored in databases of curated knowledge and the plethora of interactions documented in the scientific literature.

Results: To bridge this gap, we introduce ComplexTome, a manually annotated corpus designed to facilitate the development of text-mining methods for the extraction of complex formation relationships among biomedical entities targeting the downstream semantics of the physical interaction subnetwork of the STRING database.

RegulaTome: a corpus of typed, directed, and signed relations between biomedical entities in the scientific literature.

Katerina Nastou Farrokh Mehryary Tomoko Ohta Jouni Luoma Sampo Pyysalo

Database (Oxford)

September 2024

In the field of biomedical text mining, the ability to extract relations from the literature is crucial for advancing both theoretical research and practical applications. There is a notable shortage of corpora designed to enhance the extraction of multiple types of relations, particularly focusing on proteins and protein-containing entities such as complexes and families, as well as chemicals. In this work, we present RegulaTome, a corpus that overcomes the limitations of several existing biomedical relation extraction (RE) corpora, many of which concentrate on single-type relations at the sentence level.

Overview of DrugProt task at BioCreative VII: data and methods for large-scale text mining and knowledge graph generation of heterogenous chemical-protein relations.

Antonio Miranda-Escalada Farrokh Mehryary Jouni Luoma Darryl Estrada-Zavala Luis Gasco

Database (Oxford)

November 2023

It is getting increasingly challenging to efficiently exploit drug-related information described in the growing amount of scientific literature. Indeed, for drug-gene/protein interactions, the challenge is even bigger, considering the scattered information sources and types of interactions. However, their systematic, large-scale exploitation is key for developing tools, impacting knowledge fields as diverse as drug design or metabolic pathway research.

The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest.

Damian Szklarczyk Rebecca Kirsch Mikaela Koutrouli Katerina Nastou Farrokh Mehryary

Nucleic Acids Res

January 2023

Much of the complexity within cells arises from functional and regulatory interactions among proteins. The core of these interactions is increasingly known, but novel interactions continue to be discovered, and the information remains scattered across different database resources, experimental modalities and levels of mechanistic detail. The STRING database (https://string-db.

Neural Network and Random Forest Models in Protein Function Prediction.

Kai Hakala Suwisa Kaewphan Jari Björne Farrokh Mehryary Hans Moen

IEEE/ACM Trans Comput Biol Bioinform

June 2022

Over the past decade, the demand for automated protein function prediction has increased due to the volume of newly sequenced proteins. In this paper, we address the function prediction task by developing an ensemble system automatically assigning Gene Ontology (GO) terms to the given input protein sequence. We develop an ensemble system which combines the GO predictions made by random forest (RF) and neural network (NN) classifiers.

The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens.

Naihui Zhou Yuxiang Jiang Timothy R Bergquist Alexandra J Lee Balint Z Kacsoh Farrokh Mehryary

Genome Biol

November 2019

Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function.

Results: Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a novel and major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes.

Potent pairing: ensemble of long short-term memory networks and support vector machine for chemical-protein relation extraction.

Farrokh Mehryary Jari Björne Tapio Salakoski Filip Ginter

Database (Oxford)

January 2018

Biomedical researchers regularly discover new interactions between chemical compounds/drugs and genes/proteins, and report them in research literature. Having knowledge about these interactions is crucially important in many research areas such as precision medicine and drug discovery. The BioCreative VI Task 5 (CHEMPROT) challenge promotes the development and evaluation of computer systems that can automatically recognize and extract statements of such interactions from biomedical literature.

Data and systems for medication-related text classification and concept normalization from Twitter: insights from the Social Media Mining for Health (SMM4H)-2017 shared task.

Abeed Sarker Maksim Belousov Jasper Friedrichs Kai Hakala Svetlana Kiritchenko Farrokh Mehryary

J Am Med Inform Assoc

October 2018

Objective: We executed the Social Media Mining for Health (SMM4H) 2017 shared tasks to enable the community-driven development and large-scale evaluation of automatic text processing methods for the classification and normalization of health-related text from social media. An additional objective was to publicly release manually annotated data.

Materials And Methods: We organized 3 independent subtasks: automatic classification of self-reports of 1) adverse drug reactions (ADRs) and 2) medication consumption, from medication-mentioning tweets, and 3) normalization of ADR expressions.

An expanded evaluation of protein function prediction methods shows an improvement in accuracy.

Yuxiang Jiang Tal Ronnen Oron Wyatt T Clark Asma R Bankapur Daniel D'Andrea Farrokh Mehryary

Genome Biol

September 2016

Background: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging.

Filtering large-scale event collections using a combination of supervised and unsupervised learning for event trigger classification.

Farrokh Mehryary Suwisa Kaewphan Kai Hakala Filip Ginter

J Biomed Semantics

November 2017

Background: Biomedical event extraction is one of the key tasks in biomedical text mining, supporting various applications such as database curation and hypothesis generation. Several systems, some of which have been applied at a large scale, have been introduced to solve this task. Past studies have shown that the identification of the phrases describing biological processes, also known as trigger detection, is a crucial part of event extraction, and notable overall performance gains can be obtained by solely focusing on this sub-task.
