As practitioners of machine learning in bioinformatics, we know that the quality of the results depends crucially on the quality of our labeled data. While there is a tendency to focus on the quality of positive examples, the negative examples are equally important. In this opinion paper we revisit the problem of choosing negative examples for the task of predicting protein-protein interactions, either among proteins of a given species or for host-pathogen interactions, and describe important issues that are prevalent in the current literature. The challenge in creating datasets for this task is the noisy nature of the experimentally derived interactions and the lack of information on non-interacting proteins. A standard approach is to choose random pairs of proteins that are not known to interact as negative examples. Since the interactomes of all species are only partially known, some of these pairs may in fact interact, but they make up only a very small percentage of the negative examples. This is especially true for host-pathogen interactions. To address this perceived issue, some researchers have chosen to select negative examples as pairs of proteins whose sequence similarity to the positive examples is sufficiently low. This clearly reduces the chance of false negatives, but it also makes the problem much easier than it really is, leading to over-optimistic accuracy estimates. We demonstrate the effect of this form of bias using a selection of recent protein interaction prediction methods of varying complexity, and urge researchers to pay attention to the details of generating their datasets and to check for potential biases like this.
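
The random-pair approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `sample_random_negatives`, its parameters, and the protein identifiers are hypothetical, and it assumes the within-species setting, where any unordered pair of distinct proteins absent from the positive set is a candidate negative.

```python
import random


def sample_random_negatives(proteins, positives, n_negatives, seed=0):
    """Draw unordered protein pairs uniformly at random, skipping known positives.

    Hypothetical helper for illustration only; no similarity filter is applied.
    """
    rng = random.Random(seed)
    pool = sorted(set(proteins))
    pool_set = set(pool)
    # Store interactions as unordered pairs so (A, B) and (B, A) match.
    known = {frozenset(pair) for pair in positives}

    # Make sure enough candidate pairs exist, so the loop below terminates.
    total_pairs = len(pool) * (len(pool) - 1) // 2
    known_in_pool = sum(1 for k in known if len(k) == 2 and k <= pool_set)
    if n_negatives > total_pairs - known_in_pool:
        raise ValueError("not enough candidate pairs for the requested size")

    negatives = set()
    while len(negatives) < n_negatives:
        a, b = rng.sample(pool, 2)  # two distinct proteins
        pair = frozenset((a, b))
        if pair not in known:       # reject pairs known to interact
            negatives.add(pair)
    return sorted(tuple(sorted(pair)) for pair in negatives)


proteins = ["P1", "P2", "P3", "P4"]
positives = [("P1", "P2"), ("P3", "P4")]
negatives = sample_random_negatives(proteins, positives, n_negatives=4)
```

The similarity-based selection that the abstract cautions against would amount to an extra rejection condition inside the loop, which is where the bias would enter. For host-pathogen data, the candidate pairs would presumably be drawn with one protein from each proteome rather than from a single pool.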


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9798088
DOI: http://dx.doi.org/10.3389/fbinf.2022.1083292

