Identifying protein complexes within a protein-protein interaction (PPI) network is a crucial task in computational biology that facilitates a better understanding of the cellular mechanisms observed in various organisms. Datasets of predicted PPIs have been generated using high-throughput experimental technology. However, these datasets typically contain many spurious interactions, so it is essential to validate the observed interactions before using them to predict protein complexes. This paper describes the identification of missing interactome links in a PPI network as a way of improving the detection of protein complexes. The missing links are identified by extracting several topological features, which are then used with a two-class boosted decision-tree classifier to build a machine-learning model capable of distinguishing between existing and non-existing interactome links. The model was trained on a PPI network of 1,622 proteins and 9,074 interactions and tested on another PPI network of 1,430 proteins and 6,531 interactions. All 6,531 interactions were identified, with a precision of 0.994 and a recall of 1. The model also detected 37 novel interactions, which were then validated against the STRING database of known and predicted PPIs. Including these 37 novel interactions improved the detection of protein complexes using ClusterONE.
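The abstract does not name the topological features it uses, so the following is only a minimal sketch of how such features might be computed for candidate protein pairs. The four features shown (common neighbors, Jaccard coefficient, Adamic-Adar index, preferential attachment) are common link-prediction choices assumed here for illustration, and the toy network and function names are hypothetical. In a full pipeline, vectors like these would be passed to a two-class boosted decision-tree classifier, which is not shown.

```python
import math
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {protein: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pair_features(adj, u, v):
    """Topological features for the pair (u, v); the feature set is an assumption."""
    nu = adj[u] - {v}  # neighbors of u, excluding v itself
    nv = adj[v] - {u}  # neighbors of v, excluding u itself
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # A shared neighbor has degree >= 2, so log(degree) is positive.
        "adamic_adar": sum(
            1.0 / math.log(len(adj[w])) for w in common if len(adj[w]) > 1
        ),
        "preferential_attachment": len(nu) * len(nv),
    }


def candidate_pairs(adj):
    """All unconnected protein pairs, i.e. candidate missing links."""
    return [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]


# Hypothetical toy PPI network, not data from the paper.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
adj = build_adjacency(edges)
features = {pair: pair_features(adj, *pair) for pair in candidate_pairs(adj)}

for pair, feats in features.items():
    print(pair, feats)
```

Pairs that share many neighbors, such as A and D in this toy network, score higher on every feature than pairs with no shared neighbors, which is the kind of signal a classifier could use to separate likely missing links from unlikely ones.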
DOI: http://dx.doi.org/10.1109/EMBC.2018.8513476