Motivation: Machine learning-based scoring functions (MLBSFs) have been found to exhibit inconsistent performance on different benchmarks and be prone to learning dataset bias. For the field to develop MLBSFs that learn a generalizable understanding of physics, a more rigorous understanding of how they perform is required.
Results: In this work, we compared, on a range of benchmarks, the performance of a diverse set of popular MLBSFs (RFScore, SIGN, OnionNet-2, Pafnucy, and PointVS) with that of our proposed baseline models, which can only learn dataset biases.
Many studies have prophesied that the integration of machine learning techniques into small-molecule therapeutics development will help to deliver a true leap forward in drug discovery. However, increasingly advanced algorithms and novel architectures have not always yielded substantial improvements in results. In this Perspective, we propose that a greater focus on the data for training and benchmarking these models is more likely to drive future improvement, and explore avenues for future research and strategies to address these data challenges.
The surface proteins of the probiotic Propionibacterium freudenreichii were inventoried by an integrative approach that combines in silico protein localization prediction, surface protein extraction, shaving and fluorescent CyDye labeling. Proteins that were extracted and/or shaved and/or labeled were identified by nano-LC-MS/MS following trypsinolysis. Combining these methods made it possible to confirm the detection of true surface proteins involved in host/probiotic interactions.
Propionibacterium freudenreichii is a beneficial bacterium used in the food industry as a vitamin producer, as a bio-preservative, as a cheese ripening starter and as a probiotic. It is known to adhere to intestinal epithelial cells and mucus and to modulate important functions of the gut mucosa, including cell proliferation and immune response. Adhesion of probiotics and cross-talk with the host rely on the presence of key surface proteins, which remain poorly identified.