Publications by authors named "Ting-Jun Hou"

Enhanced sampling simulations make the computational study of rare events feasible. A large family of such methods crucially depends on the definition of some collective variables (CVs) that could provide a low-dimensional representation of the relevant physics of the process. Recently, many methods have been proposed to semiautomatize the CV design by using machine learning tools to learn the variables directly from the simulation data.

Rare event sampling is a central problem in modern computational chemistry research. Among the existing methods, transition path sampling (TPS) can generate unbiased representations of reaction processes. However, its efficiency depends on the ability to generate reactive trial paths, which in turn depends on the quality of the shooting algorithm used.

Prostate cancer (PCa) is the second most prevalent malignancy among men worldwide. The aberrant activation of androgen receptor (AR) signaling has been recognized as a crucial oncogenic driver of PCa, and AR antagonists are widely used in PCa therapy. To develop novel AR antagonists, a machine-learning MIEC-SVM model was established for virtual screening, and 51 candidates were selected and submitted for bioactivity evaluation.

Identification and validation of bioactive small-molecule targets is a significant challenge in drug discovery. In recent years, various in-silico approaches have been proposed to expedite time- and resource-consuming experiments for target detection. Herein, we developed several chemogenomic models for target prediction based on multi-scale information of chemical structures and protein sequences.

The n-octanol/buffer solution distribution coefficient at pH = 7.4 (log D7.4) is an indicator of lipophilicity, and it influences a wide variety of absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties and druggability of compounds. In log D7.4 prediction, graph neural networks (GNNs) can uncover subtle structure-property relationships (SPRs) by automatically extracting features from molecular graphs that facilitate the learning of SPRs, but their performance is often limited by the small size of available datasets.

Identification of potential targets for known bioactive compounds and novel synthetic analogs is of considerable significance. In silico target fishing (TF) has become an alternative strategy because of the expensive and laborious wet-lab experiments, explosive growth of bioactivity data and rapid development of high-throughput technologies. However, these TF methods are based on different algorithms, molecular representations and training datasets, which may lead to different results when predicting the same query molecules.

Machine learning-based scoring functions (MLSFs) have become a favorable alternative to classical scoring functions because of their potentially superior screening performance. However, information about the negative data used to construct MLSFs is rarely reported in the literature, and the putative inactive molecules recorded in existing databases are usually clearly biased relative to the active molecules. Here we propose an easy-to-use method named AMLSF that combines MLSFs with active learning based on negative molecular selection strategies, which can iteratively improve the quality of inactive sets and thus reduce the false positive rate of virtual screening.

Androgen receptor (AR) antagonists are a major class of medicine for treating the lethal, castration-resistant type of prostate cancer (PCa), but their long-term use commonly leads to antiandrogen resistance. When the AR signaling pathway is blocked by AR-targeted therapy, the glucocorticoid receptor (GR) can compensate for AR function, especially at the late stage of PCa. An AR-GR dual antagonist is expected to be a good solution for this situation.

Accurate prediction of pharmacological properties of small molecules is becoming increasingly important in drug discovery. Traditional feature-engineering approaches heavily rely on handcrafted descriptors and/or fingerprints, which need extensive human expert knowledge. With the rapid progress of artificial intelligence technology, data-driven deep learning methods have shown unparalleled advantages over feature-engineering-based methods.

Article Synopsis
  • The study utilized seven machine learning algorithms and various molecular representations, achieving a balanced accuracy of up to 72.6% and an AUC of 76.8% with the best model, indicating effective classification of hematotoxicity.
  • Advanced techniques like SHAP and matched molecular pair analysis were employed to identify crucial structural features and inform safer drug design processes, highlighting the study's potential as a valuable resource for assessing hematotoxicity in new drugs.
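
For readers unfamiliar with the two metrics quoted in the synopsis, the following is a minimal sketch of how balanced accuracy and AUC are typically computed for a binary classifier. The labels and scores are made-up toy values, not data from the study, and the function names are illustrative only.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on class 1) and specificity (recall on class 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumption): 1 = toxic, 0 = non-toxic, scores are model outputs.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

print(f"balanced accuracy = {balanced_accuracy(y_true, y_pred):.3f}")
print(f"AUC               = {roc_auc(y_true, scores):.3f}")
```

Balanced accuracy is the more informative of the two when toxic and non-toxic compounds are unevenly represented, because each class contributes equally regardless of its size.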

Progressive ischemic stroke (PIS) is characterized by progressive neurological dysfunction after ischemia. Ischemia-evoked neuroinflammation is implicated in the progressive brain injury that follows cerebral ischemia, and Caspase-1, an active component of the inflammasome, exacerbates ischemic brain injury. Current Caspase-1 inhibitors are inadequate in safety and druggability.

Drug-drug interactions (DDIs) often cause serious adverse reactions and thus result in inestimable economic and social loss. Comprehensive DDI evaluation has become a major challenge in pharmaceutical research because experimental assessment is time-consuming and costly, so effective in silico methods that predict and evaluate DDIs accurately and efficiently are needed. In this study, based on a large number of substrates and inhibitors related to five important CYP450 isozymes (CYP1A2, CYP2C9, CYP2C19, CYP2D6 and CYP3A4), a series of high-performance predictive models for metabolic DDIs were constructed using two machine learning methods (random forest and XGBoost) and four types of descriptors (MOE_2D, CATS, ECFP4 and MACCS).

Structural information for chemical compounds is often described by pictorial images in most scientific documents, which cannot be easily understood and manipulated by computers. This dilemma makes optical chemical structure recognition (OCSR) an essential tool for automatically mining knowledge from an enormous amount of literature. However, existing OCSR methods fall far short of our expectations for realistic requirements due to their poor recovery accuracy.

Synthetic glucocorticoids (GCs) have been widely used in the treatment of a broad range of inflammatory diseases, but their clinical use is limited by undesired side effects such as metabolic disorders, osteoporosis, skin and muscle atrophy, mood disorders and hypothalamic-pituitary-adrenal (HPA) axis suppression. Selective glucocorticoid receptor modulators (SGRMs) are expected to offer promising anti-inflammatory efficacy with fewer of the side effects caused by GCs. Here, we report HT-15, a prospective SGRM discovered by structure-based virtual screening (VS) and bioassays.

In drug discovery, the optimization of lead compounds has always been a challenge for pharmaceutical chemists. Matched molecular pair analysis (MMPA), a promising tool for efficiently extracting and summarizing the relationship between structural transformation and property change, is well suited to local structural optimization tasks. In particular, the integration of MMPA with QSAR modeling can further strengthen the utility of MMPA in guiding molecular optimization.
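
The core idea of MMPA, relating a structural transformation to a property change, can be illustrated with a small sketch. It assumes each molecule has already been fragmented into a shared core and a single variable substituent; the records, core labels, and property values below are hypothetical, and this is not the workflow used in the paper.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

# Hypothetical (core, substituent, property value) records.
records = [
    ("core1", "H", 2.0), ("core1", "F", 2.3), ("core1", "OH", 1.2),
    ("core2", "H", 3.1), ("core2", "F", 3.3),
    ("core3", "H", 1.5), ("core3", "OH", 0.9),
]


def transformation_stats(records):
    """Return {transformation: (pair count, mean property change)}.

    Two molecules form a matched pair when they share a core and differ
    only in the substituent. Pairs are oriented alphabetically by
    substituent so that each transformation has one canonical direction.
    """
    by_core = defaultdict(list)
    for core, substituent, value in records:
        by_core[core].append((substituent, value))

    deltas = defaultdict(list)
    for members in by_core.values():
        for (sub_a, val_a), (sub_b, val_b) in combinations(sorted(members), 2):
            deltas[f"{sub_a}>>{sub_b}"].append(val_b - val_a)

    return {t: (len(d), mean(d)) for t, d in deltas.items()}


stats = transformation_stats(records)
for transformation, (count, avg_delta) in sorted(stats.items()):
    print(f"{transformation}: n={count}, mean change={avg_delta:+.2f}")
```

A transformation supported by many pairs with a consistent mean change is the kind of local rule a chemist could apply during lead optimization; combining such rules with a QSAR model, as the abstract describes, would be a further step not shown here.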

Article Synopsis
  • DprE1 is a key enzyme in the cell wall biosynthesis of Mycobacterium, making it a target for new tuberculosis (TB) treatments.
  • The study used advanced molecular modeling techniques to identify two promising compounds, B2 and H3, that can inhibit DprE1 and kill Mycobacterium smegmatis in the lab.
  • Notably, compound H3 was found to effectively inhibit Mycobacterium tuberculosis with minimal harm to mouse cells, highlighting its potential as a new anti-TB drug.

Macrophage migration inhibitory factor (MIF) is a pluripotent pro-inflammatory cytokine and is related to acute and chronic inflammatory responses, immune disorders, tumors, and other diseases. In this study, an integrated virtual screening strategy and bioassays were used to search for potent MIF inhibitors. Twelve compounds with better bioactivity than the prototypical MIF-inhibitor ISO-1 (IC50 = 14.

Computational methods have become indispensable tools to accelerate the drug discovery process and alleviate the excessive dependence on time-consuming and labor-intensive experiments. Traditional feature-engineering approaches heavily rely on expert knowledge to devise useful features, which could be costly and sometimes biased. The emerging deep learning (DL) methods deliver a data-driven method to automatically learn expressive representations from complex raw data.

As one of the central tasks of modern medicinal chemistry, scaffold hopping is expected to lead to the discovery of structurally novel, biologically active compounds and to broaden the chemical space of known active compounds. Here, we report the computational bioactivity fingerprint (CBFP) for easier scaffold hopping, in which the activities predicted by multiple quantitative structure-activity relationship models are integrated to characterize the biological space of a molecule. In retrospective benchmarks, the CBFP representation shows outstanding scaffold hopping potential relative to other chemical descriptors.
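
The idea of describing a molecule by its predicted activities rather than by its structure can be sketched as follows. Everything here is an assumption for illustration: the three "models" are toy logistic functions standing in for trained QSAR models, the descriptor names and values are invented, and cosine similarity is just one possible way to compare such fingerprints. It is not the CBFP implementation.

```python
import math


def logistic(z):
    """Squash a raw score into a pseudo-probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical stand-ins for trained QSAR models: each maps a descriptor
# dictionary to a predicted probability of activity against one target.
toy_models = [
    lambda d: logistic(0.8 * d["logp"] - 2.0),
    lambda d: logistic(0.02 * d["mw"] - 7.0),
    lambda d: logistic(1.5 - 0.5 * d["hbd"]),
]


def bioactivity_fingerprint(descriptors, models):
    """One predicted activity per model, collected into a vector."""
    return [model(descriptors) for model in models]


def cosine_similarity(u, v):
    """Cosine of the angle between two fingerprint vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Two invented molecules with different descriptor profiles.
mol_a = {"logp": 3.1, "mw": 350.0, "hbd": 1}
mol_b = {"logp": 2.8, "mw": 410.0, "hbd": 2}

fp_a = bioactivity_fingerprint(mol_a, toy_models)
fp_b = bioactivity_fingerprint(mol_b, toy_models)
print(f"similarity in predicted-activity space: {cosine_similarity(fp_a, fp_b):.3f}")
```

Because the comparison happens in predicted-activity space, two molecules with different scaffolds can still score as similar, which is the property that makes this kind of representation attractive for scaffold hopping.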

Motivation: Accurate and efficient prediction of molecular properties is one of the fundamental issues in drug design and discovery pipelines. Traditional feature engineering-based approaches require extensive expertise in the feature design and selection process. With the development of artificial intelligence (AI) technologies, data-driven methods exhibit unparalleled advantages over the feature engineering-based methods in various domains.

Androgen receptor (AR), a ligand-activated transcription factor, is a master regulator in the development and progression of prostate cancer (PCa). A major challenge for the clinically used AR antagonists is the rapid emergence of resistance induced by mutations in the AR ligand binding domain (LBD), and therefore novel anti-AR therapeutics that can combat mutation-induced resistance are in high demand. In this context, blocking the interaction between AR and DNA represents an innovative strategy.

Background: Substructure screening is widely applied to evaluate the molecular potency and ADMET properties of compounds in drug discovery pipelines, and it can also be used to interpret QSAR models for the design of new compounds with desirable physicochemical and biological properties. With the continuous accumulation of more experimental data, data-driven computational systems which can derive representative substructures from large chemical libraries attract more attention. Therefore, the development of an integrated and convenient tool to generate and implement representative substructures is urgently needed.

In 2010, the pan-assay interference compounds (PAINS) rule was proposed to identify false-positive compounds, especially frequent hitters (FHs), in biological screening campaigns, and has rapidly become an essential component in drug design. However, the specific mechanisms remain unknown, and the result validation and follow-up processing schemes are still unclear. In this review, a large benchmark collection of >600,000 compounds sourced from databases and the literature, including six common false-positive mechanisms, was used to evaluate the detection ability of PAINS.

Matched molecular pairs analysis (MMPA) has become a powerful tool for automatically and systematically identifying medicinal chemistry transformations from compound/property datasets. However, accurate determination of matched molecular pair (MMP) transformations largely depends on the size and quality of existing experimental data. The lack of high-quality experimental data heavily hampers the extraction of more effective medicinal chemistry knowledge.

Background: Fluorescent detection methods are indispensable tools for chemical biology. However, the frequent appearance of potentially fluorescent compounds has greatly interfered with the recognition of compounds with genuine activity. Such fluorescence interference is especially difficult to identify because it is reproducible and concentration-dependent.
