This paper demonstrates that machine learning approaches can identify, among the 23,398 genes of the human genome, a small number of genes to test in the laboratory in order to establish new drug mechanisms. As a case study, the paper uses MDA-MB-231 breast cancer single cells treated with the antidiabetic drug metformin. We show that mixture-model-based unsupervised methods, validated by hierarchical clustering, can identify single-cell subpopulations (clusters). These clusters are characterized by a small set of genes (1% of the genome) that are significantly differentially expressed across the clusters and are also highly correlated with pathways with anticancer effects driven by metformin. Within this small set of genes associated with reduced breast cancer incidence, laboratory experiments on one gene, CDC42, showed that its downregulation by metformin inhibited cancer cell migration and proliferation, validating the ability of machine learning approaches to identify biologically relevant candidates for laboratory experiments. Given the large size of the human genome and limitations in cost and skilled resources, the broader impact of this work lies in identifying a small set of differentially expressed genes after drug treatment. This augments the drug-disease knowledge of pharmacogenomics experts in laboratory investigations and could help establish novel biological mechanisms associated with drug response in diseases beyond breast cancer.
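The clustering-with-validation idea described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a single hypothetical gene, synthetic expression values, two subpopulations, a two-component one-dimensional Gaussian mixture fitted by EM, and average-linkage hierarchical clustering as the independent check. All function names, variable names and data are assumptions made for illustration only.

```python
import math
import random

def fit_gmm_1d(x, n_iter=100):
    """Fit a two-component 1D Gaussian mixture by EM and return hard labels."""
    lo, hi = min(x), max(x)
    mu = [lo, hi]                              # start the means at the extremes
    var = [((hi - lo) / 4) ** 2 + 1e-6] * 2    # broad initial variances
    w = [0.5, 0.5]                             # equal initial mixing weights
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each cell
        resp = []
        for xi in x:
            p = [
                w[k] * math.exp(-(xi - mu[k]) ** 2 / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            s = sum(p) or 1e-300               # guard against underflow
            resp.append([pk / s for pk in p])
        # M-step: re-estimate weights, means and variances
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            mu[k] = sum(r[k] * xi for r, xi in zip(resp, x)) / nk
            var[k] = sum(r[k] * (xi - mu[k]) ** 2 for r, xi in zip(resp, x)) / nk + 1e-6
            w[k] = nk / len(x)
    return [0 if r[0] >= r[1] else 1 for r in resp]

def average_linkage_two_clusters(x):
    """Naive agglomerative clustering (average linkage), stopped at 2 clusters."""
    clusters = [[i] for i in range(len(x))]
    while len(clusters) > 2:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(abs(x[i] - x[j]) for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)         # merge the closest pair
    labels = [0] * len(x)
    for i in clusters[1]:
        labels[i] = 1
    return labels

def agreement(labels_a, labels_b):
    """Fraction of cells on which two 2-cluster labelings agree, up to label swap."""
    same = sum(1 for p, q in zip(labels_a, labels_b) if p == q) / len(labels_a)
    return max(same, 1.0 - same)

# Synthetic expression of one hypothetical gene in 60 cells (two subpopulations).
random.seed(0)
expression = [random.gauss(2.0, 0.5) for _ in range(30)]
expression += [random.gauss(10.0, 0.5) for _ in range(30)]

gmm_labels = fit_gmm_1d(expression)
hc_labels = average_linkage_two_clusters(expression)
score = agreement(gmm_labels, hc_labels)
print("agreement between mixture model and hierarchical clustering:", score)
```

A high agreement score suggests that the subpopulations found by the mixture model are not an artifact of that one method. Real single-cell data would involve many genes per cell and a choice of the number of clusters, which this toy example does not address.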
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6148350 | PMC |
http://dx.doi.org/10.1109/TNB.2018.2851997 | DOI Listing |