Publications by Hrant Khachatrian

Publications by authors named "Hrant Khachatrian"

Page 1 of 1

BartSmiles: Generative Masked Language Models for Molecular Representations.

Gayane Chilingaryan Hovhannes Tamoyan Ani Tevosyan Nelly Babayan Karen Hambardzumyan Hrant Khachatrian

J Chem Inf Model

August 2024

We discover a robust self-supervised strategy tailored toward molecular representations for generative masked language models through a series of tailored, in-depth ablations. Using this pretraining strategy, we train BARTSmiles, a BART-like model with an order of magnitude more compute than previous self-supervised molecular representations. In-depth evaluations show that BARTSmiles consistently outperforms other self-supervised representations across classification, regression, and generation tasks, setting a new state-of-the-art on eight tasks.

View Article and Find Full Text PDF

Datasets Construction and Development of QSAR Models for Predicting Micronucleus In Vitro and In Vivo Assay Outcomes.

Lusine Khondkaryan Ani Tevosyan Hayk Navasardyan Hrant Khachatrian Gohar Tadevosyan

Toxics

September 2023

In silico (quantitative) structure-activity relationship modeling is an approach that provides a fast and cost-effective alternative to assess the genotoxic potential of chemicals. However, one of the limiting factors for model development is the availability of consolidated experimental datasets. In the present study, we collected experimental data on micronuclei in vitro and in vivo, utilizing databases and conducting a PubMed search, aided by text mining using the BioBERT large language model.

View Article and Find Full Text PDF

Improving VAE based molecular representations for compound property prediction.

Ani Tevosyan Lusine Khondkaryan Hrant Khachatrian Gohar Tadevosyan Lilit Apresyan

J Cheminform

October 2022

Collecting labeled data for many important tasks in chemoinformatics is time consuming and requires expensive experiments. In recent years, machine learning has been used to learn rich representations of molecules using large scale unlabeled molecular datasets and transfer the knowledge to solve the more challenging tasks with limited datasets. Variational autoencoders are one of the tools that have been proposed to perform the transfer for both chemical property prediction and molecular generation tasks.

View Article and Find Full Text PDF

Multitask learning and benchmarking with clinical time series data.

Hrayr Harutyunyan Hrant Khachatrian David C Kale Greg Ver Steeg Aram Galstyan

Sci Data

June 2019

Health care is one of the most exciting frontiers in data mining and machine learning. Successful adoption of electronic health records (EHRs) created an explosion in digital clinical data available for analysis, but progress in machine learning for healthcare research has been difficult to measure because of the absence of publicly available benchmark data sets. To address this problem, we propose four clinical prediction benchmarks using data derived from the publicly available Medical Information Mart for Intensive Care (MIMIC-III) database.

View Article and Find Full Text PDF