Deep reinforcement learning for the direct optimization of gradient separations in liquid chromatography.

J Chromatogr A

University of Leuven (KU Leuven), Department of Pharmaceutical and Pharmacological Sciences, Pharmaceutical Analysis, Herestraat 49, 3000 Leuven, Belgium.

Published: April 2024

While reinforcement learning (RL) has already proven successful in performing complex tasks, such as controlling large-scale epidemics, mitigating influenza, and playing computer games beyond expert level, it remains largely unexplored in the separation sciences. This paper therefore introduces RL, specifically proximal policy optimization (PPO), to liquid chromatography, and evaluates whether an agent can be trained to optimize separations directly, using only the outcome of a single generic separation as input and a reward signal based on the resolution between peak pairs (taking a value in the interval [-1, 1]). More specifically, PPO agents were trained to select linear (1-segment) or multi-segment (2-, 3-, or 16-segment) gradients in a single experiment, based on the outcome of an initial, generic linear gradient (from ϕ = 0.3 to ϕ = 1.0 over t = 20 min), to improve separations. The mixtures to be separated contained between 10 and 20 components. Furthermore, two agents selecting 16-segment gradients were trained to perform this optimization using 2 or 3 experiments in sequence, to investigate whether separations could be improved further based on previous outcomes. Results showed that the PPO agent can improve separations given the outcome of one generic scouting run as input, by selecting ϕ-programs tailored to the mixture under consideration. Allowing agents more freedom in selecting multi-segment gradients increased the average reward from 0.891 to 0.908, and allowing the agents to perform an additional experiment increased it further from 0.908 to 0.918. Finally, the agent significantly outperformed both random experiments and standard experiments (from ϕ = 0.0 to ϕ = 1.0 over t = 20 min): random experiments resulted in average rewards between 0.220 and 0.283, and standard experiments in an average reward of 0.840. In conclusion, while there is room for improvement, the results demonstrate the potential of RL in chromatography and point to an interesting future direction for the automated optimization of separations.
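
The abstract does not give the exact reward formula, only that it is derived from the resolution between peak pairs and lies in [-1, 1]. The sketch below is a minimal, hypothetical Python illustration of such a reward: it computes the resolution of each adjacent peak pair from retention times and baseline peak widths, takes the critical (worst-separated) pair, and rescales it to [-1, 1] against an assumed target resolution of 1.5. The function names `separation_reward` and `pair_resolution` and the parameter `rs_target` are illustrative and not taken from the paper.

```python
import numpy as np

def pair_resolution(t1, t2, w1, w2):
    # Classical chromatographic resolution of two adjacent peaks,
    # Rs = 2 * (t2 - t1) / (w1 + w2), using baseline peak widths.
    return 2.0 * (t2 - t1) / (w1 + w2)

def separation_reward(retention_times, peak_widths, rs_target=1.5):
    # Hypothetical reward in [-1, 1] based on the critical (worst-separated)
    # peak pair: -1 for complete co-elution, +1 once every pair reaches the
    # assumed target resolution. The paper only states that the reward is
    # based on peak-pair resolution and lies in [-1, 1]; this particular
    # scaling is an assumption made for illustration.
    order = np.argsort(retention_times)
    t = np.asarray(retention_times, dtype=float)[order]
    w = np.asarray(peak_widths, dtype=float)[order]
    rs = np.array([pair_resolution(t[i], t[i + 1], w[i], w[i + 1])
                   for i in range(len(t) - 1)])
    critical = rs.min()
    return float(np.clip(critical / rs_target, 0.0, 1.0) * 2.0 - 1.0)

# Example: a simulated 10-component chromatogram from a generic scouting run.
rng = np.random.default_rng(0)
times = np.sort(rng.uniform(1.0, 20.0, size=10))   # retention times (min)
widths = rng.uniform(0.2, 0.4, size=10)            # baseline peak widths (min)
print(separation_reward(times, widths))
```

In an RL setup such as the one described, a reward of this kind would be returned to the PPO agent after each selected gradient program, so that gradients resolving the critical pair better receive rewards closer to 1.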


Source
http://dx.doi.org/10.1016/j.chroma.2024.464768

Publication Analysis

Top Keywords
improve separations (12); reinforcement learning (8); liquid chromatography (8); 16-segment gradients (8); ϕ=1.0, t=20min (8); allowing agents (8); increased reward (8); random experiments (8); standard experiments (8); experiments average (8)
