Deep reinforcement learning for the direct optimization of gradient separations in liquid chromatography.

J Chromatogr A

University of Leuven (KU Leuven), Department of Pharmaceutical and Pharmacological Sciences, Pharmaceutical Analysis, Herestraat 49, 3000 Leuven, Belgium.

Published: April 2024

While reinforcement learning (RL) has already proven successful in performing complex tasks, such as controlling large-scale epidemics, mitigating influenza, and playing computer games beyond expert level, it remains largely unexplored in the separation sciences. This paper therefore introduces RL, specifically proximal policy optimization (PPO), to liquid chromatography, and evaluates whether an agent can be trained to optimize separations directly, using only the outcome of a single generic separation as input and a reward signal based on the resolution between peak pairs (taking a value in the interval [-1, 1]). More specifically, PPO agents were trained to select linear (1-segment) or multi-segment (2-, 3-, or 16-segment) gradients in a single experiment, based on the outcome of an initial, generic linear gradient (from ϕ = 0.3 to ϕ = 1.0 over t = 20 min), to improve separations. The mixtures to be separated contained between 10 and 20 components. Furthermore, two agents selecting 16-segment gradients were trained to perform this optimization using 2 or 3 experiments in sequence, to investigate whether separations could be improved further based on previous outcomes. Results showed that the PPO agent can improve separations given the outcome of one generic scouting run as input, by selecting ϕ-programs tailored to the mixture under consideration. Allowing agents more freedom in selecting multi-segment gradients increased the average reward from 0.891 to 0.908, and allowing the agents to perform an additional experiment increased it further from 0.908 to 0.918. Finally, the agent significantly outperformed both random experiments and standard experiments (from ϕ = 0.0 to ϕ = 1.0 over t = 20 min): random experiments resulted in average rewards between 0.220 and 0.283, and standard experiments in an average reward of 0.840. In conclusion, while there is room for improvement, the results demonstrate the potential of RL in chromatography and point to an interesting future direction for the automated optimization of separations.
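
The abstract does not give the exact reward formula, only that it is derived from the resolution between peak pairs and lies in [-1, 1]. The sketch below is a minimal, hypothetical Python illustration of such a reward: it computes the resolution of each adjacent peak pair from retention times and baseline peak widths, takes the critical (worst-separated) pair, and rescales it to [-1, 1] against an assumed target resolution of 1.5. The function names `separation_reward` and `pair_resolution` and the parameter `rs_target` are illustrative and not taken from the paper.

```python
import numpy as np

def pair_resolution(t1, t2, w1, w2):
    # Classical chromatographic resolution of two adjacent peaks,
    # Rs = 2 * (t2 - t1) / (w1 + w2), using baseline peak widths.
    return 2.0 * (t2 - t1) / (w1 + w2)

def separation_reward(retention_times, peak_widths, rs_target=1.5):
    # Hypothetical reward in [-1, 1] based on the critical (worst-separated)
    # peak pair: -1 for complete co-elution, +1 once every pair reaches the
    # assumed target resolution. The paper only states that the reward is
    # based on peak-pair resolution and lies in [-1, 1]; this particular
    # scaling is an assumption made for illustration.
    order = np.argsort(retention_times)
    t = np.asarray(retention_times, dtype=float)[order]
    w = np.asarray(peak_widths, dtype=float)[order]
    rs = np.array([pair_resolution(t[i], t[i + 1], w[i], w[i + 1])
                   for i in range(len(t) - 1)])
    critical = rs.min()
    return float(np.clip(critical / rs_target, 0.0, 1.0) * 2.0 - 1.0)

# Example: a simulated 10-component chromatogram from a generic scouting run.
rng = np.random.default_rng(0)
times = np.sort(rng.uniform(1.0, 20.0, size=10))   # retention times (min)
widths = rng.uniform(0.2, 0.4, size=10)            # baseline peak widths (min)
print(separation_reward(times, widths))
```

In an RL setup such as the one described, a reward of this kind would be returned to the PPO agent after each selected gradient program, so that gradients resolving the critical pair better receive rewards closer to 1.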


Source
http://dx.doi.org/10.1016/j.chroma.2024.464768

Publication Analysis

Top Keywords
improve separations (12); reinforcement learning (8); liquid chromatography (8); 16-segment gradients (8); ϕ=1.0, t=20min (8); allowing agents (8); increased reward (8); random experiments (8); standard experiments (8); experiments average (8)
