Offline RL for generative design of protein binders

Denis Tarasov¹, Ulrich A. Mbou Sob, Miguel Arbesú, Nima Siboni, Sebastien Boyer, Andries Smit, Oliver Bent, Arnu Pretorius

¹ ETH Zurich



Offline Reinforcement Learning (RL) offers a compelling avenue for solving RL problems without interacting with an environment, which may be expensive or unsafe. While online RL methods have found success in various domains, such as de novo drug generation, they struggle to optimize essential properties like docking scores. The high computational cost of docking makes it impractical as a reward signal for online RL, which typically requires hundreds of thousands of environment interactions to learn. In this study, we propose applying offline RL to sidestep the bottleneck posed by the docking process, leveraging RL's capability to optimize non-differentiable properties. Our preliminary investigation focuses on using offline RL to generate drugs with improved docking scores and chemical properties.
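To make the offline setting concrete, the sketch below shows one simple offline-RL baseline, reward-weighted behavior cloning, on a synthetic stand-in for the problem: token sequences (e.g., SMILES indices) paired with precomputed docking scores. This is an illustrative toy, not the paper's actual method; the dataset, vocabulary size, and per-position categorical policy are all assumptions made for brevity. The key point it demonstrates is that docking is run once, offline, to build the dataset, and never appears inside the training loop.

```python
import numpy as np

# Hypothetical offline dataset: sequences of token indices with a
# precomputed docking score per sequence (higher = better). In practice
# these would come from a single, one-off batch of docking runs.
rng = np.random.default_rng(0)
VOCAB, SEQ_LEN, N = 8, 6, 200
sequences = rng.integers(0, VOCAB, size=(N, SEQ_LEN))
docking_scores = rng.normal(size=N)  # stand-in for expensive docking results

# Exponentiated-advantage weights: high-scoring sequences dominate the
# weighted log-likelihood objective (as in reward-weighted regression).
weights = np.exp(docking_scores - docking_scores.max())
weights /= weights.sum()

# Toy policy: independent per-position categorical logits
# (a stand-in for an autoregressive sequence model).
logits = np.zeros((SEQ_LEN, VOCAB))

for step in range(500):
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    grad = np.zeros_like(logits)
    for seq, w in zip(sequences, weights):
        for t, tok in enumerate(seq):
            # Gradient of the weighted log-likelihood for a categorical policy.
            one_hot = np.zeros(VOCAB)
            one_hot[tok] = 1.0
            grad[t] += w * (one_hot - probs[t])
    logits += 0.5 * grad  # plain gradient ascent; the objective is concave

# The learned policy upweights tokens that appear in high-scoring sequences,
# without ever calling the docking oracle during training.
```

The design choice this illustrates is the one the abstract argues for: because the reward (docking) enters only through the fixed dataset, the number of docking calls is decoupled from the number of gradient steps, unlike in online RL where every new candidate would trigger a fresh docking run.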