Draft-and-Audit Reinforcement Learning for Optimization Modeling

Natural-language-to-optimization modeling requires translating unstructured text into executable mathematical models. Beyond simple syntax errors, the task suffers from silent modeling failures: incorrect formulations that execute successfully but yield invalid results. We propose Draft-and-Audit RL (DA-RL), a framework that learns optimization modeling as a two-step iterative workflow.

April 2026 · Zeping Min, Weihang Xu, Ricky Zhengzhong You, Wotao Yin, Xinshang Wang
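
To make the notion of a silent modeling failure concrete, the following is a minimal sketch and is not taken from the paper. It assumes a hypothetical word problem (products A and B earn 3 and 5 per unit, A takes 1 hour and B takes 2 hours, with 4 hours available) and a hypothetical brute-force solver `solve`. The faulty formulation misreads B's time as 1 hour; it runs without error but returns a plan that violates the real constraint. The `is_feasible` check is an illustrative stand-in for an independent verification step, not the DA-RL audit itself.

```python
from itertools import product


def solve(hours_per_b):
    """Brute-force a tiny integer program (hypothetical example).

    Maximize 3*x + 5*y subject to x + hours_per_b*y <= 4,
    with x and y integers in 0..4. Returns (profit, x, y).
    """
    best = None
    for x, y in product(range(5), repeat=2):
        if x + hours_per_b * y <= 4:
            profit = 3 * x + 5 * y
            if best is None or profit > best[0]:
                best = (profit, x, y)
    return best


def is_feasible(x, y):
    """Check a plan against the constraint stated in the word problem."""
    return x + 2 * y <= 4


# Correct formulation: B takes 2 hours per unit.
correct = solve(hours_per_b=2)

# Faulty formulation: B's time misread as 1 hour. No exception is raised,
# so nothing signals that the model is wrong.
faulty = solve(hours_per_b=1)

for name, (profit, x, y) in [("correct", correct), ("faulty", faulty)]:
    print(name, "profit:", profit, "plan:", (x, y), "feasible:", is_feasible(x, y))
```

Both calls complete normally, so the error only surfaces when the returned plan is checked against the original problem statement rather than against the model that produced it.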