<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Optimization Modeling on Ricky Zhengzhong You</title><link>https://zhengzhong-you.github.io/tags/optimization-modeling/</link><description>Recent content in Optimization Modeling on Ricky Zhengzhong You</description><generator>Hugo -- 0.147.2</generator><language>en</language><lastBuildDate>Thu, 30 Apr 2026 12:17:00 -0400</lastBuildDate><atom:link href="https://zhengzhong-you.github.io/tags/optimization-modeling/index.xml" rel="self" type="application/rss+xml"/><item><title>Draft-and-Audit Reinforcement Learning for Optimization Modeling</title><link>https://zhengzhong-you.github.io/papers/draft-and-audit-reinforcement-learning-for-optimization-modeling/</link><pubDate>Thu, 30 Apr 2026 12:17:00 -0400</pubDate><guid>https://zhengzhong-you.github.io/papers/draft-and-audit-reinforcement-learning-for-optimization-modeling/</guid><description>&lt;p>Accepted as a regular paper at &lt;em>ICML 2026&lt;/em>.&lt;/p>
&lt;ul>
&lt;li>OpenReview: &lt;a href="https://openreview.net/forum?id=3rzJANFrMp" target="_blank">forum&lt;/a>&lt;/li>
&lt;li>OpenReview PDF: &lt;a href="https://openreview.net/pdf?id=3rzJANFrMp" target="_blank">paper&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>Optimization modeling from natural language requires translating unstructured text into executable mathematical models. Beyond simple syntax errors, this task suffers from silent modeling failures, where incorrect formulations execute successfully but yield invalid results. We propose Draft-and-Audit RL (DA-RL), a framework that learns optimization modeling as a two-step iterative workflow. Unlike inference-time scaffolds that rely on intermediate solver feedback to guide repairs, DA-RL optimizes a shared-parameter policy using terminal-only verification: the model is rewarded solely based on the execution of the final audited program. This constraint forces the model to internalize rubric-guided revision as a learned capability and encourages the emergence of cross-turn synergy, where the policy learns to generate drafts that are structurally amenable to self-correction.&lt;/p></description></item><item><title>ICML 2026 Regular Paper Acceptance</title><link>https://zhengzhong-you.github.io/news/icml-2026-regular-paper-acceptance/</link><pubDate>Thu, 30 Apr 2026 12:17:00 -0400</pubDate><guid>https://zhengzhong-you.github.io/news/icml-2026-regular-paper-acceptance/</guid><description>Draft-and-Audit Reinforcement Learning for Optimization Modeling was accepted as a regular paper at ICML 2026.</description></item></channel></rss>