Automated Program Refinement: Guide and Verify Code Large Language Model with Refinement Calculus
File version
Version of Record (VoR)
Author(s)
Hou, Zhe
Sanan, David
Luan, Xiaokun
Lin, Yun
Sun, Jun
Dong, Jin Song
Griffith University Author(s)
Primary Supervisor
Other Supervisors
Editor(s)
Date
Size
File type(s)
Location
Abstract
Recently, the rise of code-centric Large Language Models (LLMs) has reshaped the software engineering world with low-barrier tools like Copilot that can easily generate code. However, there is no correctness guarantee for the code generated by LLMs, which suffer from the hallucination problem, and their output is fraught with risks. Besides, the end-to-end process from specification to code through LLMs is a non-transparent and uncontrolled black box. This opacity makes it difficult for users to understand and trust the generated code. Addressing these challenges is both necessary and critical. In contrast, program refinement transforms high-level specification statements into executable code while preserving correctness. Traditional tools for program refinement are primarily designed for formal methods experts and lack automation and extensibility. We apply program refinement to guide LLM and validate the LLM-generated code while transforming refinement into a more accessible and flexible framework. To initiate this vision, we propose Refine4LLM, an approach that aims to: (1) Formally refine the specifications, (2) Automatically prompt and guide the LLM using refinement calculus, (3) Interact with the LLM to generate the code, (4) Verify that the generated code satisfies the constraints, thus guaranteeing its correctness, (5) Learn and build more advanced refinement laws to extend the refinement calculus. We evaluated Refine4LLM against the state-of-the-art baselines on program refinement and LLMs benchmarks.The experiment results show that Refine4LLM can efficiently generate more robust code and reduce the time for refinement and verification.
To initiate this vision, we propose Refine4LLM, an approach that aims to:
(1) Formally refine the specifications,
(2) Automatically prompt and guide the LLM using refinement calculus,
(3) Interact with the LLM to generate the code,
(4) Verify that the generated code satisfies the constraints, thus guaranteeing its correctness,
(5) Learn and build more advanced refinement laws to extend the refinement calculus.
We evaluated Refine4LLM against the state-of-the-art baselines on program refinement and LLMs benchmarks.The experiment results show that Refine4LLM can efficiently generate more robust code and reduce the time for refinement and verification.</jats:p>
Journal Title
Proceedings of the ACM on Programming Languages
Conference Title
Book Title
Edition
Volume
9
Issue
POPL
Thesis Type
Degree Program
School
Publisher link
DOI
Patent number
Funder(s)
Grant identifier(s)
Rights Statement
Rights Statement
© 2025 Owner/Author. This work is licensed under a Creative Commons Attribution International 4.0 License.
Item Access Status
Note
Access the data
Related item(s)
Subject
Software engineering
Theory of computation
Numerical and computational mathematics
Persistent link to this record
Citation
Cai, Y; Hou, Z; Sanan, D; Luan, X; Lin, Y; Sun, J; Dong, JS, Automated Program Refinement: Guide and Verify Code Large Language Model with Refinement Calculus, Proceedings of the ACM on Programming Languages, 2025, 9 (POPL), pp. 69