Computer Science > Machine Learning

arXiv:2512.15000 (cs)

[Submitted on 17 Dec 2025]

Title: DreamPRM-Code: Function-as-Step Process Reward Model with Label Correction for LLM Coding

Authors: Ruiyi Zhang, Peijia Qin, Qi Cao, Pengtao Xie

Abstract: Process Reward Models (PRMs) have become essential for improving Large Language Models (LLMs) via test-time scaling, yet their effectiveness in coding remains limited by the lack of meaningful step decompositions in code and by the noise in Monte-Carlo-generated partial labels. We propose DreamPRM-Code, a coding-focused PRM that treats functions as reasoning steps, using a Chain-of-Function prompting strategy to induce modular code generation so that the PRM can be trained and applied much as in mathematical reasoning tasks. To address label noise, DreamPRM-Code introduces a meta-learning-based correction mechanism that leverages clean final-solution unit-test labels and performs bi-level optimization to refine intermediate labels. Applied to test-time scaling, DreamPRM-Code achieves state-of-the-art performance on LiveCodeBench with a pass@1 rate of 80.9, surpassing OpenAI o4-mini.
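
To make the function-as-step idea concrete, the following is a minimal sketch of how modular candidate programs might be split into function-level steps and ranked at test time. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`split_into_function_steps`, `score_candidate`, `select_best`), the prefix-based scoring, and the mean aggregation are all hypothetical, and `step_scorer` stands in for a trained PRM that the abstract does not specify.

```python
import ast
from typing import Callable, List


def split_into_function_steps(source: str) -> List[str]:
    """Return the source text of each top-level function, in order.

    Assumption: each top-level function is one "reasoning step".
    """
    tree = ast.parse(source)
    steps = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            segment = ast.get_source_segment(source, node)
            if segment is not None:
                steps.append(segment)
    return steps


def score_candidate(source: str, step_scorer: Callable[[str], float]) -> float:
    """Score a candidate by averaging step scores over growing prefixes.

    `step_scorer` is a hypothetical stand-in for a PRM: it maps the code
    written so far (all functions up to the current one) to a float.
    Mean aggregation is an assumption made for this sketch.
    """
    steps = split_into_function_steps(source)
    if not steps:
        return 0.0
    prefix = ""
    scores = []
    for step in steps:
        prefix += step + "\n\n"
        scores.append(step_scorer(prefix))
    return sum(scores) / len(scores)


def select_best(candidates: List[str], step_scorer: Callable[[str], float]) -> str:
    """Best-of-N selection: return the candidate with the highest score."""
    return max(candidates, key=lambda c: score_candidate(c, step_scorer))
```

In use, `candidates` would be several programs sampled from an LLM for the same problem and `step_scorer` would wrap a PRM forward pass; any callable from string to float works for experimentation.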

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2512.15000 [cs.LG]
	(or arXiv:2512.15000v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2512.15000

Submission history

From: Ruiyi Zhang
[v1] Wed, 17 Dec 2025 01:11:35 UTC (158 KB)
