Computer Science > Artificial Intelligence

arXiv:2509.07414 (cs)

[Submitted on 9 Sep 2025 (v1), last revised 19 Dec 2025 (this version, v3)]

Title: Language Self-Play For Data-Free Training

Authors: Jakub Grudzien Kuba, Mengting Gu, Qi Ma, Yuandong Tian, Vijai Mohan, Jason Chen

Abstract: Large language models (LLMs) have advanced rapidly in recent years, driven by scale, abundant high-quality training data, and reinforcement learning. Yet this progress faces a fundamental bottleneck: the need for ever more data from which models can continue to learn. In this work, we propose a reinforcement learning approach that removes this dependency by enabling models to improve without additional data. Our method leverages a game-theoretic framework of self-play, where a model's capabilities are cast as performance in a competitive game and stronger policies emerge by having the model play against itself, a process we call Language Self-Play (LSP). Experiments with Llama-3.2-3B-Instruct on instruction-following, mathematics, and coding benchmarks show that pretrained models can be effectively improved with self-play alone.

Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Science and Game Theory (cs.GT)
Cite as:	arXiv:2509.07414 [cs.AI]
	(or arXiv:2509.07414v3 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2509.07414

Submission history

From: Jakub Grudzien Kuba
[v1] Tue, 9 Sep 2025 05:51:34 UTC (812 KB)
[v2] Tue, 16 Dec 2025 10:22:20 UTC (844 KB)
[v3] Fri, 19 Dec 2025 03:05:26 UTC (845 KB)
