arXiv:2510.23691 (cs)

[Submitted on 27 Oct 2025]

Title: Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents

Abstract: We present Game-TARS, a generalist game agent trained with a unified, scalable action space anchored to human-aligned native keyboard-mouse inputs. Unlike API- or GUI-based approaches, this paradigm enables large-scale continual pre-training across heterogeneous domains, including OS, web, and simulation games. Game-TARS is pre-trained on over 500B tokens of diverse trajectories and multimodal data. Key techniques include a decaying continual loss to reduce causal confusion and an efficient Sparse-Thinking strategy that balances reasoning depth and inference cost. Experiments show that Game-TARS achieves about twice the success rate of the previous state-of-the-art model on open-world Minecraft tasks, approaches the generality of first-time human players on unseen web 3D games, and outperforms GPT-5, Gemini-2.5-Pro, and Claude-4-Sonnet on FPS benchmarks. Training-time and test-time scaling results confirm that the unified action space sustains improvements when scaled to cross-game and multimodal data. Our results demonstrate that simple, scalable action representations combined with large-scale pre-training provide a promising path toward generalist agents with broad computer-use abilities.

Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:2510.23691 [cs.AI]
	(or arXiv:2510.23691v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2510.23691

Submission history

From: Zihao Wang
[v1] Mon, 27 Oct 2025 17:43:51 UTC (23,567 KB)
