  • Zhejiang University


Claw-R1: Empowering OpenClaw with Advanced Agentic RL.

Python 163 8 Updated Apr 7, 2026

AgentEvolver: Towards Efficient Self-Evolving Agent System

Python 1,345 153 Updated Apr 1, 2026
Python 24 3 Updated Feb 24, 2026
Python 30 4 Updated Feb 12, 2026
Python 95 13 Updated Mar 31, 2026

P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads

15 2 Updated Feb 11, 2026

[ICLR 2026] The official repository for the paper "AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning".

Jupyter Notebook 79 6 Updated Feb 27, 2026

GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's TerminalBench leaderboard.

Python 366 23 Updated Aug 24, 2025

💻 SETA: Scaling Environments for Terminal Agents

Python 88 12 Updated Feb 16, 2026

An End-to-End Infrastructure for Training and Evaluating Various LLM Agents

Python 782 67 Updated Feb 9, 2026

Scaling Agentic Reinforcement Learning with a Multi-Turn, Multi-Task Framework

Python 265 19 Updated Jan 17, 2026

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 5,000 453 Updated Apr 7, 2026

Scalable toolkit for efficient model reinforcement

Python 1,507 330 Updated Apr 7, 2026

Self-Adapting Language Models

Python 1,734 305 Updated Aug 1, 2025
Jupyter Notebook 214 13 Updated Dec 23, 2025

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Jupyter Notebook 1,964 306 Updated Aug 9, 2025

A Gym for Agentic LLMs

Python 476 31 Updated Jan 21, 2026

[R]einforcement [L]earning from [M]odel-rewarded [T]hinking - code for the paper "Language Models That Think, Chat Better"

Python 127 6 Updated Oct 27, 2025

General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]

Python 224 14 Updated Nov 27, 2025

Extrapolating RLVR to General Domains without Verifiers

Python 200 11 Updated Aug 12, 2025

🎉 TrustJudge is accepted to ICLR 2026!

Python 46 2 Updated Sep 27, 2025

🚀 EvoAgentX: Building a Self-Evolving Ecosystem of AI Agents

Python 2,710 227 Updated Apr 7, 2026

Code and Data for Tau-Bench

Python 1,168 189 Updated Mar 18, 2026

rl from zero pretrain, can it be done? yes.

Python 291 21 Updated Sep 28, 2025

Process Consistency Filter: Improve Reasoning Quality for LLM Reinforcement Learning

11 Updated Sep 4, 2025

This is the official GitHub repo for the paper "Implicit User Feedback in Human-LLM Dialogues: Informative to Understand Users yet Noisy as a Learning Signal"

2 1 Updated Sep 4, 2025

Interleaving Reasoning: Next-Generation Reasoning Systems for AGI

267 11 Updated Oct 17, 2025

😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond

353 12 Updated Jan 22, 2026

Official repository for ACL 2025 Main Conference Paper "Keys to Robust Edits: From Theoretical Insights to Practical Advances"

Python 3 Updated May 26, 2025