https://scholar.google.com/citations?user=vZPl_oQAAAAJ&hl=en
Stars
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
tmlr-group / Co-rewarding
Forked from resistzzz/Co-rewarding. [arXiv:2508.00410] "Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models"
verl: Volcano Engine Reinforcement Learning for LLMs
[ICML 2025] "From Passive to Active Reasoning: Can Large Language Models Ask the Right Questions under Incomplete Information?"
[ICML 2025] "From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium"
Python tool for converting files and office documents to Markdown.
An easy-to-use Python framework to generate adversarial jailbreak prompts.
A framework for the evaluation of autoregressive code generation language models.
[ICLR 2025 Workshop] "Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models"
Paper list, datasets, and tools for radiology report generation
[ACL 2025] "SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities"
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
The Google Scholar PDF Reader browser extension, now with annotations!
[NeurIPS 2023] Combating Bilateral Edge Noise for Robust Link Prediction
[arXiv:2411.10023] "Model Inversion Attacks: A Survey of Approaches and Countermeasures"
A curated list of resources for activation engineering
Fully open data curation for reasoning models
A reading list on LLM based Synthetic Data Generation 🔥
Recipes to train reward models for RLHF.
GenRM-CoT: Data release for verification rationales
A bibliography and survey of the papers surrounding o1
[NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?"
Train transformer language models with reinforcement learning.
[TMI 2024] "SPIRiT-Diffusion: Self-Consistency Driven Diffusion Model for Accelerated MRI"
tmlr-group / CoPA
Forked from HongduanTian/CoPA. [NeurIPS 2024] "Mind the Gap between Prototypes and Images in Cross-domain Finetuning"