arxiv:2511.22173
Seungone Kim
seungone
AI & ML interests
Large Language Models, LLM-as-a-Judge, Reward Model Overoptimization, Personalized Alignment
Recent Activity
updated a dataset about 12 hours ago: prometheus-eval/peerreview-bench
published a dataset 6 days ago: prometheus-eval/peerreview-bench
upvoted a paper about 1 month ago: Reasoning over mathematical objects: on-policy reward modeling and test time aggregation