User profiles for Zeyi Liao

Zeyi Liao

The Ohio State University
Verified email at osu.edu
Cited by 1051

EIA: Environmental injection attack on generalist web agents for privacy leakage

Z Liao, L Mo, C Xu, M Kang, J Zhang, C Xiao… - arXiv preprint arXiv …, 2024 - arxiv.org
Generalist web agents have demonstrated remarkable potential in autonomously completing
a wide range of tasks on real websites, significantly boosting human productivity. However, …

AmpleGCG: Learning a universal and transferable generative model of adversarial suffixes for jailbreaking both open and closed LLMs

Z Liao, H Sun - arXiv preprint arXiv:2404.07921, 2024 - arxiv.org
As large language models (LLMs) become increasingly prevalent and integrated into
autonomous systems, ensuring their safety is imperative. Despite significant strides toward safety …

ChatCounselor: A large language models for mental health support

JM Liu, D Li, H Cao, T Ren, Z Liao, J Wu - arXiv preprint arXiv:2309.15461, 2023 - arxiv.org
This paper presents ChatCounselor, a large language model (LLM) solution designed to
provide mental health support. Unlike generic chatbots, ChatCounselor is distinguished by its …

AttributionBench: How Hard is Automatic Attribution Evaluation?

Y Li, X Yue, Z Liao, H Sun - Findings of the Association for …, 2024 - aclanthology.org
Modern generative search engines enhance the reliability of large language model (LLM)
responses by providing cited evidence. However, evaluating the answer’s attribution, i.e., …

ScienceAgentBench: Toward rigorous assessment of language agents for data-driven scientific discovery

…, Y Ning, Q Zhang, B Wang, B Yu, Y Li, Z Liao… - arXiv preprint arXiv …, 2024 - arxiv.org
The advancements of large language models (LLMs) have piqued growing interest in
developing LLM-based language agents to automate scientific discovery end-to-end, which has …

Introducing v0.5 of the AI Safety Benchmark from MLCommons

…, SH Kumar, S Kumar, C Lengerich, B Li, Z Liao… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the
MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess …

Agent learning via early experience

K Zhang, X Chen, B Liu, T Xue, Z Liao, Z Liu… - arXiv preprint arXiv …, 2025 - arxiv.org
A long-term goal of language agents is to learn and improve through their own experience,
ultimately outperforming humans in complex, real-world tasks. However, training agents from …

RedTeamCUA: Realistic adversarial testing of computer-use agents in hybrid web-OS environments

Z Liao, J Jones, L Jiang, Y Ning… - arXiv preprint arXiv …, 2025 - arxiv.org
Zeyi Liao provided the benign task formulation used … Zeyi Liao and Linxi Jiang led the main
code implementation for the RedTeamCUA framework and sandbox construction. Zeyi Liao

Mind2Web 2: Evaluating agentic search with agent-as-a-judge

…, HN Moussa, T Zhang, J Xie, Y Li, T Xue, Z Liao… - arXiv preprint arXiv …, 2025 - arxiv.org
Agentic search such as Deep Research systems, where agents autonomously browse the
web, synthesize information, and return comprehensive citation-backed answers, represents a …

AdvWeb: Controllable black-box attacks on VLM-powered web agents

C Xu, M Kang, J Zhang, Z Liao, L Mo, M Yuan, H Sun… - 2024 - openreview.net
Vision Language Models (VLMs) have revolutionized the creation of generalist web agents,
empowering them to autonomously complete diverse tasks on real-world websites, thereby …