Showing 1–11 of 11 results for author: Cemri, M

  1. arXiv:2602.23413  [pdf, ps, other]

    cs.LG cs.CL cs.NE

    EvoX: Meta-Evolution for Automated Discovery

    Authors: Shu Liu, Shubham Agarwal, Monishwaran Maheswaran, Mert Cemri, Zhifei Li, Qiuyang Mang, Ashwin Naren, Ethan Boneh, Audrey Cheng, Melissa Z. Pan, Alexander Du, Kurt Keutzer, Alvin Cheung, Alexandros G. Dimakis, Koushik Sen, Matei Zaharia, Ion Stoica

    Abstract: Recent work such as AlphaEvolve has shown that combining LLM-driven optimization with evolutionary search can effectively improve programs, prompts, and algorithms across domains. In this paradigm, previously evaluated solutions are reused to guide the model toward new candidate solutions. Crucially, the effectiveness of this evolution process depends on the search strategy: how prior solutions ar…

    Submitted 16 March, 2026; v1 submitted 26 February, 2026; originally announced February 2026.

  2. arXiv:2602.20133  [pdf, ps, other]

    cs.NE cs.AI cs.CL

    AdaEvolve: Adaptive LLM Driven Zeroth-Order Optimization

    Authors: Mert Cemri, Shubham Agrawal, Akshat Gupta, Shu Liu, Audrey Cheng, Qiuyang Mang, Ashwin Naren, Lutfi Eren Erdogan, Koushik Sen, Matei Zaharia, Alex Dimakis, Ion Stoica

    Abstract: The paradigm of automated program generation is shifting from one-shot generation to inference-time search, where Large Language Models (LLMs) function as semantic mutation operators within evolutionary loops. While effective, these systems are currently governed by static schedules that fail to account for the non-stationary dynamics of the search process. This rigidity results in substantial com…

    Submitted 23 February, 2026; originally announced February 2026.

  3. arXiv:2512.14806  [pdf, ps, other]

    cs.SE cs.AI

    Let the Barbarians In: How AI Can Accelerate Systems Performance Research

    Authors: Audrey Cheng, Shu Liu, Melissa Pan, Zhifei Li, Shubham Agarwal, Mert Cemri, Bowen Wang, Alexander Krentsel, Tian Xia, Jongseok Park, Shuo Yang, Jeff Chen, Lakshya Agrawal, Ashwin Naren, Shulu Li, Ruiying Ma, Aditya Desai, Jiarong Xing, Koushik Sen, Matei Zaharia, Ion Stoica

    Abstract: Artificial Intelligence (AI) is beginning to transform the research process by automating the discovery of new solutions. This shift depends on the availability of reliable verifiers, which AI-driven approaches require to validate candidate solutions. Research focused on improving systems performance is especially well-suited to this paradigm because system performance problems naturally admit suc…

    Submitted 22 December, 2025; v1 submitted 16 December, 2025; originally announced December 2025.

    Comments: arXiv admin note: substantial text overlap with arXiv:2510.06189

  4. arXiv:2511.02755  [pdf, ps, other]

    cs.CL

    Controlling Performance and Budget of a Centralized Multi-agent LLM System with Reinforcement Learning

    Authors: Bowen Jin, TJ Collins, Donghan Yu, Mert Cemri, Shenao Zhang, Mengyu Li, Jay Tang, Tian Qin, Zhiyang Xu, Jiarui Lu, Guoli Yin, Jiawei Han, Zirui Wang

    Abstract: Large language models (LLMs) exhibit complementary strengths across domains and come with varying inference costs, motivating the design of multi-agent LLM systems where specialized models collaborate efficiently. Existing approaches predominantly rely on decentralized frameworks, which invoke multiple LLMs for every input and thus lead to substantial and uncontrolled inference costs. In this work…

    Submitted 4 November, 2025; originally announced November 2025.

    Comments: 14 pages

  5. arXiv:2510.07043  [pdf, ps, other]

    cs.LG

    COMPASS: A Multi-Turn Benchmark for Tool-Mediated Planning & Preference Optimization

    Authors: Tian Qin, Felix Bai, Ting-Yao Hu, Raviteja Vemulapalli, Hema Swetha Koppula, Zhiyang Xu, Bowen Jin, Mert Cemri, Jiarui Lu, Zirui Wang, Meng Cao

    Abstract: Real-world large language model (LLM) agents must master strategic tool use and user preference optimization through multi-turn interactions to assist users with complex planning tasks. We introduce COMPASS (Constrained Optimization through Multi-turn Planning and Strategic Solutions), a benchmark that evaluates agents on realistic travel-planning scenarios. We cast travel planning as a constraine…

    Submitted 8 October, 2025; originally announced October 2025.

  6. arXiv:2510.06189  [pdf, ps, other]

    cs.AI

    Barbarians at the Gate: How AI is Upending Systems Research

    Authors: Audrey Cheng, Shu Liu, Melissa Pan, Zhifei Li, Bowen Wang, Alex Krentsel, Tian Xia, Mert Cemri, Jongseok Park, Shuo Yang, Jeff Chen, Lakshya Agrawal, Aditya Desai, Jiarong Xing, Koushik Sen, Matei Zaharia, Ion Stoica

    Abstract: Artificial Intelligence (AI) is starting to transform the research process as we know it by automating the discovery of new solutions. Given a task, the typical AI-driven approach is (i) to generate a set of diverse solutions, and then (ii) to verify these solutions and select one that solves the problem. Crucially, this approach assumes the existence of a reliable verifier, i.e., one that can acc…

    Submitted 10 October, 2025; v1 submitted 7 October, 2025; originally announced October 2025.

  7. arXiv:2506.15733  [pdf, ps, other]

    cs.AI cs.CL cs.LG

    $\texttt{SPECS}$: Faster Test-Time Scaling through Speculative Drafts

    Authors: Mert Cemri, Nived Rajaraman, Rishabh Tiwari, Xiaoxuan Liu, Kurt Keutzer, Ion Stoica, Kannan Ramchandran, Ahmad Beirami, Ziteng Sun

    Abstract: Scaling test-time compute has driven the recent advances in the reasoning capabilities of large language models (LLMs), typically by allocating additional computation for more thorough exploration. However, increased compute often comes at the expense of higher user-facing latency, directly impacting user experience. Current test-time scaling methods primarily optimize for accuracy based on total…

    Submitted 18 February, 2026; v1 submitted 15 June, 2025; originally announced June 2025.

    Comments: 28 pages, 6 figures, 2 tables

  8. arXiv:2503.13657  [pdf, ps, other]

    cs.AI

    Why Do Multi-Agent LLM Systems Fail?

    Authors: Mert Cemri, Melissa Z. Pan, Shuyi Yang, Lakshya A. Agrawal, Bhavya Chopra, Rishabh Tiwari, Kurt Keutzer, Aditya Parameswaran, Dan Klein, Kannan Ramchandran, Matei Zaharia, Joseph E. Gonzalez, Ion Stoica

    Abstract: Despite enthusiasm for Multi-Agent LLM Systems (MAS), their performance gains on popular benchmarks are often minimal. This gap highlights a critical need for a principled understanding of why MAS fail. Addressing this question requires systematic identification and analysis of failure patterns. We introduce MAST-Data, a comprehensive dataset of 1600+ annotated traces collected across 7 popular MA…

    Submitted 26 October, 2025; v1 submitted 17 March, 2025; originally announced March 2025.

    Comments: ArXiv v3

  9. arXiv:2406.11896  [pdf, other]

    cs.LG

    DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

    Authors: Hao Bai, Yifei Zhou, Mert Cemri, Jiayi Pan, Alane Suhr, Sergey Levine, Aviral Kumar

    Abstract: Training corpuses for vision language models (VLMs) typically lack sufficient amounts of decision-centric data. This renders off-the-shelf VLMs sub-optimal for decision-making tasks such as in-the-wild device control through graphical user interfaces (GUIs). While training with static demonstrations has shown some promise, we show that such methods fall short for controlling real GUIs due to their…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 11 pages of main text, 28 pages in total

  10. arXiv:2211.13292  [pdf, other]

    cs.SI cs.MA eess.SP

    Discovering Influencers in Opinion Formation over Social Graphs

    Authors: Valentina Shumovskaia, Mert Kayaalp, Mert Cemri, Ali H. Sayed

    Abstract: The adaptive social learning paradigm helps model how networked agents are able to form opinions on a state of nature and track its drifts in a changing environment. In this framework, the agents repeatedly update their beliefs based on private observations and exchange the beliefs with their neighbors. In this work, it is shown how the sequence of publicly exchanged beliefs over time allows users…

    Submitted 14 March, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

  11. arXiv:2209.00557  [pdf, other]

    cs.CL cs.AI cs.LG

    Unsupervised Simplification of Legal Texts

    Authors: Mert Cemri, Tolga Çukur, Aykut Koç

    Abstract: The processing of legal texts has been developing as an emerging field in natural language processing (NLP). Legal texts contain unique jargon and complex linguistic attributes in vocabulary, semantics, syntax, and morphology. Therefore, the development of text simplification (TS) methods specific to the legal domain is of paramount importance for facilitating comprehension of legal text by ordina…

    Submitted 1 September, 2022; originally announced September 2022.