-
Graph of Agents: Principled Long Context Modeling by Emergent Multi-Agent Collaboration
Authors:
Taejong Joo,
Shu Ishida,
Ivan Sosnovik,
Bryan Lim,
Sahand Rezaei-Shoshtari,
Adam Gaier,
Robert Giaquinto
Abstract:
As a model-agnostic approach to long context modeling, multi-agent systems can process inputs longer than a large language model's context window without retraining or architectural modifications. However, their performance often heavily relies on hand-crafted multi-agent collaboration strategies and prompt engineering, which limit generalizability. In this work, we introduce a principled framework that formalizes the model-agnostic long context modeling problem as a compression problem, yielding an information-theoretic compression objective. Building on this framework, we propose Graph of Agents (GoA), which dynamically constructs an input-dependent collaboration structure that maximizes this objective. For Llama 3.1 8B and Qwen3 8B across six document question answering benchmarks, GoA improves the average $F_1$ score of retrieval-augmented generation by 5.7\% and a strong multi-agent baseline using a fixed collaboration structure by 16.35\%, respectively. Even with only a 2K context window, GoA surpasses the 128K context window Llama 3.1 8B on LongBench, showing a dramatic increase in effective context length. Our source code is available at https://github.com/tjoo512/graph-of-agents.
Submitted 26 September, 2025;
originally announced September 2025.
-
Multi-lingual Evaluation of Code Generation Models
Authors:
Ben Athiwaratkun,
Sanjay Krishna Gouda,
Zijian Wang,
Xiaopeng Li,
Yuchen Tian,
Ming Tan,
Wasi Uddin Ahmad,
Shiqi Wang,
Qing Sun,
Mingyue Shang,
Sujan Kumar Gonugondla,
Hantian Ding,
Varun Kumar,
Nathan Fulton,
Arash Farahani,
Siddhartha Jain,
Robert Giaquinto,
Haifeng Qian,
Murali Krishna Ramanathan,
Ramesh Nallapati,
Baishakhi Ray,
Parminder Bhatia,
Sudipta Sengupta,
Dan Roth,
Bing Xiang
Abstract:
We present new benchmarks for evaluating code generation models: MBXP, Multilingual HumanEval, and MathQA-X. These datasets cover over 10 programming languages and are generated using a scalable conversion framework that transpiles prompts and test cases from the original Python datasets into the corresponding data in the target language. Using these benchmarks, we assess the performance of code generation models in a multi-lingual fashion, and discover the generalization ability of language models on out-of-domain languages, advantages of multi-lingual models over mono-lingual ones, the ability of few-shot prompting to teach the model new languages, and zero-shot translation abilities even in mono-lingual settings. Furthermore, we use our code generation model to perform large-scale bootstrapping to obtain synthetic canonical solutions in several languages, which can be used for other code-related evaluations such as code insertion, robustness, or summarization tasks. Overall, our benchmarks represent a significant step towards a deeper understanding of language models' code generation abilities. We publicly release our code and datasets at https://github.com/amazon-research/mxeval.
Submitted 28 March, 2023; v1 submitted 26 October, 2022;
originally announced October 2022.
-
"I Cannot Do All of This Alone": Exploring Instrumental and Prayer Support in Online Health Communities
Authors:
C. Estelle Smith,
Zachary Levonian,
Haiwei Ma,
Robert Giaquinto,
Gemma Lein-Mcdonough,
Zixuan Li,
Susan O'Conner-Von,
Svetlana Yarosh
Abstract:
Online Health Communities (OHCs) are known to provide substantial emotional and informational support to patients and family caregivers facing life-threatening diagnoses like cancer and other illnesses, injuries, or chronic conditions. Yet little work explores how OHCs facilitate other vital forms of social support, especially instrumental support. We partner with CaringBridge.org---a prominent OHC for journaling about health crises---to complete a two-phase study focused on instrumental support. Phase one involves a content analysis of 641 CaringBridge updates. Phase two is a survey of 991 CaringBridge users. Results show that patients and family caregivers diverge from their support networks in their preferences for specific instrumental support types. Furthermore, "prayer support" emerged as the most prominent support category across both phases. We discuss design implications to accommodate divergent preferences and to expand the instrumental support network. We also discuss the need for future work to empower family caregivers and to support spirituality.
Submitted 24 May, 2020;
originally announced May 2020.
-
Gradient Boosted Normalizing Flows
Authors:
Robert Giaquinto,
Arindam Banerjee
Abstract:
By chaining a sequence of differentiable invertible transformations, normalizing flows (NF) provide an expressive method of posterior approximation, exact density evaluation, and sampling. The trend in normalizing flow literature has been to devise deeper, more complex transformations to achieve greater flexibility. We propose an alternative: Gradient Boosted Normalizing Flows (GBNF) model a density by successively adding new NF components with gradient boosting. Under the boosting framework, each new NF component optimizes a sample weighted likelihood objective, resulting in new components that are fit to the residuals of the previously trained components. The GBNF formulation results in a mixture model structure, whose flexibility increases as more components are added. Moreover, GBNFs offer a wider, as opposed to strictly deeper, approach that improves existing NFs at the cost of additional training---not more complex transformations. We demonstrate the effectiveness of this technique for density estimation and, by coupling GBNF with a variational autoencoder, generative modeling of images. Our results show that GBNFs outperform their non-boosted analog, and, in some cases, produce better results with smaller, simpler flows.
Submitted 17 October, 2020; v1 submitted 26 February, 2020;
originally announced February 2020.
-
DAPPER: Scaling Dynamic Author Persona Topic Model to Billion Word Corpora
Authors:
Robert Giaquinto,
Arindam Banerjee
Abstract:
Extracting common narratives from multi-author dynamic text corpora requires complex models, such as the Dynamic Author Persona (DAP) topic model. However, such models can struggle to scale to large corpora, often because of challenging non-conjugate terms. To overcome such challenges, in this paper we adapt new ideas in approximate inference to the DAP model, resulting in the DAP Performed Exceedingly Rapidly (DAPPER) topic model. Specifically, we develop Conjugate-Computation Variational Inference (CVI) based variational Expectation-Maximization (EM) for learning the model, yielding fast, closed form updates for each document, replacing iterative optimization in earlier work. Our results show significant improvements in model fit and training time without needing to compromise the model's temporal structure or the application of Regularized Variational Inference (RVI). We demonstrate the scalability and effectiveness of the DAPPER model by extracting health journeys from the CaringBridge corpus --- a collection of 9 million journals written by 200,000 authors during health crises.
Submitted 3 November, 2018;
originally announced November 2018.
-
Topic Modeling on Health Journals with Regularized Variational Inference
Authors:
Robert Giaquinto,
Arindam Banerjee
Abstract:
Topic modeling enables exploration and compact representation of a corpus. The CaringBridge (CB) dataset is a massive collection of journals written by patients and caregivers during a health crisis. Topic modeling on the CB dataset, however, is challenging due to the asynchronous nature of multiple authors writing about their health journeys. To overcome this challenge we introduce the Dynamic Author-Persona topic model (DAP), a probabilistic graphical model designed for temporal corpora with multiple authors. The novelty of the DAP model lies in its representation of authors by a persona --- where personas capture the propensity to write about certain topics over time. Further, we present a regularized variational inference algorithm, which we use to encourage the DAP model's personas to be distinct. Our results show significant improvements over competing topic models --- particularly after regularization, and highlight the DAP model's unique ability to capture common journeys shared by different authors.
Submitted 15 January, 2018;
originally announced January 2018.