
Showing 1–50 of 101 results for author: Sontag, D

Searching in archive cs.
  1. arXiv:2603.11679  [pdf, ps, other]

    cs.AI

    LLMs can construct powerful representations and streamline sample-efficient supervised learning

    Authors: Ilker Demirel, Lawrence Shi, Zeshan Hussain, David Sontag

    Abstract: As real-world datasets become increasingly complex and heterogeneous, supervised learning is often bottlenecked by input representation design. Modeling multimodal data for downstream tasks, such as time-series, free text, and structured records, often requires non-trivial domain-specific engineering. We propose an agentic pipeline to streamline this process. First, an LLM analyzes a small but div…

    Submitted 21 March, 2026; v1 submitted 12 March, 2026; originally announced March 2026.

  2. arXiv:2511.19399  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

    Authors: Rulin Shao, Akari Asai, Shannon Zejiang Shen, Hamish Ivison, Varsha Kishore, Jingming Zhuo, Xinran Zhao, Molly Park, Samuel G. Finlayson, David Sontag, Tyler Murray, Sewon Min, Pradeep Dasigi, Luca Soldaini, Faeze Brahman, Wen-tau Yih, Tongshuang Wu, Luke Zettlemoyer, Yoon Kim, Hannaneh Hajishirzi, Pang Wei Koh

    Abstract: Deep research models perform multi-step research to produce long-form, well-attributed answers. However, most open deep research models are trained on easily verifiable short-form QA tasks via reinforcement learning with verifiable rewards (RLVR), which does not extend to realistic long-form tasks. We address this with Reinforcement Learning with Evolving Rubrics (RLER), in which we construct and…

    Submitted 26 November, 2025; v1 submitted 24 November, 2025; originally announced November 2025.

  3. arXiv:2511.04807  [pdf, ps, other]

    cs.LG math.DS

    Autoencoding Dynamics: Topological Limitations and Capabilities

    Authors: Matthew D. Kvalheim, Eduardo D. Sontag

    Abstract: Given a "data manifold" $M\subset \mathbb{R}^n$ and "latent space" $\mathbb{R}^\ell$, an autoencoder is a pair of continuous maps consisting of an "encoder" $E\colon \mathbb{R}^n\to \mathbb{R}^\ell$ and "decoder" $D\colon \mathbb{R}^\ell\to \mathbb{R}^n$ such that the "round trip" map $D\circ E$ is as close as possible to the identity map $\mbox{id}_M$ on $M$. We present various topological limita…

    Submitted 10 November, 2025; v1 submitted 6 November, 2025; originally announced November 2025.

  4. arXiv:2510.25744  [pdf]

    cs.CL cs.AI

    Completion $\neq$ Collaboration: Scaling Collaborative Effort with Agents

    Authors: Shannon Zejiang Shen, Valerie Chen, Ken Gu, Alexis Ross, Zixian Ma, Jillian Ross, Alex Gu, Chenglei Si, Wayne Chi, Andi Peng, Jocelyn J Shen, Ameet Talwalkar, Tongshuang Wu, David Sontag

    Abstract: Current evaluations of agents remain centered around one-shot task completion, failing to account for the inherently iterative and collaborative nature of many real-world problems, where human goals are often underspecified and evolve. We argue for a shift from building and assessing task completion agents to developing collaborative agents, assessed not only by the quality of their final outputs…

    Submitted 30 October, 2025; v1 submitted 29 October, 2025; originally announced October 2025.

    Comments: 22 pages, 5 figures, 3 tables

  5. arXiv:2509.19601  [pdf, ps, other]

    cs.LG eess.SY

    Learning Genetic Circuit Modules with Neural Networks: Full Version

    Authors: Jichi Wang, Eduardo D. Sontag, Domitilla Del Vecchio

    Abstract: In several applications, including in synthetic biology, one often has input/output data on a system composed of many modules, and although the modules' input/output functions and signals may be unknown, knowledge of the composition architecture can significantly reduce the amount of training data required to learn the system's input/output mapping. Learning the modules' input/output functions is…

    Submitted 29 March, 2026; v1 submitted 23 September, 2025; originally announced September 2025.

  6. arXiv:2507.10452  [pdf, ps, other]

    cs.LG

    Some remarks on gradient dominance and LQR policy optimization

    Authors: Eduardo D. Sontag

    Abstract: Solutions of optimization problems, including policy optimization in reinforcement learning, typically rely upon some variant of gradient descent. There has been much recent work in the machine learning, control, and optimization communities applying the Polyak-Łojasiewicz Inequality (PLI) to such problems in order to establish an exponential rate of convergence (a.k.a. ``linear convergence'' in t…

    Submitted 15 July, 2025; v1 submitted 14 July, 2025; originally announced July 2025.

    Comments: This is a short paper summarizing the first part of the slides presented at my keynote at the 2025 L4DC (Learning for Dynamics & Control Conference) in Ann Arbor, Michigan, 05 June 2025. A partial bibliography has been added

  7. arXiv:2505.15024  [pdf, other]

    cs.CL

    Diagnosing our datasets: How does my language model learn clinical information?

    Authors: Furong Jia, David Sontag, Monica Agrawal

    Abstract: Large language models (LLMs) have performed well across various clinical natural language processing tasks, despite not being directly trained on electronic health record (EHR) data. In this work, we examine how popular open-source LLMs learn clinical information from large mined corpora through two crucial but understudied lenses: (1) their interpretation of clinical jargon, a foundational abilit…

    Submitted 22 May, 2025; v1 submitted 20 May, 2025; originally announced May 2025.

  8. arXiv:2503.23641  [pdf, other]

    math.OC cs.AI eess.SY

    Remarks on the Polyak-Lojasiewicz inequality and the convergence of gradient systems

    Authors: Arthur Castello B. de Oliveira, Leilei Cui, Eduardo D. Sontag

    Abstract: This work explores generalizations of the Polyak-Lojasiewicz inequality (PLI) and their implications for the convergence behavior of gradient flows in optimization problems. Motivated by the continuous-time linear quadratic regulator (CT-LQR) policy optimization problem -- where only a weaker version of the PLI is characterized in the literature -- this work shows that while weaker conditions are…

    Submitted 30 March, 2025; originally announced March 2025.

  9. arXiv:2503.14724  [pdf, other]

    cs.HC

    CodingGenie: A Proactive LLM-Powered Programming Assistant

    Authors: Sebastian Zhao, Alan Zhu, Hussein Mozannar, David Sontag, Ameet Talwalkar, Valerie Chen

    Abstract: While developers increasingly adopt tools powered by large language models (LLMs) in day-to-day workflows, these tools still require explicit user invocation. To seamlessly integrate LLM capabilities to a developer's workflow, we introduce CodingGenie, a proactive assistant integrated into the code editor. CodingGenie autonomously provides suggestions, ranging from bug fixing to unit testing, base…

    Submitted 18 March, 2025; originally announced March 2025.

    Comments: FSE Demo 2025

  10. arXiv:2502.17403  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Large Language Models are Powerful Electronic Health Record Encoders

    Authors: Stefan Hegselmann, Georg von Arnim, Tillmann Rheude, Noel Kronenberg, David Sontag, Gerhard Hindricks, Roland Eils, Benjamin Wild

    Abstract: Electronic Health Records (EHRs) offer considerable potential for clinical prediction, but their complexity and heterogeneity present significant challenges for traditional machine learning methods. Recently, domain-specific EHR foundation models trained on large volumes of unlabeled EHR data have shown improved predictive accuracy and generalization. However, their development is constrained by l…

    Submitted 19 October, 2025; v1 submitted 24 February, 2025; originally announced February 2025.

  11. arXiv:2410.04596  [pdf, other]

    cs.HC

    Need Help? Designing Proactive AI Assistants for Programming

    Authors: Valerie Chen, Alan Zhu, Sebastian Zhao, Hussein Mozannar, David Sontag, Ameet Talwalkar

    Abstract: While current chat-based AI assistants primarily operate reactively, responding only when prompted by users, there is significant potential for these systems to proactively assist in tasks without explicit invocation, enabling a mixed-initiative interaction. This work explores the design and implementation of proactive AI assistants powered by large language models. We first outline the key design…

    Submitted 28 February, 2025; v1 submitted 6 October, 2024; originally announced October 2024.

    Comments: CHI 2025

  12. arXiv:2409.00276  [pdf, other]

    math.OC cs.CR cs.LG eess.SY

    Exact Recovery Guarantees for Parameterized Nonlinear System Identification Problem under Sparse Disturbances or Semi-Oblivious Attacks

    Authors: Haixiang Zhang, Baturalp Yalcin, Javad Lavaei, Eduardo D. Sontag

    Abstract: In this work, we study the problem of learning a nonlinear dynamical system by parameterizing its dynamics using basis functions. We assume that disturbances occur at each time step with an arbitrary probability $p$, which models the sparsity level of the disturbance vectors over time. These disturbances are drawn from an arbitrary, unknown probability distribution, which may depend on past distur…

    Submitted 20 March, 2025; v1 submitted 30 August, 2024; originally announced September 2024.

    Comments: 43 pages

    MSC Class: 62; 90; 93

  13. arXiv:2407.09642  [pdf, other]

    cs.LG

    Seq-to-Final: A Benchmark for Tuning from Sequential Distributions to a Final Time Point

    Authors: Christina X Ji, Ahmed M Alaa, David Sontag

    Abstract: Distribution shift over time occurs in many settings. Leveraging historical data is necessary to learn a model for the last time point when limited data is available in the final period, yet few methods have been developed specifically for this purpose. In this work, we construct a benchmark with different sequences of synthetic shifts to evaluate the effectiveness of 3 classes of methods that 1)…

    Submitted 12 July, 2024; originally announced July 2024.

  14. arXiv:2406.02873  [pdf, other]

    stat.ML cs.LG

    Prediction-powered Generalization of Causal Inferences

    Authors: Ilker Demirel, Ahmed Alaa, Anthony Philippakis, David Sontag

    Abstract: Causal inferences from a randomized controlled trial (RCT) may not pertain to a target population where some effect modifiers have a different distribution. Prior work studies generalizing the results of a trial to a target population with no outcome but covariate data available. We show how the limited size of trials makes generalization a statistically infeasible task, as it requires estimating…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: International Conference on Machine Learning (ICML), 2024

  15. arXiv:2405.16043  [pdf, other]

    cs.LG cs.CL stat.ML

    Theoretical Analysis of Weak-to-Strong Generalization

    Authors: Hunter Lang, David Sontag, Aravindan Vijayaraghavan

    Abstract: Strong student models can learn from weaker teachers: when trained on the predictions of a weaker model, a strong pretrained student can learn to correct the weak model's errors and generalize to examples where the teacher is not confident, even when these examples are excluded from training. This enables learning from cheap, incomplete, and possibly incorrect label information, such as coarse log…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 36 pages, 3 figures

  16. arXiv:2404.15187  [pdf]

    cs.HC

    Evaluating Physician-AI Interaction for Cancer Management: Paving the Path towards Precision Oncology

    Authors: Zeshan Hussain, Barbara D. Lam, Fernando A. Acosta-Perez, Irbaz Bin Riaz, Maia Jacobs, Andrew J. Yee, David Sontag

    Abstract: We evaluated how clinicians approach clinical decision-making when given findings from both randomized controlled trials (RCTs) and machine learning (ML) models. To do so, we designed a clinical decision support system (CDSS) that displays survival curves and adverse event information from a synthetic RCT and ML model for 12 patients with multiple myeloma. We conducted an interventional study in a…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: First two listed authors are co-first authors

  17. arXiv:2404.02806  [pdf, other]

    cs.SE cs.AI cs.HC

    The RealHumanEval: Evaluating Large Language Models' Abilities to Support Programmers

    Authors: Hussein Mozannar, Valerie Chen, Mohammed Alsobay, Subhro Das, Sebastian Zhao, Dennis Wei, Manish Nagireddy, Prasanna Sattigeri, Ameet Talwalkar, David Sontag

    Abstract: Evaluation of large language models for code has primarily relied on static benchmarks, including HumanEval (Chen et al., 2021), or more recently using human preferences of LLM responses. As LLMs are increasingly used as programmer assistants, we study whether gains on existing benchmarks or more preferred LLM responses translate to programmer productivity when coding with LLMs, including time spe…

    Submitted 14 October, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

  18. arXiv:2403.03870  [pdf, other]

    cs.CL cs.LG

    Learning to Decode Collaboratively with Multiple Language Models

    Authors: Shannon Zejiang Shen, Hunter Lang, Bailin Wang, Yoon Kim, David Sontag

    Abstract: We propose a method to teach multiple large language models (LLM) to collaborate by interleaving their generations at the token level. We model the decision of which LLM generates the next token as a latent variable. By optimizing the marginal likelihood of a training set under our latent variable model, the base LLM automatically learns when to generate itself and when to call on one of the ``ass…

    Submitted 27 August, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

    Comments: 16 pages, 4 figures, 11 tables

  19. arXiv:2403.00177  [pdf, other]

    cs.LG q-bio.QM

    Med-Real2Sim: Non-Invasive Medical Digital Twins using Physics-Informed Self-Supervised Learning

    Authors: Keying Kuang, Frances Dean, Jack B. Jedlicki, David Ouyang, Anthony Philippakis, David Sontag, Ahmed M. Alaa

    Abstract: A digital twin is a virtual replica of a real-world physical phenomena that uses mathematical modeling to characterize and simulate its defining features. By constructing digital twins for disease processes, we can perform in-silico simulations that mimic patients' health conditions and counterfactual outcomes under hypothetical interventions in a virtual setting. This eliminates the need for inva…

    Submitted 31 October, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

  20. arXiv:2402.15422  [pdf, other]

    cs.CL cs.AI cs.LG

    A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models

    Authors: Stefan Hegselmann, Shannon Zejiang Shen, Florian Gierse, Monica Agrawal, David Sontag, Xiaoyi Jiang

    Abstract: Patients often face difficulties in understanding their hospitalizations, while healthcare workers have limited resources to provide explanations. In this work, we investigate the potential of large language models to generate patient summaries based on doctors' notes and study the effect of training data on the faithfulness and quality of the generated summaries. To this end, we release (i) a rig…

    Submitted 25 June, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

  21. arXiv:2401.09637  [pdf, other]

    cs.HC cs.AI cs.CL

    Impact of Large Language Model Assistance on Patients Reading Clinical Notes: A Mixed-Methods Study

    Authors: Niklas Mannhardt, Elizabeth Bondi-Kelly, Barbara Lam, Hussein Mozannar, Chloe O'Connell, Mercy Asiedu, Alejandro Buendia, Tatiana Urman, Irbaz B. Riaz, Catherine E. Ricciardi, Monica Agrawal, Marzyeh Ghassemi, David Sontag

    Abstract: Large language models (LLMs) have immense potential to make information more accessible, particularly in medicine, where complex medical jargon can hinder patient comprehension of clinical notes. We developed a patient-facing tool using LLMs to make clinical notes more readable by simplifying, extracting information from, and adding context to the notes. We piloted the tool with clinical notes don…

    Submitted 14 October, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

  22. arXiv:2311.09188  [pdf, other]

    cs.CL cs.AI cs.LG

    Towards Verifiable Text Generation with Symbolic References

    Authors: Lucas Torroba Hennigen, Shannon Shen, Aniruddha Nrusimha, Bernhard Gapp, David Sontag, Yoon Kim

    Abstract: LLMs are vulnerable to hallucinations, and thus their outputs generally require laborious human verification for high-stakes applications. To this end, we propose symbolically grounded generation (SymGen) as a simple approach for enabling easier manual validation of an LLM's output. SymGen prompts an LLM to interleave its regular output text with explicit symbolic references to fields present in s…

    Submitted 15 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: 57 pages, 8 figures, 8 tables

  23. arXiv:2311.01007  [pdf, other]

    cs.LG cs.AI cs.HC

    Effective Human-AI Teams via Learned Natural Language Rules and Onboarding

    Authors: Hussein Mozannar, Jimin J Lee, Dennis Wei, Prasanna Sattigeri, Subhro Das, David Sontag

    Abstract: People are relying on AI agents to assist them with various tasks. The human must know when to rely on the agent, collaborate with the agent, or ignore its suggestions. In this work, we propose to learn rules, grounded in data regions and described in natural language, that illustrate how the human should collaborate with the AI. Our novel region discovery algorithm finds local regions in the data…

    Submitted 7 November, 2023; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023 Spotlight

  24. arXiv:2310.02250  [pdf, other]

    cs.LG

    Why should autoencoders work?

    Authors: Matthew D. Kvalheim, Eduardo D. Sontag

    Abstract: Deep neural network autoencoders are routinely used computationally for model reduction. They allow recognizing the intrinsic dimension of data that lie in a $k$-dimensional subset $K$ of an input Euclidean space $\mathbb{R}^n$. The underlying idea is to obtain both an encoding layer that maps $\mathbb{R}^n$ into $\mathbb{R}^k$ (called the bottleneck layer or the space of latent variables) and a d…

    Submitted 17 February, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: 24 pages, 9 figures; version 3 is accepted for publication in Transactions on Machine Learning Research (TMLR)

  25. arXiv:2308.08494  [pdf, other]

    cs.IR cs.CL cs.LG

    Conceptualizing Machine Learning for Dynamic Information Retrieval of Electronic Health Record Notes

    Authors: Sharon Jiang, Shannon Shen, Monica Agrawal, Barbara Lam, Nicholas Kurtzman, Steven Horng, David Karger, David Sontag

    Abstract: The large amount of time clinicians spend sifting through patient notes and documenting in electronic health records (EHRs) is a leading cause of clinician burnout. By proactively and dynamically retrieving relevant notes during the documentation process, we can reduce the effort required to find relevant patient history. In this work, we conceptualize the use of EHR audit logs for machine learnin…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: To be published in Proceedings of Machine Learning Research Volume 219; accepted to the Machine Learning for Healthcare 2023 conference

  26. arXiv:2305.17261  [pdf, other]

    cs.LG cs.HC

    Closing the Gap in High-Risk Pregnancy Care Using Machine Learning and Human-AI Collaboration

    Authors: Hussein Mozannar, Yuria Utsumi, Irene Y. Chen, Stephanie S. Gervasi, Michele Ewing, Aaron Smith-McLallen, David Sontag

    Abstract: A high-risk pregnancy is a pregnancy complicated by factors that can adversely affect the outcomes of the mother or the infant. Health insurers use algorithms to identify members who would benefit from additional clinical support. This work presents the implementation of a real-world ML-based system to assist care managers in identifying pregnant patients at risk of complications. In this retrospe…

    Submitted 22 April, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

  27. arXiv:2305.09904  [pdf, ps, other]

    cs.LG eess.SY

    On the ISS Property of the Gradient Flow for Single Hidden-Layer Neural Networks with Linear Activations

    Authors: Arthur Castello B. de Oliveira, Milad Siami, Eduardo D. Sontag

    Abstract: Recent research in neural networks and machine learning suggests that using many more parameters than strictly required by the initial complexity of a regression problem can result in more accurate or faster-converging models -- contrary to classical statistical belief. This phenomenon, sometimes known as ``benign overfitting'', raises questions regarding in what other ways might overparameterizat…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: 10 pages, 1 figure, extended conference version

  28. arXiv:2305.05087  [pdf, other]

    cs.LG

    Large-Scale Study of Temporal Shift in Health Insurance Claims

    Authors: Christina X Ji, Ahmed M Alaa, David Sontag

    Abstract: Most machine learning models for predicting clinical outcomes are developed using historical data. Yet, even if these models are deployed in the near future, dataset shift over time may result in less than ideal performance. To capture this phenomenon, we consider a task--that is, an outcome to be predicted at a particular time point--to be non-stationary if a historical model is no longer optimal…

    Submitted 18 June, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: To appear as an oral spotlight and poster at Conference on Health, Inference, and Learning (CHIL) 2023

  29. arXiv:2304.02623  [pdf, other]

    cs.CL cs.HC

    Beyond Summarization: Designing AI Support for Real-World Expository Writing Tasks

    Authors: Zejiang Shen, Tal August, Pao Siangliulue, Kyle Lo, Jonathan Bragg, Jeff Hammerbacher, Doug Downey, Joseph Chee Chang, David Sontag

    Abstract: Large language models have introduced exciting new opportunities and challenges in designing and developing new AI-assisted writing support tools. Recent work has shown that leveraging this new technology can transform writing in many scenarios such as ideation during creative writing, editing support, and summarization. However, AI-supported expository writing--including real-world tasks like sch…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: 3 pages, 1 figure, accepted by The Second Workshop on Intelligent and Interactive Writing Assistants

  30. arXiv:2304.01426  [pdf, other]

    cs.LG stat.ME

    Conformalized Unconditional Quantile Regression

    Authors: Ahmed M. Alaa, Zeshan Hussain, David Sontag

    Abstract: We develop a predictive inference procedure that combines conformal prediction (CP) with unconditional quantile regression (QR) -- a commonly used tool in econometrics that involves regressing the recentered influence function (RIF) of the quantile functional over input covariates. Unlike the more widely-known conditional QR, unconditional QR explicitly captures the impact of changes in covariate…

    Submitted 3 April, 2023; originally announced April 2023.

  31. arXiv:2301.13133  [pdf, other]

    stat.ME cs.LG

    Falsification of Internal and External Validity in Observational Studies via Conditional Moment Restrictions

    Authors: Zeshan Hussain, Ming-Chieh Shih, Michael Oberst, Ilker Demirel, David Sontag

    Abstract: Randomized Controlled Trials (RCT)s are relied upon to assess new treatments, but suffer from limited power to guide personalized treatment decisions. On the other hand, observational (i.e., non-experimental) studies have large and diverse populations, but are prone to various biases (e.g. residual confounding). To safely leverage the strengths of observational studies, we focus on the problem of…

    Submitted 6 March, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

    Comments: Artificial Intelligence and Statistics 2023

  32. arXiv:2301.06197  [pdf, other]

    cs.LG cs.HC

    Who Should Predict? Exact Algorithms For Learning to Defer to Humans

    Authors: Hussein Mozannar, Hunter Lang, Dennis Wei, Prasanna Sattigeri, Subhro Das, David Sontag

    Abstract: Automated AI classifiers should be able to defer the prediction to a human decision maker to ensure more accurate predictions. In this work, we jointly train a classifier with a rejector, which decides on each data point whether the classifier or the human should predict. We show that prior approaches can fail to find a human-AI system with low misclassification error even when there exists a line…

    Submitted 11 April, 2023; v1 submitted 15 January, 2023; originally announced January 2023.

    Comments: AISTATS 2023

  33. arXiv:2210.10723  [pdf, other]

    cs.CL cs.AI

    TabLLM: Few-shot Classification of Tabular Data with Large Language Models

    Authors: Stefan Hegselmann, Alejandro Buendia, Hunter Lang, Monica Agrawal, Xiaoyi Jiang, David Sontag

    Abstract: We study the application of large language models to zero-shot and few-shot classification of tabular data. We prompt the large language model with a serialization of the tabular data to a natural-language string, together with a short description of the classification problem. In the few-shot setting, we fine-tune the large language model using some labeled examples. We evaluate several serializa…

    Submitted 17 March, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

  34. arXiv:2209.13708  [pdf, other]

    cs.LG

    Falsification before Extrapolation in Causal Effect Estimation

    Authors: Zeshan Hussain, Michael Oberst, Ming-Chieh Shih, David Sontag

    Abstract: Randomized Controlled Trials (RCTs) represent a gold standard when developing policy guidelines. However, RCTs are often narrow, and lack data on broader populations of interest. Causal effects in these populations are often estimated using observational datasets, which may suffer from unobserved confounding and selection bias. Given a set of observational estimates (e.g. from multiple studies), w…

    Submitted 6 March, 2023; v1 submitted 27 September, 2022; originally announced September 2022.

    Comments: Conference on Neural Information Processing Systems, 2022

  35. arXiv:2207.09584  [pdf, other]

    cs.LG cs.AI cs.HC

    Sample Efficient Learning of Predictors that Complement Humans

    Authors: Mohammad-Amin Charusaie, Hussein Mozannar, David Sontag, Samira Samadi

    Abstract: One of the goals of learning algorithms is to complement and reduce the burden on human decision makers. The expert deferral setting wherein an algorithm can either predict on its own or defer the decision to a downstream expert helps accomplish this goal. A fundamental aspect of this setting is the need to learn complementary predictors that improve on the human's weaknesses rather than learning…

    Submitted 19 July, 2022; originally announced July 2022.

    Comments: ICML 2022

  36. arXiv:2206.02914  [pdf, other]

    stat.ML cs.AI cs.LG

    Training Subset Selection for Weak Supervision

    Authors: Hunter Lang, Aravindan Vijayaraghavan, David Sontag

    Abstract: Existing weak supervision approaches use all the data covered by weak signals to train a classifier. We show both theoretically and empirically that this is not always optimal. Intuitively, there is a tradeoff between the amount of weakly-labeled data and the precision of the weak labels. We explore this tradeoff by combining pretrained data representations with the cut statistic (Muhlenbach et al…

    Submitted 6 March, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022

  37. arXiv:2205.15947  [pdf, other]

    cs.LG stat.ML

    Evaluating Robustness to Dataset Shift via Parametric Robustness Sets

    Authors: Nikolaj Thams, Michael Oberst, David Sontag

    Abstract: We give a method for proactively identifying small, plausible shifts in distribution which lead to large differences in model performance. These shifts are defined via parametric changes in the causal mechanisms of observed variables, where constraints on parameters yield a "robustness set" of plausible distributions and a corresponding worst-case loss over the set. While the loss under an individ…

    Submitted 15 January, 2023; v1 submitted 31 May, 2022; originally announced May 2022.

    Comments: NeurIPS 2022; Equal Contribution by Nikolaj/Michael, order determined by coin flip

  38. arXiv:2205.12689  [pdf, other]

    cs.CL cs.AI

    Large Language Models are Few-Shot Clinical Information Extractors

    Authors: Monica Agrawal, Stefan Hegselmann, Hunter Lang, Yoon Kim, David Sontag

    Abstract: A long-running goal of the clinical NLP community is the extraction of important variables trapped in clinical notes. However, roadblocks have included dataset shift from the general domain and a lack of public clinical corpora and annotations. In this work, we show that large language models, such as InstructGPT, perform well at zero- and few-shot information extraction from clinical text despite…

    Submitted 30 November, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted as a long paper to The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP)

  39. arXiv:2202.00828  [pdf, other]

    cs.CL cs.AI cs.LG

    Co-training Improves Prompt-based Learning for Large Language Models

    Authors: Hunter Lang, Monica Agrawal, Yoon Kim, David Sontag

    Abstract: We demonstrate that co-training (Blum & Mitchell, 1998) can improve the performance of prompt-based learning by using unlabeled data. While prompting has emerged as a promising paradigm for few-shot and zero-shot learning, it is often brittle and requires much larger models compared to the standard supervised setup. We find that co-training makes it possible to improve the original prompt model an…

    Submitted 1 February, 2022; originally announced February 2022.

    Comments: 17 pages, 8 figures

  40. arXiv:2111.11297  [pdf, other]

    cs.LG cs.HC

    Teaching Humans When To Defer to a Classifier via Exemplars

    Authors: Hussein Mozannar, Arvind Satyanarayan, David Sontag

    Abstract: Expert decision makers are starting to rely on data-driven automated agents to assist them with various tasks. For this collaboration to perform properly, the human decision maker must have a mental model of when and when not to rely on the agent. In this work, we aim to ensure that human decision makers learn a valid mental model of the agent's strengths and weaknesses. To accomplish this goal, w…

    Submitted 13 December, 2021; v1 submitted 22 November, 2021; originally announced November 2021.

    Comments: AAAI 2022

  41. arXiv:2111.02599  [pdf, other]

    cs.LG

    Leveraging Time Irreversibility with Order-Contrastive Pre-training

    Authors: Monica Agrawal, Hunter Lang, Michael Offin, Lior Gazit, David Sontag

    Abstract: Label-scarce, high-dimensional domains such as healthcare present a challenge for modern machine learning techniques. To overcome the difficulties posed by a lack of labeled data, we explore an "order-contrastive" method for self-supervised pre-training on longitudinal data. We sample pairs of time segments, switch the order for half of them, and train a model to predict whether a given pair is in…

    Submitted 29 March, 2022; v1 submitted 3 November, 2021; originally announced November 2021.
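
    The pretext task described in the abstract above — sample pairs of consecutive time segments, swap half of them, and predict whether a pair is in its original order — can be sketched as a simple data-construction step. A minimal sketch under the assumption of a single univariate series; the function name and segment layout are illustrative, not taken from the paper.

    ```python
    import numpy as np

    def make_order_contrastive_pairs(series, seg_len, n_pairs, rng):
        """From one longitudinal series, sample pairs of consecutive
        segments and flip half of them. Label 1 = original temporal order,
        label 0 = swapped; a model is then trained on (X, y)."""
        X, y = [], []
        for _ in range(n_pairs):
            t = rng.integers(0, len(series) - 2 * seg_len + 1)
            a = series[t:t + seg_len]
            b = series[t + seg_len:t + 2 * seg_len]
            if rng.random() < 0.5:
                X.append(np.concatenate([a, b])); y.append(1)  # in order
            else:
                X.append(np.concatenate([b, a])); y.append(0)  # swapped
        return np.stack(X), np.array(y)
    ```

    The labels come for free from the data itself, which is what makes this a self-supervised objective for label-scarce settings.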

  42. arXiv:2110.14993  [pdf, other

    cs.LG stat.ML

    Using Time-Series Privileged Information for Provably Efficient Learning of Prediction Models

    Authors: Rickard K. A. Karlsson, Martin Willbo, Zeshan Hussain, Rahul G. Krishnan, David Sontag, Fredrik D. Johansson

    Abstract: We study prediction of future outcomes with supervised models that use privileged information during learning. The privileged information comprises samples of time series observed between the baseline time of prediction and the future outcome; this information is available only at training time, in contrast to traditional supervised learning. Our question is when using this privileged data…

    Submitted 5 May, 2022; v1 submitted 28 October, 2021; originally announced October 2021.

    Journal ref: Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:5459-5484, 2022
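
    One common way to exploit training-time-only signals of this kind is generalized distillation: fit a teacher that sees the privileged time series, then fit a student on baseline features against a blend of true labels and teacher predictions. This is a generic sketch of that family of approaches, not the estimator analyzed in the paper; the ridge models, the blending parameter `alpha`, and the function names are all illustrative assumptions.

    ```python
    import numpy as np

    def fit_ridge(X, y, lam=1e-3):
        """Closed-form ridge regression weights."""
        d = X.shape[1]
        return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

    def lupi_distill(X_base, X_priv, y, alpha=0.5):
        """Learning-using-privileged-information sketch: a teacher fit on
        baseline + privileged features produces soft targets; the student,
        which sees only baseline features (all that is available at test
        time), fits a convex blend of true labels and teacher outputs."""
        X_full = np.hstack([X_base, X_priv])
        soft = X_full @ fit_ridge(X_full, y)
        return fit_ridge(X_base, (1 - alpha) * y + alpha * soft)
    ```

    With `alpha=0` this reduces to plain supervised ridge on the baseline features; the privileged series only influences the student through the teacher's soft targets.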

  43. arXiv:2110.14508  [pdf, other

    cs.LG cs.AI

    Finding Regions of Heterogeneity in Decision-Making via Expected Conditional Covariance

    Authors: Justin Lim, Christina X Ji, Michael Oberst, Saul Blecker, Leora Horwitz, David Sontag

    Abstract: Individuals often make different decisions when faced with the same context, due to personal preferences and background. For instance, judges may vary in their leniency towards certain drug-related offenses, and doctors may vary in their preference for how to start treatment for certain types of patients. With these examples in mind, we present an algorithm for identifying types of contexts (e.g.,…

    Submitted 27 October, 2021; originally announced October 2021.

    Comments: To appear in NeurIPS 2021
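
    The quantity in the title — an expected conditional covariance — has a simple plug-in estimate when contexts can be binned: average the within-bin sample covariance of two decision-related variables, weighted by bin size. This is a generic illustration of the statistical object, not the paper's algorithm; the binning scheme and the choice of variables `u`, `v` are assumptions for the sketch.

    ```python
    import numpy as np

    def expected_conditional_cov(x_bins, u, v):
        """Plug-in estimate of E[Cov(U, V | X)]: compute the sample
        covariance of (u, v) within each context bin and average the
        results weighted by bin size. Bins with < 2 points contribute 0."""
        total, n = 0.0, len(u)
        for b in np.unique(x_bins):
            m = x_bins == b
            if m.sum() < 2:
                continue
            total += m.sum() * np.cov(u[m], v[m], bias=True)[0, 1]
        return total / n
    ```

    A large value in some region of context space flags contexts where the two variables co-vary strongly given the context — the kind of heterogeneity in decision-making the paper sets out to find.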

  44. MedKnowts: Unified Documentation and Information Retrieval for Electronic Health Records

    Authors: Luke Murray, Divya Gopinath, Monica Agrawal, Steven Horng, David Sontag, David R. Karger

    Abstract: Clinical documentation can be transformed by Electronic Health Records, yet documentation remains a tedious, time-consuming, and error-prone process. Clinicians are faced with multi-faceted requirements and fragmented interfaces for information exploration and documentation. These challenges are only exacerbated in the Emergency Department -- clinicians often see 35 patients in one sh…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: 15 Pages, 8 figures, UIST 21, October 10-13

  45. arXiv:2106.02524  [pdf, other

    cs.CL cs.LG stat.ML

    CLIP: A Dataset for Extracting Action Items for Physicians from Hospital Discharge Notes

    Authors: James Mullenbach, Yada Pruksachatkun, Sean Adler, Jennifer Seale, Jordan Swartz, T. Greg McKelvey, Hui Dai, Yi Yang, David Sontag

    Abstract: Continuity of care is crucial to ensuring positive health outcomes for patients discharged from an inpatient hospital setting, and improved information sharing can help. To share information, caregivers write discharge notes containing action items to share with patients and their future caregivers, but these action items are easily lost due to the lengthiness of the documents. In this work, we de…

    Submitted 4 June, 2021; originally announced June 2021.

    Comments: ACL 2021

  46. arXiv:2103.04725  [pdf, other

    cs.HC

    Assessing the Impact of Automated Suggestions on Decision Making: Domain Experts Mediate Model Errors but Take Less Initiative

    Authors: Ariel Levy, Monica Agrawal, Arvind Satyanarayan, David Sontag

    Abstract: Automated decision support can accelerate tedious tasks as users can focus their attention where it is needed most. However, a key concern is whether users overly trust or cede agency to automation. In this paper, we investigate the effects of introducing automation to annotating clinical texts--a multi-step, error-prone task of identifying clinical concepts (e.g., procedures) in medical notes, an…

    Submitted 29 March, 2021; v1 submitted 8 March, 2021; originally announced March 2021.

    Comments: Fixed minor formatting

  47. arXiv:2103.02477  [pdf, other

    cs.LG stat.ML

    Regularizing towards Causal Invariance: Linear Models with Proxies

    Authors: Michael Oberst, Nikolaj Thams, Jonas Peters, David Sontag

    Abstract: We propose a method for learning linear models whose predictive performance is robust to causal interventions on unobserved variables, when noisy proxies of those variables are available. Our approach takes the form of a regularization term that trades off between in-distribution performance and robustness to interventions. Under the assumption of a linear structural causal model, we show that a s…

    Submitted 27 June, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

    Comments: ICML 2021 (to appear)

  48. arXiv:2103.00034  [pdf, other

    stat.ML cs.LG

    Beyond Perturbation Stability: LP Recovery Guarantees for MAP Inference on Noisy Stable Instances

    Authors: Hunter Lang, Aravind Reddy, David Sontag, Aravindan Vijayaraghavan

    Abstract: Several works have shown that perturbation stable instances of the MAP inference problem in Potts models can be solved exactly using a natural linear programming (LP) relaxation. However, most of these works give few (or no) guarantees for the LP solutions on instances that do not satisfy the relatively strict perturbation stability definitions. In this work, we go beyond these stability results b…

    Submitted 26 February, 2021; originally announced March 2021.

    Comments: 25 pages, 2 figures, 2 tables. To appear in AISTATS 2021

  49. arXiv:2102.11218  [pdf, other

    cs.LG

    Neural Pharmacodynamic State Space Modeling

    Authors: Zeshan Hussain, Rahul G. Krishnan, David Sontag

    Abstract: Modeling the time-series of high-dimensional, longitudinal data is important for predicting patient disease progression. However, existing neural network based approaches that learn representations of patient state, while very flexible, are susceptible to overfitting. We propose a deep generative model that makes use of a novel attention-based neural architecture inspired by the physics of how tre…

    Submitted 17 June, 2021; v1 submitted 22 February, 2021; originally announced February 2021.

    Comments: To appear at the International Conference on Machine Learning (ICML) 2021

  50. arXiv:2102.07005  [pdf, other

    stat.ML cs.LG

    Clustering Interval-Censored Time-Series for Disease Phenotyping

    Authors: Irene Y. Chen, Rahul G. Krishnan, David Sontag

    Abstract: Unsupervised learning is often used to uncover clusters in data. However, different kinds of noise may impede the discovery of useful patterns from real-world time-series data. In this work, we focus on mitigating the interference of interval censoring in the task of clustering for disease phenotyping. We develop a deep generative, continuous-time model of time-series data that clusters time-serie…

    Submitted 5 December, 2021; v1 submitted 13 February, 2021; originally announced February 2021.

    Comments: AAAI 2022