
Showing 1–50 of 62 results for author: Stoyanovich, J

Searching in archive cs.
  1. arXiv:2603.17373

    cs.CL

    SafeTutors: Benchmarking Pedagogical Safety in AI Tutoring Systems

    Authors: Rima Hazra, Bikram Ghuku, Ilona Marchenko, Yaroslava Tokarieva, Sayan Layek, Somnath Banerjee, Julia Stoyanovich, Mykola Pechenizkiy

    Abstract: Large language models are rapidly being deployed as AI tutors, yet current evaluation paradigms assess problem-solving accuracy and generic safety in isolation, failing to capture whether a model is simultaneously pedagogically effective and safe across student-tutor interaction. We argue that tutoring safety is fundamentally different from conventional LLM safety: the primary risk is not toxic co…

    Submitted 18 March, 2026; originally announced March 2026.

  2. arXiv:2602.21180

    cs.CY

    Memory Undone: Between Knowing and Not Knowing in Data Systems

    Authors: Viktoriia Makovska, George Fletcher, Julia Stoyanovich, Tetiana Zakharchenko

    Abstract: Machine learning and data systems increasingly function as infrastructures of memory: they ingest, store, and operationalize traces of personal, political, and cultural life. Yet contemporary governance demands credible forms of forgetting, from GDPR-backed deletion to harm-mitigation and the removal of manipulative content, while technical infrastructures are optimized to retain, replicate, and r…

    Submitted 24 February, 2026; originally announced February 2026.

    Comments: Undone Computer Science 2026

  3. arXiv:2602.18274

    cs.DB

    Seasoning Data Modeling Education with GARLIC: A Participatory Co-Design Framework

    Authors: Viktoriia Makovska, Ihor Michurin, Mariia Tokhtamysh, George Fletcher, Julia Stoyanovich

    Abstract: Entity-Relationship (ER) modeling is commonly taught as a primarily technical activity, despite its central role in shaping how data systems represent people, processes, and institutions. Prior research in participatory design demonstrates that involving diverse stakeholders in modeling can surface tacit knowledge, challenge implicit assumptions, and produce more inclusive data representations. Ho…

    Submitted 20 February, 2026; originally announced February 2026.

    Comments: DataEd'26: 5th International Workshop on Data Systems Education

  4. arXiv:2601.23068

    cs.LG cs.AI

    ExplainerPFN: Towards tabular foundation models for model-free zero-shot feature importance estimations

    Authors: Joao Fonseca, Julia Stoyanovich

    Abstract: Computing the importance of features in supervised classification tasks is critical for model interpretability. Shapley values are a widely used approach for explaining model predictions, but require direct access to the underlying model, an assumption frequently violated in real-world deployments. Further, even when model access is possible, their exact computation may be prohibitively expensive.…

    Submitted 30 January, 2026; originally announced January 2026.

    Comments: 18 pages, 7 figures

  5. arXiv:2601.12654

    cs.LG cs.AI

    Explanation Multiplicity in SHAP: Characterization and Assessment

    Authors: Hyunseung Hwang, Seungeun Lee, Lucas Rosenblatt, Steven Euijong Whang, Julia Stoyanovich

    Abstract: Post-hoc explanations are widely used to justify, contest, and review automated decisions in high-stakes domains such as lending, employment, and healthcare. Among these methods, SHAP is often treated as providing a reliable account of which features mattered for an individual prediction and is routinely used to support recourse, oversight, and accountability. In practice, however, SHAP explanatio…

    Submitted 25 January, 2026; v1 submitted 18 January, 2026; originally announced January 2026.

  6. arXiv:2507.08702

    cs.DB cs.AI cs.CY

    ONION: A Multi-Layered Framework for Participatory ER Design

    Authors: Viktoriia Makovska, George Fletcher, Julia Stoyanovich

    Abstract: We present ONION, a multi-layered framework for participatory Entity-Relationship (ER) modeling that integrates insights from design justice, participatory AI, and conceptual modeling. ONION introduces a five-stage methodology: Observe, Nurture, Integrate, Optimize, Normalize. It supports progressive abstraction from unstructured stakeholder input to structured ER diagrams. Our approach aims to…

    Submitted 11 July, 2025; originally announced July 2025.

  7. We Are AI: Taking Control of Technology

    Authors: Julia Stoyanovich, Armanda Lewis, Eric Corbett, Lucius E. J. Bynum, Lucas Rosenblatt, Falaah Arif Khan

    Abstract: Responsible AI (RAI) is the science and practice of ensuring the design, development, use, and oversight of AI are socially sustainable--benefiting diverse stakeholders while controlling the risks. Achieving this goal requires active engagement and participation from the broader public. This paper introduces "We are AI: Taking Control of Technology," a public education course that brings the topic…

    Submitted 9 June, 2025; originally announced June 2025.

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 2025

  8. arXiv:2506.01584

    cs.LG cs.AI cs.CY

    VirnyFlow: A Design Space for Responsible Model Development

    Authors: Denys Herasymuk, Nazar Protsiv, Julia Stoyanovich

    Abstract: Developing machine learning (ML) models requires a deep understanding of real-world problems, which are inherently multi-objective. In this paper, we present VirnyFlow, the first design space for responsible model development, designed to assist data scientists in building ML pipelines that are tailored to the specific context of their problem. Unlike conventional AutoML frameworks, VirnyFlow enab…

    Submitted 2 June, 2025; originally announced June 2025.

  9. arXiv:2505.08345

    cs.LG cs.AI

    SHAP-based Explanations are Sensitive to Feature Representation

    Authors: Hyunseung Hwang, Andrew Bell, Joao Fonseca, Venetia Pliatsika, Julia Stoyanovich, Steven Euijong Whang

    Abstract: Local feature-based explanations are a key component of the XAI toolkit. These explanations compute feature importance values relative to an ``interpretable'' feature representation. In tabular data, feature values themselves are often considered interpretable. This paper examines the impact of data engineering choices on local feature-based explanations. We demonstrate that simple, common data en…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: Accepted to ACM FAccT 2025

  10. arXiv:2504.14368

    cs.LG cs.CR

    Do You Really Need Public Data? Surrogate Public Data for Differential Privacy on Tabular Data

    Authors: Shlomi Hod, Lucas Rosenblatt, Julia Stoyanovich

    Abstract: Differentially private (DP) machine learning often relies on the availability of public data for tasks like privacy-utility trade-off estimation, hyperparameter tuning, and pretraining. While public data assumptions may be reasonable in text and image domains, they are less likely to hold for tabular data due to tabular data heterogeneity across domains. We propose leveraging powerful priors to ad…

    Submitted 19 April, 2025; originally announced April 2025.

  11. arXiv:2504.11259

    cs.DB

    The Cambridge Report on Database Research

    Authors: Anastasia Ailamaki, Samuel Madden, Daniel Abadi, Gustavo Alonso, Sihem Amer-Yahia, Magdalena Balazinska, Philip A. Bernstein, Peter Boncz, Michael Cafarella, Surajit Chaudhuri, Susan Davidson, David DeWitt, Yanlei Diao, Xin Luna Dong, Michael Franklin, Juliana Freire, Johannes Gehrke, Alon Halevy, Joseph M. Hellerstein, Mark D. Hill, Stratos Idreos, Yannis Ioannidis, Christoph Koch, Donald Kossmann, Tim Kraska , et al. (21 additional authors not shown)

    Abstract: On October 19 and 20, 2023, the authors of this report convened in Cambridge, MA, to discuss the state of the database research field, its recent accomplishments and ongoing challenges, and future directions for research and community engagement. This gathering continues a long standing tradition in the database community, dating back to the late 1980s, in which researchers meet roughly every five…

    Submitted 15 April, 2025; originally announced April 2025.

  12. arXiv:2503.02885

    cs.CY cs.CL cs.HC

    "Would You Want an AI Tutor?" Understanding Stakeholder Perceptions of LLM-based Systems in the Classroom

    Authors: Caterina Fuligni, Daniel Dominguez Figaredo, Julia Stoyanovich

    Abstract: In recent years, Large Language Models (LLMs) rapidly gained popularity across all parts of society, including education. After initial skepticism and bans, many schools have chosen to embrace this new technology by integrating it into their curricula in the form of virtual tutors and teaching assistants. However, neither the companies developing this technology nor the public institutions involve…

    Submitted 9 June, 2025; v1 submitted 2 March, 2025; originally announced March 2025.

  13. arXiv:2502.07943

    cs.DB cs.AI cs.CY

    CREDAL: Close Reading of Data Models

    Authors: George Fletcher, Olha Nahurna, Matvii Prytula, Julia Stoyanovich

    Abstract: Data models are necessary for the birth of data and of any data-driven system. Indeed, every algorithm, every machine learning model, every statistical model, and every database has an underlying data model without which the system would not be usable. Hence, data models are excellent sites for interrogating the (material, social, political, ...) conditions giving rise to a data system. Towards th…

    Submitted 11 February, 2025; originally announced February 2025.

  14. arXiv:2501.02018

    cs.CL cs.AI cs.CR cs.LG

    Safeguarding Large Language Models in Real-time with Tunable Safety-Performance Trade-offs

    Authors: Joao Fonseca, Andrew Bell, Julia Stoyanovich

    Abstract: Large Language Models (LLMs) have been shown to be susceptible to jailbreak attacks, or adversarial attacks used to elicit high-risk behavior from a model. Jailbreaks have been exploited by cybercriminals and blackhat actors to cause significant harm, highlighting the critical need to safeguard widely-deployed models. Safeguarding approaches, which include fine-tuning models or having LLMs "self-…

    Submitted 2 January, 2025; originally announced January 2025.

  15. arXiv:2412.15363

    cs.CY cs.AI cs.HC

    Making Transparency Advocates: An Educational Approach Towards Better Algorithmic Transparency in Practice

    Authors: Andrew Bell, Julia Stoyanovich

    Abstract: Concerns about the risks and harms posed by artificial intelligence (AI) have resulted in significant study into algorithmic transparency, giving rise to a sub-field known as Explainable AI (XAI). Unfortunately, despite a decade of development in XAI, an existential challenge remains: progress in research has not been fully translated into the actual implementation of algorithmic transparency by o…

    Submitted 19 December, 2024; originally announced December 2024.

  16. arXiv:2412.13030

    cs.HC cs.CR cs.DB

    Are Data Experts Buying into Differentially Private Synthetic Data? Gathering Community Perspectives

    Authors: Lucas Rosenblatt, Bill Howe, Julia Stoyanovich

    Abstract: Data privacy is a core tenet of responsible computing, and in the United States, differential privacy (DP) is the dominant technical operationalization of privacy-preserving data analysis. With this study, we qualitatively examine one class of DP mechanisms: private data synthesizers. To that end, we conducted semi-structured interviews with data experts: academics and practitioners who regularly…

    Submitted 17 December, 2024; originally announced December 2024.

  17. arXiv:2409.07510

    cs.AI cs.CY cs.LG

    Still More Shades of Null: An Evaluation Suite for Responsible Missing Value Imputation

    Authors: Falaah Arif Khan, Denys Herasymuk, Nazar Protsiv, Julia Stoyanovich

    Abstract: Data missingness is a practical challenge of sustained interest to the scientific community. In this paper, we present Shades-of-Null, an evaluation suite for responsible missing value imputation. Our work is novel in two ways: (i) we model realistic and socially-salient missingness scenarios that go beyond Rubin's classic Missing Completely at Random (MCAR), Missing At Random (MAR) and Missing Not…

    Submitted 18 July, 2025; v1 submitted 11 September, 2024; originally announced September 2024.

  18. arXiv:2407.14686

    cs.HC cs.CY

    Using Case Studies to Teach Responsible AI to Industry Practitioners

    Authors: Julia Stoyanovich, Rodrigo Kreis de Paula, Armanda Lewis, Chloe Zheng

    Abstract: Responsible AI (RAI) encompasses the science and practice of ensuring that AI design, development, and use are socially sustainable -- maximizing the benefits of technology while mitigating its risks. Industry practitioners play a crucial role in achieving the objectives of RAI, yet there is a persistent shortage of consolidated educational resources and effective methods for teaching RAI to pra…

    Submitted 20 December, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

  19. arXiv:2403.17786

    cs.DB

    Query Refinement for Diverse Top-$k$ Selection

    Authors: Felix S. Campbell, Alon Silberstein, Julia Stoyanovich, Yuval Moskovitch

    Abstract: Database queries are often used to select and rank items as decision support for many applications. As automated decision-making tools become more prevalent, there is a growing recognition of the need to diversify their outcomes. In this paper, we define and study the problem of modifying the selection conditions of an ORDER BY query so that the result of the modified query closely fits some user-…

    Submitted 27 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: v2 corrects author order

  20. ShaRP: Explaining Rankings and Preferences with Shapley Values

    Authors: Venetia Pliatsika, Joao Fonseca, Kateryna Akhynko, Ivan Shevchenko, Julia Stoyanovich

    Abstract: Algorithmic decisions in critical domains such as hiring, college admissions, and lending are often based on rankings. Given the impact of these decisions on individuals, organizations, and population groups, it is essential to understand them - to help individuals improve their ranking position, design better ranking procedures, and ensure legal compliance. In this paper, we argue that explainabi…

    Submitted 28 July, 2025; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: Accepted in VLDB

    Journal ref: VLDB, Volume 18, Issue 11, Year 2025

  21. arXiv:2401.16088

    cs.LG cs.CY

    Fairness in Algorithmic Recourse Through the Lens of Substantive Equality of Opportunity

    Authors: Andrew Bell, Joao Fonseca, Carlo Abrate, Francesco Bonchi, Julia Stoyanovich

    Abstract: Algorithmic recourse -- providing recommendations to those affected negatively by the outcome of an algorithmic system on how they can take action and change that outcome -- has gained attention as a means of giving persons agency in their interactions with artificial intelligence (AI) systems. Recent work has shown that even if an AI decision-making classifier is ``fair'' (according to some reaso…

    Submitted 29 January, 2024; originally announced January 2024.

  22. arXiv:2401.13935

    cs.AI cs.CY stat.ML

    A New Paradigm for Counterfactual Reasoning in Fairness and Recourse

    Authors: Lucius E. J. Bynum, Joshua R. Loftus, Julia Stoyanovich

    Abstract: Counterfactuals and counterfactual reasoning underpin numerous techniques for auditing and understanding artificial intelligence (AI) systems. The traditional paradigm for counterfactual reasoning in this literature is the interventional counterfactual, where hypothetical interventions are imagined and simulated. For this reason, the starting point for causal reasoning about legal protections and…

    Submitted 24 January, 2024; originally announced January 2024.

  23. arXiv:2312.11712

    cs.CR cs.LG

    A Simple and Practical Method for Reducing the Disparate Impact of Differential Privacy

    Authors: Lucas Rosenblatt, Julia Stoyanovich, Christopher Musco

    Abstract: Differentially private (DP) mechanisms have been deployed in a variety of high-impact social settings (perhaps most notably by the U.S. Census). Since all DP mechanisms involve adding noise to results of statistical queries, they are expected to impact our ability to accurately analyze and learn from data, in effect trading off privacy with utility. Alarmingly, the impact of DP on utility can vary…

    Submitted 18 December, 2023; originally announced December 2023.

  24. arXiv:2309.06969

    cs.LG cs.AI cs.CY

    Setting the Right Expectations: Algorithmic Recourse Over Time

    Authors: Joao Fonseca, Andrew Bell, Carlo Abrate, Francesco Bonchi, Julia Stoyanovich

    Abstract: Algorithmic systems are often called upon to assist in high-stakes decision making. In light of this, algorithmic recourse, the principle wherein individuals should be able to take action against an undesirable outcome made by an algorithmic system, is receiving growing attention. The bulk of the literature on algorithmic recourse to-date focuses primarily on how to provide recourse to a single in…

    Submitted 13 September, 2023; originally announced September 2023.

  25. arXiv:2302.08704

    cs.LG cs.CY

    The Unbearable Weight of Massive Privilege: Revisiting Bias-Variance Trade-Offs in the Context of Fair Prediction

    Authors: Falaah Arif Khan, Julia Stoyanovich

    Abstract: In this paper we revisit the bias-variance decomposition of model error from the perspective of designing a fair classifier: we are motivated by the widely held socio-technical belief that noise variance in large datasets in social domains tracks demographic characteristics such as gender, race, disability, etc. We propose a conditional-iid (ciid) model built from group-specific classifiers that s…

    Submitted 17 February, 2023; originally announced February 2023.

  26. arXiv:2302.06347

    cs.LG

    The Possibility of Fairness: Revisiting the Impossibility Theorem in Practice

    Authors: Andrew Bell, Lucius Bynum, Nazarii Drushchak, Tetiana Herasymova, Lucas Rosenblatt, Julia Stoyanovich

    Abstract: The ``impossibility theorem'' -- which is considered foundational in algorithmic fairness literature -- asserts that there must be trade-offs between common notions of fairness and performance when fitting statistical models, except in two special cases: when the prevalence of the outcome being predicted is equal across groups, or when a perfectly accurate predictor is used. However, theory does n…

    Submitted 13 February, 2023; originally announced February 2023.

    Comments: 14 pages, 3 figures, 1 table

  27. arXiv:2302.04525

    cs.LG cs.AI cs.CY

    An Epistemic and Aleatoric Decomposition of Arbitrariness to Constrain the Set of Good Models

    Authors: Falaah Arif Khan, Denys Herasymuk, Nazar Protsiv, Julia Stoyanovich

    Abstract: Recent research reveals that machine learning (ML) models are highly sensitive to minor changes in their training procedure, such as the inclusion or exclusion of a single data point, leading to conflicting predictions on individual data points; a property termed as arbitrariness or instability in ML pipelines in prior work. Drawing from the uncertainty literature, we show that stability decompose…

    Submitted 12 July, 2025; v1 submitted 9 February, 2023; originally announced February 2023.

  28. arXiv:2212.03974

    cs.AI cs.CY cs.LG stat.ME stat.ML

    Counterfactuals for the Future

    Authors: Lucius E. J. Bynum, Joshua R. Loftus, Julia Stoyanovich

    Abstract: Counterfactuals are often described as 'retrospective,' focusing on hypothetical alternatives to a realized past. This description relates to an often implicit assumption about the structure and stability of exogenous variables in the system being modeled -- an assumption that is reasonable in many settings where counterfactuals are used. In this work, we consider cases where we might reasonably m…

    Submitted 7 December, 2022; originally announced December 2022.

  29. arXiv:2211.02932

    cs.HC

    Rankers, Rankees, & Rankings: Peeking into the Pandora's Box from a Socio-Technical Perspective

    Authors: Jun Yuan, Julia Stoyanovich, Aritra Dasgupta

    Abstract: Algorithmic rankers have a profound impact on our increasingly data-driven society. From leisurely activities like the movies that we watch, the restaurants that we patronize; to highly consequential decisions, like making educational and occupational choices or getting hired by companies -- these are all driven by sophisticated yet mostly inaccessible rankers. A small change to how these algorith…

    Submitted 5 November, 2022; originally announced November 2022.

    Comments: Accepted for Interrogating Human-Centered Data Science workshop at CHI'22

  30. arXiv:2208.12700

    cs.CR cs.CY

    Epistemic Parity: Reproducibility as an Evaluation Metric for Differential Privacy

    Authors: Lucas Rosenblatt, Bernease Herman, Anastasia Holovenko, Wonkwon Lee, Joshua Loftus, Elizabeth McKinnie, Taras Rumezhak, Andrii Stadnik, Bill Howe, Julia Stoyanovich

    Abstract: Differential privacy (DP) data synthesizers support public release of sensitive information, offering theoretical guarantees for privacy but limited evidence of utility in practical settings. Utility is typically measured as the error on representative proxy tasks, such as descriptive statistics, accuracy of trained classifiers, or performance over a query workload. The ability for these results t…

    Submitted 31 May, 2023; v1 submitted 26 August, 2022; originally announced August 2022.

    Comments: Preprint. 14 pages

  31. arXiv:2207.02912

    cs.CY cs.AI cs.LG

    Towards Substantive Conceptions of Algorithmic Fairness: Normative Guidance from Equal Opportunity Doctrines

    Authors: Falaah Arif Khan, Eleni Manis, Julia Stoyanovich

    Abstract: In this work we use Equal Opportunity (EO) doctrines from political philosophy to make explicit the normative judgements embedded in different conceptions of algorithmic fairness. We contrast formal EO approaches that narrowly focus on fair contests at discrete decision points, with substantive EO doctrines that look at people's fair life chances more holistically over the course of a lifetime. W…

    Submitted 10 July, 2022; v1 submitted 6 July, 2022; originally announced July 2022.

  32. arXiv:2207.01482

    cs.CY cs.HC cs.LG

    Think About the Stakeholders First! Towards an Algorithmic Transparency Playbook for Regulatory Compliance

    Authors: Andrew Bell, Oded Nov, Julia Stoyanovich

    Abstract: Increasingly, laws are being proposed and passed by governments around the world to regulate Artificial Intelligence (AI) systems implemented into the public and private sectors. Many of these regulations address the transparency of AI systems, and related citizen-aware issues like allowing individuals to have the right to an explanation about how an AI system makes a decision that impacts them. Y…

    Submitted 10 June, 2022; originally announced July 2022.

  33. arXiv:2205.14269

    cs.DB

    Temporal graph patterns by timed automata

    Authors: Amir Pouya Aghasadeghi, Jan Van den Bussche, Julia Stoyanovich

    Abstract: Temporal graphs represent graph evolution over time, and have been receiving considerable research attention. Work on expressing temporal graph patterns or discovering temporal motifs typically assumes relatively simple temporal constraints, such as journeys or, more generally, existential constraints, possibly with finite delays. In this paper we propose to use timed automata to express temporal…

    Submitted 27 May, 2022; originally announced May 2022.

  34. arXiv:2204.12903

    cs.LG cs.CR

    Spending Privacy Budget Fairly and Wisely

    Authors: Lucas Rosenblatt, Joshua Allen, Julia Stoyanovich

    Abstract: Differentially private (DP) synthetic data generation is a practical method for improving access to data as a means to encourage productive partnerships. One issue inherent to DP is that the "privacy budget" is generally "spent" evenly across features in the data set. This leads to good statistical parity with the real data, but can undervalue the conditional probabilities and marginals that are c…

    Submitted 27 April, 2022; originally announced April 2022.

  35. arXiv:2201.09151

    cs.CY cs.AI cs.LG

    An External Stability Audit Framework to Test the Validity of Personality Prediction in AI Hiring

    Authors: Alene K. Rhea, Kelsey Markey, Lauren D'Arinzo, Hilke Schellmann, Mona Sloane, Paul Squires, Falaah Arif Khan, Julia Stoyanovich

    Abstract: Automated hiring systems are among the fastest-developing of all high-stakes AI systems. Among these are algorithmic personality tests that use insights from psychometric testing, and promise to surface personality traits indicative of future success based on job seekers' resumes or social media profiles. We interrogate the validity of such systems using stability of the outputs they produce, noti…

    Submitted 11 April, 2022; v1 submitted 22 January, 2022; originally announced January 2022.

  36. arXiv:2107.01241

    cs.DB

    Temporal Regular Path Queries

    Authors: Marcelo Arenas, Pedro Bahamondes, Amir Aghasadeghi, Julia Stoyanovich

    Abstract: In the last decade, substantial progress has been made towards standardizing the syntax of graph query languages, and towards understanding their semantics and complexity of evaluation. In this paper, we consider temporal property graphs (TPGs) and propose temporal regular path queries (TRPQs) that incorporate time into TPG navigation. Starting with design principles, we propose a natural syntacti…

    Submitted 9 March, 2022; v1 submitted 2 July, 2021; originally announced July 2021.

  37. arXiv:2107.00593

    cs.LG cs.AI cs.CY stat.AP stat.ML

    Disaggregated Interventions to Reduce Inequality

    Authors: Lucius E. J. Bynum, Joshua R. Loftus, Julia Stoyanovich

    Abstract: A significant body of research in the data sciences considers unfair discrimination against social categories such as race or gender that could occur or be amplified as a result of algorithmic decisions. Simultaneously, real-world disparities continue to exist, even before algorithmic decisions are made. In this work, we draw on insights from the social sciences brought into the realm of causal mo…

    Submitted 7 December, 2022; v1 submitted 1 July, 2021; originally announced July 2021.

  38. arXiv:2106.08259

    cs.CY cs.AI cs.LG

    Fairness as Equality of Opportunity: Normative Guidance from Political Philosophy

    Authors: Falaah Arif Khan, Eleni Manis, Julia Stoyanovich

    Abstract: Recent interest in codifying fairness in Automated Decision Systems (ADS) has resulted in a wide range of formulations of what it means for an algorithmic system to be fair. Most of these propositions are inspired by, but inadequately grounded in, political philosophy scholarship. This paper aims to correct that deficit. We introduce a taxonomy of fairness ideals using doctrines of Equality of Opp…

    Submitted 15 June, 2021; originally announced June 2021.

  39. Most Expected Winner: An Interpretation of Winners over Uncertain Voter Preferences

    Authors: Haoyue Ping, Julia Stoyanovich

    Abstract: It remains an open question how to determine the winner of an election when voter preferences are incomplete or uncertain. One option is to assume some probability space over the voting profile and select the Most Probable Winner (MPW) -- the candidate or candidates with the best chance of winning. In this paper, we propose an alternative winner interpretation, selecting the Most Expected Winner (…

    Submitted 25 April, 2023; v1 submitted 30 April, 2021; originally announced May 2021.

    Comments: This is the technical report of the following paper: Haoyue Ping and Julia Stoyanovich. 2023. Most Expected Winner: An Interpretation of Winners over Uncertain Voter Preferences. Proc. ACM Manag. Data, 1, N1, Article 22 (May 2023), 33 pages. https://doi.org/10.1145/3588702

    Journal ref: Proc. ACM Manag. Data, 1, N1, Article 22 (May 2023), 33 pages (2023)

  40. Fairness in Ranking: A Survey

    Authors: Meike Zehlike, Ke Yang, Julia Stoyanovich

    Abstract: In the past few years, there has been much work on incorporating fairness requirements into algorithmic rankers, with contributions coming from the data management, algorithms, information retrieval, and recommender systems communities. In this survey we give a systematic overview of this work, offering a broad perspective that connects formalizations and algorithmic approaches across subfields. A…

    Submitted 12 August, 2022; v1 submitted 25 March, 2021; originally announced March 2021.

    Comments: 72 pages. ACM CSUR (2022)

    ACM Class: I.2.6; I.2.8; H.3.3

  41. arXiv:2006.08688

    cs.LG cs.AI stat.AP stat.ML

    Causal intersectionality for fair ranking

    Authors: Ke Yang, Joshua R. Loftus, Julia Stoyanovich

    Abstract: In this paper we propose a causal modeling approach to intersectional fairness, and a flexible, task-specific method for computing intersectionally fair rankings. Rankings are used in many contexts, ranging from Web search results to college admissions, but causal inference for fair rankings has received limited attention. Additionally, the growing literature on causal fairness has directed little…

    Submitted 15 June, 2020; originally announced June 2020.

  42. arXiv:2005.06779

    cs.GT

    Algorithmic Techniques for Necessary and Possible Winners

    Authors: Vishal Chakraborty, Theo Delemazure, Benny Kimelfeld, Phokion G. Kolaitis, Kunal Relia, Julia Stoyanovich

    Abstract: We investigate the practical aspects of computing the necessary and possible winners in elections over incomplete voter preferences. In the case of the necessary winners, we show how to implement and accelerate the polynomial-time algorithm of Xia and Conitzer. In the case of the possible winners, where the problem is NP-hard, we give a natural reduction to Integer Linear Programming (ILP) for all…

    Submitted 14 May, 2020; originally announced May 2020.

  43. arXiv:2003.06984

    cs.DB

    Supporting Hard Queries over Probabilistic Preferences

    Authors: Haoyue Ping, Julia Stoyanovich, Benny Kimelfeld

    Abstract: Preference analysis is widely applied in various domains such as social choice and e-commerce. A recently proposed framework augments the relational database with a preference relation that represents uncertain preferences in the form of statistical ranking models, and provides methods to evaluate Conjunctive Queries (CQs) that express preferences among item attributes. In this paper, we explore t…

    Submitted 15 March, 2020; originally announced March 2020.

    Comments: This is the technical report of the following paper: Supporting Hard Queries over Probabilistic Preferences. PVLDB, 13(7): 1134-1146, 2019. DOI: https://doi.org/10.14778/3384345.3384359

  44. arXiv:1912.10564  [pdf, other

    cs.CY cs.AI cs.LG

    Teaching Responsible Data Science: Charting New Pedagogical Territory

    Authors: Julia Stoyanovich, Armanda Lewis

    Abstract: Although numerous ethics courses are available, with many focusing specifically on technology and computer ethics, pedagogical approaches employed in these courses rely exclusively on texts rather than on software development or data analysis. Technical students often consider these courses unimportant and a distraction from the "real" material. To develop instructional materials and methodologies…

    Submitted 22 December, 2019; originally announced December 2019.

  45. arXiv:1911.12587  [pdf, other

    cs.LG cs.CY cs.DB stat.ML

    FairPrep: Promoting Data to a First-Class Citizen in Studies on Fairness-Enhancing Interventions

    Authors: Sebastian Schelter, Yuxuan He, Jatin Khilnani, Julia Stoyanovich

    Abstract: The importance of incorporating ethics and legal compliance into machine-assisted decision-making is broadly recognized. Further, several lines of recent work have argued that critical opportunities for improving data quality and representativeness, controlling for bias, and allowing humans to oversee and impact computational processes are missed if we do not consider the lifecycle stages upstream…

    Submitted 28 November, 2019; originally announced November 2019.

  46. arXiv:1906.01747  [pdf, other

    cs.AI cs.CY

    Balanced Ranking with Diversity Constraints

    Authors: Ke Yang, Vasilis Gkatzelis, Julia Stoyanovich

    Abstract: Many set selection and ranking algorithms have recently been enhanced with diversity constraints that aim to explicitly increase representation of historically disadvantaged populations, or to improve the overall representativeness of the selected set. An unintended consequence of these constraints, however, is reduced in-group fairness: the selected candidates from a given group may not be the be…

    Submitted 4 June, 2019; originally announced June 2019.

    Comments: to appear in IJCAI 2019

  47. arXiv:1903.03683  [pdf, ps, other

    cs.DB cs.CY

    Transparency, Fairness, Data Protection, Neutrality: Data Management Challenges in the Face of New Regulation

    Authors: Serge Abiteboul, Julia Stoyanovich

    Abstract: The data revolution continues to transform every sector of science, industry and government. Due to the incredible impact of data-driven technology on society, we are becoming increasingly aware of the imperative to use data and algorithms responsibly -- in accordance with laws and ethical norms. In this article we discuss three recent regulatory frameworks: the European Union's General Data Prote…

    Submitted 8 March, 2019; originally announced March 2019.

    Comments: To appear in the ACM Journal of Data and Information Quality (JDIQ)

  48. MobilityMirror: Bias-Adjusted Transportation Datasets

    Authors: Luke Rodriguez, Babak Salimi, Haoyue Ping, Julia Stoyanovich, Bill Howe

    Abstract: We describe customized synthetic datasets for publishing mobility data. Private companies are providing new transportation modalities, and their data is of high value for integrative transportation research, policy enforcement, and public accountability. However, these companies are disincentivized from sharing data not only to protect the privacy of individuals (drivers and/or passengers), but al…

    Submitted 24 January, 2019; v1 submitted 21 August, 2018; originally announced August 2018.

    Comments: Presented at BIDU 2018 workshop and published in Springer Communications in Computer and Information Science vol 926

    Journal ref: Big Social Data and Urban Computing. BiDU 2018. Communications in Computer and Information Science, vol 926. Springer, Cham

  49. arXiv:1805.04156  [pdf, ps, other

    cs.DB cs.AI

    Computational Social Choice Meets Databases

    Authors: Benny Kimelfeld, Phokion G. Kolaitis, Julia Stoyanovich

    Abstract: We develop a novel framework that aims to create bridges between the computational social choice and the database management communities. This framework enriches the tasks currently supported in computational social choice with relational database context, thus making it possible to formulate sophisticated queries about voting rules, candidates, voters, issues, and positions. At the conceptual lev…

    Submitted 10 May, 2018; originally announced May 2018.

    Comments: This is an extended version of "Computational Social Choice Meets Databases" by Kimelfeld, Kolaitis and Stoyanovich, to appear in IJCAI 2018

  50. On Obtaining Stable Rankings

    Authors: Abolfazl Asudeh, H. V. Jagadish, Gerome Miklau, Julia Stoyanovich

    Abstract: Decision making is challenging when there is more than one criterion to consider. In such cases, it is common to assign a goodness score to each item as a weighted sum of its attribute values and rank them accordingly. Clearly, the ranking obtained depends on the weights used for this summation. Ideally, one would want the ranked order not to change if the weights are changed slightly. We call thi…

    Submitted 18 December, 2018; v1 submitted 29 April, 2018; originally announced April 2018.

    Journal ref: Abolfazl Asudeh, H. V. Jagadish, Gerome Miklau, Julia Stoyanovich. On Obtaining Stable Rankings. PVLDB, 12(3): 237-250, 2018
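    The goodness-score scheme in this last abstract is easy to make concrete: score each item by a weighted sum of its attribute values and check whether the induced order survives small changes to the weights. The sketch below is a naive grid probe of that stability notion, not the paper's method; all identifiers are invented for illustration.

    ```python
    from itertools import product

    def rank(items, weights):
        """Order items (dicts of attribute -> value) by weighted-sum score, best first."""
        return sorted(items,
                      key=lambda it: sum(w * it[a] for a, w in weights.items()),
                      reverse=True)

    def is_stable(items, weights, eps=0.05):
        """Crude stability probe: does the ranking survive perturbing every
        weight by -eps, 0, or +eps, in all combinations?"""
        base = rank(items, weights)
        attrs = list(weights)
        for deltas in product((-eps, 0.0, eps), repeat=len(attrs)):
            perturbed = {a: weights[a] + d for a, d in zip(attrs, deltas)}
            if rank(items, perturbed) != base:
                return False
        return True
    ```

    With two items that each excel on a different attribute, weights of 0.6/0.4 yield a ranking that survives a ±0.05 probe, while an even 0.5/0.5 split flips under the same perturbation.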