arXiv:2601.12913 [pdf, ps, other]

Actionable Interpretability Must Be Defined in Terms of Symmetries

Authors: Pietro Barbiero, Mateo Espinosa Zarlenga, Francesco Giannini, Alberto Termine, Filippo Bonchi, Mateja Jamnik, Giuseppe Marra

Abstract: This paper argues that interpretability research in Artificial Intelligence (AI) is fundamentally ill-posed as existing definitions of interpretability fail to describe how interpretability can be formally tested or designed for. We posit that actionable definitions of interpretability must be formulated in terms of *symmetries* that inform model design and lead to testable conditions. Under a probabilistic view, we hypothesise that four symmetries (inference equivariance, information invariance, concept-closure invariance, and structural invariance) suffice to (i) formalise interpretable models as a subclass of probabilistic models, (ii) yield a unified formulation of interpretable inference (e.g., alignment, interventions, and counterfactuals) as a form of Bayesian inversion, and (iii) provide a formal framework to verify compliance with safety standards and regulations.

Submitted 29 January, 2026; v1 submitted 19 January, 2026; originally announced January 2026.

arXiv:2601.01472 [pdf, ps, other]

Tapes as Stochastic Matrices of String Diagrams

Authors: Filippo Bonchi, Cipriano Junior Cioffo

Abstract: Tape diagrams provide a graphical notation for categories equipped with two monoidal products, $\otimes$ and $\oplus$, where $\oplus$ is a biproduct. Recently, they have been generalised to handle Kleisli categories of arbitrary monoidal monads. In this work, we show that for the subdistribution monad, tapes are isomorphic to stochastic matrices of subdistributions of string diagrams. We then exploit this result to provide a complete axiomatisation of probabilistic Boolean circuits.

Submitted 4 January, 2026; originally announced January 2026.

arXiv:2512.07240 [pdf, ps, other]

A Diagrammatic Basis for Computer Programming

Authors: Filippo Bonchi, Alessandro Di Giorgio, Elena Di Lavore

Abstract: Tape diagrams provide a convenient graphical notation for arrows of rig categories, i.e., categories equipped with two monoidal products, $\oplus$ and $\otimes$. In this work, we introduce Kleene-Cartesian rig categories, namely rig categories where $\otimes$ provides a Cartesian bicategory, while $\oplus$ provides a Kleene bicategory. We show that the associated tape diagrams can conveniently deal with imperative programs and various program logics.

Submitted 8 December, 2025; originally announced December 2025.

arXiv:2512.00107 [pdf, ps, other]

A Survey on Centrality and Importance Measures in Hypergraphs: Categorization and Empirical Insights

Authors: Jaewan Chun, Fanchen Bu, Yeongho Kim, Atsushi Miyauchi, Francesco Bonchi, Kijung Shin

Abstract: Identifying central entities and interactions is a fundamental problem in network science. While well-studied for graphs (pairwise relations), many biological and social systems exhibit higher-order interactions best modeled by hypergraphs. This has led to a proliferation of specialized hypergraph centrality measures, but the field remains fragmented and lacks a unifying framework. This paper addresses this gap by providing the first systematic survey of 39 distinct measures. We introduce a novel taxonomy classifying them as: (1) structural (topology-based), (2) functional (impact on system dynamics), or (3) contextual (incorporating external features). We also present an experimental assessment comparing their empirical similarity and computation time. Finally, we discuss applications, establishing a coherent roadmap for future research in this area.

Submitted 27 November, 2025; originally announced December 2025.

arXiv:2510.00803 [pdf, ps, other]

Online Minimization of Polarization and Disagreement via Low-Rank Matrix Bandits

Authors: Federico Cinus, Yuko Kuroki, Atsushi Miyauchi, Francesco Bonchi

Abstract: We study the problem of minimizing polarization and disagreement in the Friedkin-Johnsen opinion dynamics model under incomplete information. Unlike prior work that assumes a static setting with full knowledge of agents' innate opinions, we address the more realistic online setting where innate opinions are unknown and must be learned through sequential observations. This novel setting, which naturally mirrors periodic interventions on social media platforms, is formulated as a regret minimization problem, establishing a key connection between algorithmic interventions on social media platforms and the theory of multi-armed bandits. In our formulation, a learner observes only scalar feedback on the overall polarization and disagreement after an intervention. For this novel bandit problem, we propose a two-stage algorithm based on low-rank matrix bandits. The algorithm first performs subspace estimation to identify an underlying low-dimensional structure, and then employs a linear bandit algorithm within the compact dimensional representation derived from the estimated subspace. We show that our algorithm achieves a cumulative regret of $\widetilde{\mathcal{O}}\big(\max(\tfrac{1}{\kappa},\sqrt{|V|})\sqrt{|V|T}\big)$ over time horizon $T$, where $V$ is the set of agents and $\kappa$ is a parameter dependent on the diversity of interventions. Empirical results validate that our algorithm significantly outperforms a linear bandit baseline in terms of both cumulative regret and running time.

Submitted 6 March, 2026; v1 submitted 1 October, 2025; originally announced October 2025.

Comments: Accepted at ICLR 2026

arXiv:2507.18238 [pdf, ps, other]

Program Logics via Distributive Monoidal Categories

Authors: Filippo Bonchi, Elena Di Lavore, Mario Román, Sam Staton

Abstract: We derive multiple program logics, including correctness, incorrectness, and relational Hoare logic, from the axioms of imperative categories: uniformly traced distributive copy-discard categories. We introduce an internal language for imperative multicategories, on top of which we derive combinators for an adaptation of Dijkstra's guarded command language. Rules of program logics are derived from this internal language.

Submitted 24 July, 2025; originally announced July 2025.

Comments: 52 pages, including appendix

MSC Class: 18M50

arXiv:2506.10586 [pdf, ps, other]

Size-adaptive Hypothesis Testing for Fairness

Authors: Antonio Ferrara, Francesco Cozzi, Alan Perotti, André Panisson, Francesco Bonchi

Abstract: Determining whether an algorithmic decision-making system discriminates against a specific demographic typically involves comparing a single point estimate of a fairness metric against a predefined threshold. This practice is statistically brittle: it ignores sampling error and treats small demographic subgroups the same as large ones. The problem intensifies in intersectional analyses, where multiple sensitive attributes are considered jointly, giving rise to a larger number of smaller groups. As these groups become more granular, the data representing them becomes too sparse for reliable estimation, and fairness metrics yield excessively wide confidence intervals, precluding meaningful conclusions about potentially unfair treatment. In this paper, we introduce a unified, size-adaptive, hypothesis-testing framework that turns fairness assessment into an evidence-based statistical decision. Our contribution is twofold. (i) For sufficiently large subgroups, we prove a Central-Limit result for the statistical parity difference, leading to analytic confidence intervals and a Wald test whose type-I (false positive) error is guaranteed at level $\alpha$. (ii) For the long tail of small intersectional groups, we derive a fully Bayesian Dirichlet-multinomial estimator; Monte-Carlo credible intervals are calibrated for any sample size and naturally converge to Wald intervals as more data becomes available. We validate our approach empirically on benchmark datasets, demonstrating how our tests provide interpretable, statistically rigorous decisions under varying degrees of data availability and intersectionality.

Submitted 19 March, 2026; v1 submitted 12 June, 2025; originally announced June 2025.

Journal ref: 39th Conference on Neural Information Processing Systems (NeurIPS 2025)
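
Illustrative sketch (not from the paper): the large-subgroup case in the abstract above rests on a Wald test for the statistical parity difference. The snippet below shows the textbook two-proportion Wald test with an unpooled variance estimate, which is an assumption here; the paper's exact estimator and its Bayesian small-sample counterpart may differ. The function name `wald_spd_test` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_spd_test(pos_a, n_a, pos_b, n_b, alpha=0.05):
    """Two-sided Wald test of H0: statistical parity difference = 0.

    pos_a / n_a and pos_b / n_b are the positive-decision counts and group
    sizes for two demographic groups (hypothetical inputs for illustration).
    """
    p_a = pos_a / n_a
    p_b = pos_b / n_b
    spd = p_a - p_b  # statistical parity difference (point estimate)

    # Unpooled standard error of the difference of two proportions
    # (assumed form; valid only when both groups are reasonably large).
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    if se == 0:
        raise ValueError("standard error is zero; the Wald test is undefined")

    std_normal = NormalDist()
    z = spd / se
    p_value = 2 * (1 - std_normal.cdf(abs(z)))

    # Analytic (1 - alpha) confidence interval around the point estimate.
    half_width = std_normal.inv_cdf(1 - alpha / 2) * se
    return {
        "spd": spd,
        "z": z,
        "p_value": p_value,
        "ci": (spd - half_width, spd + half_width),
        "reject": p_value < alpha,
    }


result = wald_spd_test(pos_a=60, n_a=100, pos_b=40, n_b=100)
print(result["spd"], result["ci"], result["reject"])
```

Reporting the interval alongside the decision makes the effect of group size visible: the same point estimate on smaller groups gives a wider interval and may no longer reject.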

arXiv:2505.23437 [pdf, other]

Bounded-Abstention Pairwise Learning to Rank

Authors: Antonio Ferrara, Andrea Pugnana, Francesco Bonchi, Salvatore Ruggieri

Abstract: Ranking systems influence decision-making in high-stakes domains like health, education, and employment, where they can have substantial economic and social impacts. This makes the integration of safety mechanisms essential. One such mechanism is $\textit{abstention}$, which enables an algorithmic decision-making system to defer uncertain or low-confidence decisions to human experts. While abstention has been predominantly explored in the context of classification tasks, its application to other machine learning paradigms remains underexplored. In this paper, we introduce a novel method for abstention in pairwise learning-to-rank tasks. Our approach is based on thresholding the ranker's conditional risk: the system abstains from making a decision when the estimated risk exceeds a predefined threshold. Our contributions are threefold: a theoretical characterization of the optimal abstention strategy, a model-agnostic, plug-in algorithm for constructing abstaining ranking models, and a comprehensive empirical evaluation across multiple datasets, demonstrating the effectiveness of our approach.

Submitted 29 May, 2025; originally announced May 2025.
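
Illustrative sketch (not from the paper): the abstract above describes abstaining when the ranker's estimated conditional risk exceeds a threshold. The snippet below shows one simple reading of that rule, assuming a 0-1 pairwise loss and a plug-in estimate of the probability that item i should rank above item j. The function `decide_pair` and both of its parameters are hypothetical, and how the paper chooses the threshold is not shown.

```python
def decide_pair(prob_i_over_j, risk_threshold):
    """Return 'i>j', 'j>i', or 'abstain' for one pair of items.

    prob_i_over_j: estimated probability that item i should rank above j
                   (assumed to come from some already trained pairwise ranker).
    risk_threshold: maximum tolerated conditional risk before deferring.
    """
    if not 0.0 <= prob_i_over_j <= 1.0:
        raise ValueError("prob_i_over_j must lie in [0, 1]")

    # Under 0-1 loss, predicting the more likely order errs with
    # probability min(p, 1 - p); this is the estimated conditional risk.
    risk = min(prob_i_over_j, 1.0 - prob_i_over_j)

    if risk > risk_threshold:
        return "abstain"  # defer this comparison to a human expert
    return "i>j" if prob_i_over_j >= 0.5 else "j>i"


for p in (0.9, 0.55, 0.05):
    print(p, decide_pair(p, risk_threshold=0.2))
```

Lowering `risk_threshold` makes the system defer more pairs; raising it toward 0.5 recovers an ordinary ranker that never abstains.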

arXiv:2505.11396 [pdf, ps, other]

doi 10.1145/3711896.3736960

Finding Counterfactual Evidences for Node Classification

Authors: Dazhuo Qiu, Jinwen Chen, Arijit Khan, Yan Zhao, Francesco Bonchi

Abstract: Counterfactual learning is emerging as an important paradigm, rooted in causality, which promises to alleviate common issues of graph neural networks (GNNs), such as fairness and interpretability. However, as in many real-world application domains where conducting randomized controlled trials is impractical, one has to rely on available observational (factual) data to detect counterfactuals. In this paper, we introduce and tackle the problem of searching for counterfactual evidences for the GNN-based node classification task. A counterfactual evidence is a pair of nodes such that, even though they exhibit great similarity both in their features and in their neighborhood subgraph structures, they are classified differently by the GNN. We develop effective and efficient search algorithms and a novel indexing solution that leverages both node features and structural information to identify counterfactual evidences, and generalizes beyond any specific GNN. Through various downstream applications, we demonstrate the potential of counterfactual evidences to enhance fairness and accuracy of GNNs.

Submitted 2 June, 2025; v1 submitted 16 May, 2025; originally announced May 2025.

Comments: Accepted by KDD 2025

arXiv:2505.05306 [pdf, ps, other]

The calculus of neo-Peircean relations

Authors: Filippo Bonchi, Alessandro Di Giorgio, Nathan Haydon, Pawel Sobocinski

Abstract: The calculus of relations was introduced by De Morgan and Peirce during the second half of the 19th century, as an extension of Boole's algebra of classes. Later developments on quantification theory by Frege and Peirce himself paved the way to what is known today as first-order logic, causing the calculus of relations to be long forgotten. This was until 1941, when Tarski raised the question of the existence of a complete axiomatisation for it. This question found only negative answers: there is no finite axiomatisation for the calculus of relations and many of its fragments, as shown later by several no-go theorems. In this paper we show that -- by moving from traditional syntax (cartesian) to a diagrammatic one (monoidal) -- it is possible to have complete axiomatisations for the full calculus. The no-go theorems are circumvented by the fact that our calculus, named the calculus of neo-Peircean relations, is more expressive than the calculus of relations and, actually, as expressive as first-order logic. The axioms are obtained by combining two well known categorical structures: cartesian and linear bicategories.

Submitted 9 April, 2026; v1 submitted 8 May, 2025; originally announced May 2025.

Comments: arXiv admin note: substantial text overlap with arXiv:2401.07055

arXiv:2503.22819 [pdf, other]

Tape Diagrams for Monoidal Monads

Authors: Filippo Bonchi, Cipriano Junior Cioffo, Alessandro Di Giorgio, Elena Di Lavore

Abstract: Tape diagrams provide a graphical representation for arrows of rig categories, namely categories equipped with two monoidal structures, $\oplus$ and $\otimes$, where $\otimes$ distributes over $\oplus$. However, their applicability is limited to categories where $\oplus$ is a biproduct, i.e., both a categorical product and a coproduct. In this work, we extend tape diagrams to deal with Kleisli categories of symmetric monoidal monads, presented by algebraic theories.

Submitted 28 March, 2025; originally announced March 2025.

Comments: Submission under review

arXiv:2503.01942 [pdf, other]

Mathematical Foundation of Interpretable Equivariant Surrogate Models

Authors: Jacopo Joy Colombini, Filippo Bonchi, Francesco Giannini, Fosca Giannotti, Roberto Pellungrini, Patrizio Frosini

Abstract: This paper introduces a rigorous mathematical framework for neural network explainability, and more broadly for the explainability of equivariant operators called Group Equivariant Operators (GEOs) based on Group Equivariant Non-Expansive Operators (GENEOs) transformations. The central concept involves quantifying the distance between GEOs by measuring the non-commutativity of specific diagrams. Additionally, the paper proposes a definition of interpretability of GEOs according to a complexity measure that can be defined according to each user's preferences. Moreover, we explore the formal properties of this framework and show how it can be applied in classical machine learning scenarios, like image classification with convolutional neural networks.

Submitted 3 March, 2025; originally announced March 2025.

arXiv:2501.16076 [pdf, other]

Minimizing Polarization and Disagreement in the Friedkin-Johnsen Model with Unknown Innate Opinions

Authors: Federico Cinus, Atsushi Miyauchi, Yuko Kuroki, Francesco Bonchi

Abstract: The bulk of the literature on opinion optimization in social networks adopts the Friedkin-Johnsen (FJ) opinion dynamics model, in which the innate opinions of all nodes are known: this is an unrealistic assumption. In this paper, we study opinion optimization under the FJ model without full knowledge of innate opinions. Specifically, we borrow from the literature a series of objective functions, aimed at minimizing polarization and/or disagreement, and we tackle the budgeted optimization problem, where we can query the innate opinions of only a limited number of nodes. Given the complexity of our problem, we propose a framework based on three steps: (1) select the limited number of nodes we query, (2) reconstruct the innate opinions of all nodes based on those queried, and (3) optimize the objective function with the reconstructed opinions. For each step of the framework, we present and systematically evaluate several effective strategies. A key contribution of our work is a rigorous error propagation analysis that quantifies how reconstruction errors in innate opinions impact the quality of the final solutions. Our experiments on various synthetic and real-world datasets show that we can effectively minimize polarization and disagreement even if we have quite limited information about innate opinions.

Submitted 28 January, 2025; v1 submitted 27 January, 2025; originally announced January 2025.
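
Illustrative sketch (not from the paper): the abstract above refers to the Friedkin-Johnsen model and to polarization and disagreement objectives without spelling them out. The snippet below uses the standard FJ averaging update and definitions that are common in this literature (polarization as the squared deviation of expressed opinions from their mean, disagreement as the weighted squared difference across edges); these are assumptions here, and the paper's objective functions may differ. All function names are hypothetical.

```python
def fj_equilibrium(weights, innate, iterations=200):
    """Iterate the FJ update until (approximately) the equilibrium.

    weights: symmetric adjacency matrix with zero diagonal (list of lists).
    innate:  list of innate opinions s_i.
    Update:  z_i <- (s_i + sum_j w_ij * z_j) / (1 + sum_j w_ij).
    """
    n = len(innate)
    z = list(innate)
    for _ in range(iterations):
        z = [
            (innate[i] + sum(weights[i][j] * z[j] for j in range(n)))
            / (1.0 + sum(weights[i]))
            for i in range(n)
        ]
    return z


def polarization(z):
    """Sum of squared deviations of expressed opinions from their mean."""
    mean = sum(z) / len(z)
    return sum((zi - mean) ** 2 for zi in z)


def disagreement(weights, z):
    """Sum over edges {i, j} of w_ij * (z_i - z_j)^2."""
    n = len(z)
    return sum(
        weights[i][j] * (z[i] - z[j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


# Two agents joined by a unit-weight edge, with innate opinions 0 and 1.
W = [[0.0, 1.0], [1.0, 0.0]]
z = fj_equilibrium(W, [0.0, 1.0])
print(z, polarization(z), disagreement(W, z))
```

In the budgeted setting described in the abstract, `innate` would be only partially observed; a reconstructed vector could be passed to the same functions to evaluate a candidate solution.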

arXiv:2411.13187 [pdf, ps, other]

Engagement-Driven Content Generation with Large Language Models

Authors: Erica Coppolillo, Federico Cinus, Marco Minici, Francesco Bonchi, Giuseppe Manco

Abstract: Large Language Models (LLMs) demonstrate significant persuasive capabilities in one-on-one interactions, but their influence within social networks, where interconnected users and complex opinion dynamics pose unique challenges, remains underexplored. This paper addresses the research question: \emph{Can LLMs generate meaningful content that maximizes user engagement on social networks?} To answer this, we propose a pipeline using reinforcement learning with simulated feedback, where the network's response to LLM-generated content (i.e., the reward) is simulated through a formal engagement model. This approach bypasses the temporal cost and complexity of live experiments, enabling an efficient feedback loop between the LLM and the network under study. It also allows control over endogenous factors such as the LLM's position within the social network and the distribution of opinions on a given topic. Our approach is adaptive to the opinion distribution of the underlying network and agnostic to the specifics of the engagement model, which is embedded as a plug-and-play component. Such flexibility makes it suitable for more complex engagement tasks and interventions in computational social science. Using our framework, we analyze the performance of LLMs in generating social engagement under different conditions, showcasing their full potential in this task. The experimental code is publicly available at https://github.com/mminici/Engagement-Driven-Content-Generation.

Submitted 12 June, 2025; v1 submitted 20 November, 2024; originally announced November 2024.

arXiv:2410.10627 [pdf, ps, other]

Effectful Mealy Machines

Authors: Filippo Bonchi, Elena Di Lavore, Mario Román

Abstract: Effectful Mealy machines, which we introduce, are a generalization of Mealy machines with global effects determined by an effectful triple. We provide semantics of effectful Mealy machines in terms of both bisimilarity and traces: bisimilarity is characterized syntactically, via uniform feedback; traces are constructed coinductively in terms of streams. We prove that this framework characterizes standard causal processes and existing flavours of Mealy machine, bisimilarity, and trace equivalence. In the commutative case, we introduce a monoidal generalization of Raney's causal functions: monoidal causal processes.

Submitted 23 December, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

Comments: Conference version of "Effectful Mealy Machines: Bisimulation and Trace" (arXiv:2410.10627v2). 52 pages

MSC Class: 18M35

arXiv:2410.03561 [pdf, ps, other]

A Diagrammatic Algebra for Program Logics

Authors: Filippo Bonchi, Alessandro Di Giorgio, Elena Di Lavore

Abstract: Tape diagrams provide a convenient notation for arrows of rig categories, i.e., categories equipped with two monoidal products, $\oplus$ and $\otimes$, where $\otimes$ distributes over $\oplus$. In this work, we extend tape diagrams with traces over $\oplus$ in order to deal with iteration in imperative programming languages. More precisely, we introduce Kleene-Cartesian bicategories, namely rig categories where the monoidal structure provided by $\otimes$ is a cartesian bicategory, while the one provided by $\oplus$ is what we name a Kleene bicategory. We show that the associated language of tape diagrams is expressive enough to deal with imperative programs and the corresponding laws provide a proof system that is at least as powerful as the one of Hoare logic.

Submitted 4 October, 2024; originally announced October 2024.

Comments: arXiv admin note: text overlap with arXiv:2210.09950

arXiv:2409.16478 [pdf, other]

Algorithmic Drift: A Simulation Framework to Study the Effects of Recommender Systems on User Preferences

Authors: Erica Coppolillo, Simone Mungari, Ettore Ritacco, Francesco Fabbri, Marco Minici, Francesco Bonchi, Giuseppe Manco

Abstract: Digital platforms such as social media and e-commerce websites adopt Recommender Systems to provide value to the user. However, the social consequences deriving from their adoption are still unclear. Many scholars argue that recommenders may lead to detrimental effects, such as bias-amplification deriving from the feedback loop between algorithmic suggestions and users' choices. Nonetheless, the extent to which recommenders influence changes in users' leanings remains uncertain. In this context, it is important to provide a controlled environment for evaluating the recommendation algorithm before deployment. To address this, we propose a stochastic simulation framework that mimics user-recommender system interactions in a long-term scenario. In particular, we simulate the user choices by formalizing a user model, which comprises behavioral aspects, such as the user resistance towards the recommendation algorithm and their inertia in relying on the received suggestions. Additionally, we introduce two novel metrics for quantifying the algorithm's impact on user preferences, specifically in terms of drift over time. We conduct an extensive evaluation on multiple synthetic datasets, aiming at testing the robustness of our framework when considering different scenarios and hyper-parameter settings. The experimental results prove that the proposed methodology is effective in detecting and quantifying the drift in users' preferences by means of the simulation. All the code and data used to perform the experiments are publicly available.

Submitted 24 September, 2024; originally announced September 2024.

arXiv:2407.15643 [pdf, other]

doi 10.1145/3627673.3679786

Link Polarity Prediction from Sparse and Noisy Labels via Multiscale Social Balance

Authors: Marco Minici, Federico Cinus, Francesco Bonchi, Giuseppe Manco

Abstract: Signed Graph Neural Networks (SGNNs) have recently gained attention as an effective tool for several learning tasks on signed networks, i.e., graphs where edges have an associated polarity. One of these tasks is to predict the polarity of the links for which this information is missing, starting from the network structure and the other available polarities. However, when the available polarities are few and potentially noisy, such a task becomes challenging. In this work, we devise a semi-supervised learning framework that builds around the novel concept of \emph{multiscale social balance} to improve the prediction of link polarities in settings characterized by limited data quantity and quality. Our model-agnostic approach can seamlessly integrate with any SGNN architecture, dynamically reweighting the importance of each data sample while making strategic use of the structural information from unlabeled edges combined with social balance theory. Empirical validation demonstrates that our approach outperforms established baseline models, effectively addressing the limitations imposed by noisy and sparse data. This result underlines the benefits of incorporating multiscale social balance into SGNNs, opening new avenues for robust and accurate predictions in signed network analysis.

Submitted 22 July, 2024; originally announced July 2024.

arXiv:2404.18795 [pdf, other]

When Lawvere meets Peirce: an equational presentation of boolean hyperdoctrines

Authors: Filippo Bonchi, Alessandro Di Giorgio, Davide Trotta

Abstract: Fo-bicategories are a categorification of Peirce's calculus of relations. Notably, their laws provide a proof system for first-order logic that is both purely equational and complete. This paper illustrates a correspondence between fo-bicategories and Lawvere's hyperdoctrines. To streamline our proof, we introduce peircean bicategories, which offer a more succinct characterization of fo-bicategories.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.16676 [pdf, ps, other]

Multilayer Correlation Clustering

Authors: Atsushi Miyauchi, Florian Adriaens, Francesco Bonchi, Nikolaj Tatti

Abstract: In this paper, we establish Multilayer Correlation Clustering, a novel generalization of Correlation Clustering (Bansal et al., FOCS '02) to the multilayer setting. In this model, we are given a series of inputs of Correlation Clustering (called layers) over the common set $V$. The goal is then to find a clustering of $V$ that minimizes the $\ell_p$-norm ($p\geq 1$) of the disagreements vector, which is defined as the vector (with dimension equal to the number of layers), each element of which represents the disagreements of the clustering on the corresponding layer. For this generalization, we first design an $O(L\log n)$-approximation algorithm, where $L$ is the number of layers, based on the well-known region growing technique. We then study an important special case of our problem, namely the problem with the probability constraint. For this case, we first give an $(\alpha+2)$-approximation algorithm, where $\alpha$ is any possible approximation ratio for the single-layer counterpart. For instance, we can take $\alpha=2.5$ in general (Ailon et al., JACM '08) and $\alpha=1.73+\varepsilon$ for the unweighted case (Cohen-Addad et al., FOCS '23). Furthermore, we design a $4$-approximation algorithm, which improves the above approximation ratio of $\alpha+2=4.5$ for the general probability-constraint case. Computational experiments using real-world datasets demonstrate the effectiveness of our proposed algorithms.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2402.07718 [pdf, other]

Local Centrality Minimization with Quality Guarantees

Authors: Atsushi Miyauchi, Lorenzo Severini, Francesco Bonchi

Abstract: Centrality measures, quantifying the importance of vertices or edges, play a fundamental role in network analysis. To date, triggered by some positive approximability results, a large body of work has been devoted to studying centrality maximization, where the goal is to maximize the centrality score of a target vertex by manipulating the structure of a given network. On the other hand, due to the lack of such results, only very little attention has been paid to centrality minimization, despite its practical usefulness. In this study, we introduce a novel optimization model for local centrality minimization, where the manipulation is allowed only around the target vertex. We prove the NP-hardness of our model and that the most intuitive greedy algorithm has a quite limited performance in terms of approximation ratio. Then we design two effective approximation algorithms: The first algorithm is a highly-scalable algorithm that has an approximation ratio unachievable by the greedy algorithm, while the second algorithm is a bicriteria approximation algorithm that solves a continuous relaxation based on the Lovász extension, using a projected subgradient method. To the best of our knowledge, ours are the first polynomial-time algorithms with provable approximation guarantees for centrality minimization. Experiments using a variety of real-world networks demonstrate the effectiveness of our proposed algorithms: Our first algorithm is applicable to million-scale graphs and obtains much better solutions than those of scalable baselines, while our second algorithm is rather strong against adversarial instances.

Submitted 12 February, 2024; originally announced February 2024.

Comments: Accepted to The Web Conference 2024

arXiv:2402.01400 [pdf, other]

Query-Efficient Correlation Clustering with Noisy Oracle

Authors: Yuko Kuroki, Atsushi Miyauchi, Francesco Bonchi, Wei Chen

Abstract: We study a general clustering setting in which we have $n$ elements to be clustered, and we aim to perform as few queries as possible to an oracle that returns a noisy sample of the weighted similarity between two elements. Our setting encompasses many application domains in which the similarity function is costly to compute and inherently noisy. We introduce two novel formulations of online learning problems rooted in the paradigm of Pure Exploration in Combinatorial Multi-Armed Bandits (PE-CMAB): fixed confidence and fixed budget settings. For both settings, we design algorithms that combine a sampling strategy with a classic approximation algorithm for correlation clustering and study their theoretical guarantees. Our results are the first examples of polynomial-time algorithms that work for the case of PE-CMAB in which the underlying offline optimization problem is NP-hard.

Submitted 3 November, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: Accepted to NeurIPS 2024

arXiv:2401.16088 [pdf, other]

Fairness in Algorithmic Recourse Through the Lens of Substantive Equality of Opportunity

Authors: Andrew Bell, Joao Fonseca, Carlo Abrate, Francesco Bonchi, Julia Stoyanovich

Abstract: Algorithmic recourse -- providing recommendations to those affected negatively by the outcome of an algorithmic system on how they can take action and change that outcome -- has gained attention as a means of giving persons agency in their interactions with artificial intelligence (AI) systems. Recent work has shown that even if an AI decision-making classifier is "fair" (according to some reasonable criteria), recourse itself may be unfair due to differences in the initial circumstances of individuals, compounding disparities for marginalized populations and requiring them to exert more effort than others. There is a need to define more methods and metrics for evaluating fairness in recourse that span a range of normative views of the world, and specifically those that take into account time. Time is a critical element in recourse because the longer it takes an individual to act, the more the setting may change due to model or data drift. This paper seeks to close this research gap by proposing two notions of fairness in recourse that are in normative alignment with substantive equality of opportunity, and that consider time. The first considers the (often repeated) effort individuals exert per successful recourse event, and the second considers time per successful recourse event. Building upon an agent-based framework for simulating recourse, this paper demonstrates how much effort is needed to overcome disparities in initial circumstances. We then propose an intervention to improve the fairness of recourse by rewarding effort, and compare it to existing strategies.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.07055 [pdf, other]

Diagrammatic Algebra of First Order Logic

Authors: Filippo Bonchi, Alessandro Di Giorgio, Nathan Haydon, Pawel Sobocinski

Abstract: We introduce the calculus of neo-Peircean relations, a string diagrammatic extension of the calculus of binary relations that has the same expressivity as first order logic and comes with a complete axiomatisation. The axioms are obtained by combining two well known categorical structures: cartesian and linear bicategories.

Submitted 13 January, 2024; originally announced January 2024.

arXiv:2311.15756 [pdf, other]

doi 10.1109/TSP.2024.3401072

Learning Multi-Frequency Partial Correlation Graphs

Authors: Gabriele D'Acunto, Paolo Di Lorenzo, Francesco Bonchi, Stefania Sardellitti, Sergio Barbarossa

Abstract: Despite the large research effort devoted to learning dependencies between time series, the state of the art still faces a major limitation: existing methods learn partial correlations but fail to discriminate across distinct frequency bands. Motivated by many applications in which this differentiation is pivotal, we overcome this limitation by learning a block-sparse, frequency-dependent, partial correlation graph, in which layers correspond to different frequency bands, and partial correlations can occur over just a few layers. To this aim, we formulate and solve two nonconvex learning problems: the first has a closed-form solution and is suitable when there is prior knowledge about the number of partial correlations; the second hinges on an iterative solution based on successive convex approximation, and is effective for the general case where no prior knowledge is available. Numerical results on synthetic data show that the proposed methods outperform the current state of the art. Finally, the analysis of financial time series confirms that partial correlations exist only within a few frequency bands, underscoring how our methods enable the gaining of valuable insights that would be undetected without discriminating along the frequency domain.

Submitted 12 May, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: Accepted at IEEE Transactions on Signal Processing

Journal ref: IEEE Transactions on Signal Processing, vol. 72, pp. 2953-2969, 2024

arXiv:2311.00118 [pdf, other]

Extracting the Multiscale Causal Backbone of Brain Dynamics

Authors: Gabriele D'Acunto, Francesco Bonchi, Gianmarco De Francisci Morales, Giovanni Petri

Abstract: The bulk of the research effort on brain connectivity revolves around statistical associations among brain regions, which do not directly relate to the causal mechanisms governing brain dynamics. Here we propose the multiscale causal backbone (MCB) of brain dynamics, shared by a set of individuals across multiple temporal scales, and devise a principled methodology to extract it. Our approach leverages recent advances in multiscale causal structure learning and optimizes the trade-off between the model fit and its complexity. Empirical assessment on synthetic data shows the superiority of our methodology over a baseline based on canonical functional connectivity networks. When applied to resting-state fMRI data, we find sparse MCBs for both the left and right brain hemispheres. Thanks to its multiscale nature, our approach shows that at low-frequency bands, causal dynamics are driven by brain regions associated with high-level cognitive functions; at higher frequencies instead, nodes related to sensory processing play a crucial role. Finally, our analysis of individual multiscale causal structures confirms the existence of a causal fingerprint of brain connectivity, thus supporting the existing extensive research in brain connectivity fingerprinting from a causal perspective.

Submitted 19 March, 2024; v1 submitted 31 October, 2023; originally announced November 2023.

Comments: Accepted at the 3rd conference on Causal Learning and Reasoning (CLeaR 2024)

arXiv:2309.06969 [pdf, other]

doi 10.1145/3617694.3623251

Setting the Right Expectations: Algorithmic Recourse Over Time

Authors: Joao Fonseca, Andrew Bell, Carlo Abrate, Francesco Bonchi, Julia Stoyanovich

Abstract: Algorithmic systems are often called upon to assist in high-stakes decision making. In light of this, algorithmic recourse, the principle wherein individuals should be able to take action against an undesirable outcome made by an algorithmic system, is receiving growing attention. The bulk of the literature on algorithmic recourse to date focuses primarily on how to provide recourse to a single individual, overlooking a critical element: the effects of a continuously changing context. Disregarding these effects on recourse is a significant oversight, since, in almost all cases, recourse consists of an individual making a first, unfavorable attempt, and then being given an opportunity to make one or several attempts at a later date, when the context might have changed. This can create false expectations, as initial recourse recommendations may become less reliable over time due to model drift and competition for access to the favorable outcome between individuals. In this work we propose an agent-based simulation framework for studying the effects of a continuously changing environment on algorithmic recourse. In particular, we identify two main effects that can alter the reliability of recourse for individuals represented by the agents: (1) competition with other agents acting upon recourse, and (2) competition with new agents entering the environment. Our findings highlight that only a small set of specific parameterizations result in algorithmic recourse that is reliable for agents over time. Consequently, we argue that substantial additional work is needed to understand recourse reliability over time, and to develop recourse methods that reward agents' effort.

Submitted 13 September, 2023; originally announced September 2023.

arXiv:2308.14486 [pdf, other]

doi 10.1145/3583780.3615025

Rebalancing Social Feed to Minimize Polarization and Disagreement

Authors: Federico Cinus, Aristides Gionis, Francesco Bonchi

Abstract: Social media have great potential for enabling public discourse on important societal issues. However, adverse effects, such as polarization and echo chambers, greatly impact the benefits of social media and call for algorithms that mitigate these effects. In this paper, we propose a novel problem formulation aimed at slightly nudging users' social feeds in order to strike a balance between relevance and diversity, thus mitigating the emergence of polarization, without lowering the quality of the feed. Our approach is based on re-weighting the relative importance of the accounts that a user follows, so as to calibrate the frequency with which the content produced by various accounts is shown to the user. We analyze the convexity properties of the problem, demonstrating the non-matrix convexity of the objective function and the convexity of the feasible set. To efficiently address the problem, we develop a scalable algorithm based on projected gradient descent. We also prove that our problem statement is a proper generalization of the undirected-case problem so that our method can also be adopted for undirected social networks. As a baseline for comparison in the undirected case, we develop a semidefinite programming approach, which provides the optimal solution. Through extensive experiments on synthetic and real-world datasets, we validate the effectiveness of our approach, which outperforms non-trivial baselines, underscoring its ability to foster healthier and more cohesive online communities.

Submitted 28 August, 2023; originally announced August 2023.

Comments: Accepted for publication at ACM CIKM 2023

arXiv:2307.14849 [pdf, other]

Counterfactual Explanations for Graph Classification Through the Lenses of Density

Authors: Carlo Abrate, Giulia Preti, Francesco Bonchi

Abstract: Counterfactual examples have emerged as an effective approach to produce simple and understandable post-hoc explanations. In the context of graph classification, previous work has focused on generating counterfactual explanations by manipulating the most elementary units of a graph, i.e., removing an existing edge, or adding a non-existing one. In this paper, we claim that such language of explanation might be too fine-grained, and turn our attention to some of the main characterizing features of real-world complex networks, such as the tendency to close triangles, the existence of recurring motifs, and the organization into dense modules. We thus define a general density-based counterfactual search framework to generate instance-level counterfactual explanations for graph classifiers, which can be instantiated with different notions of dense substructures. In particular, we show two specific instantiations of this general framework: a method that searches for counterfactual graphs by opening or closing triangles, and a method driven by maximal cliques. We also discuss how the general method can be instantiated to exploit any other notion of dense substructures, including, for instance, a given taxonomy of nodes. We evaluate the effectiveness of our approaches in 7 brain network datasets and compare the counterfactual statements generated according to several widely-used metrics. Results confirm that adopting a semantic-relevant unit of change like density is essential to define versatile and interpretable counterfactual explanation methods.

Submitted 27 July, 2023; originally announced July 2023.

arXiv:2307.02817 [pdf, ps, other]

Exploiting Adjoints in Property Directed Reachability Analysis

Authors: Mayuko Kori, Flavio Ascari, Filippo Bonchi, Roberto Bruni, Roberta Gori, Ichiro Hasuo

Abstract: We formulate, in lattice-theoretic terms, two novel algorithms inspired by Bradley's property directed reachability algorithm. For finding safe invariants or counterexamples, the first algorithm exploits over-approximations of both forward and backward transition relations, expressed abstractly by the notion of adjoints. In the absence of adjoints, one can use the second algorithm, which exploits lower sets and their principals. As a notable example of application, we consider quantitative reachability problems for Markov Decision Processes.

Submitted 6 July, 2023; originally announced July 2023.

Comments: 44 pages, 11 figures, the full version of the paper accepted by CAV 2023

arXiv:2306.04828 [pdf, other]

doi 10.1145/3690624.3709301

Fast and Effective GNN Training through Sequences of Random Path Graphs

Authors: Francesco Bonchi, Claudio Gentile, Francesco Paolo Nerini, André Panisson, Fabio Vitale

Abstract: We present GERN, a novel scalable framework for training GNNs in node classification tasks, based on effective resistance, a standard tool in spectral graph theory. Our method progressively refines the GNN weights on a sequence of random spanning trees suitably transformed into path graphs which, despite their simplicity, are shown to retain essential topological and node information of the original input graph. The sparse nature of these path graphs substantially lightens the computational burden of GNN training. This not only enhances scalability but also improves accuracy in subsequent test phases, especially under small training set regimes, which are of great practical importance, as in many real-world scenarios labels may be hard to obtain. In these settings, our framework yields very good results as it effectively counters the training deterioration caused by overfitting when the training set is small. Our method also addresses common issues like over-squashing and over-smoothing while avoiding under-reaching phenomena. Although our framework is flexible and can be deployed in several types of GNNs, in this paper we focus on graph convolutional networks and carry out an extensive experimental investigation on a number of real-world graph benchmarks, where we achieve simultaneous improvement of training speed and test accuracy over a wide pool of representative baselines.

Submitted 24 February, 2025; v1 submitted 7 June, 2023; originally announced June 2023.

Comments: 16 pages, 8 figures; Accepted at KDD 2025

arXiv:2306.02696 [pdf, other]

Hyper-distance Oracles in Hypergraphs

Authors: Giulia Preti, Gianmarco De Francisci Morales, Francesco Bonchi

Abstract: We study point-to-point distance estimation in hypergraphs, where the query is parameterized by a positive integer s, which defines the required level of overlap for two hyperedges to be considered adjacent. To answer s-distance queries, we first explore an oracle based on the line graph of the given hypergraph and discuss its limitations: the main one is that the line graph is typically orders of magnitude larger than the original hypergraph. We then introduce HypED, a landmark-based oracle with a predefined size, built directly on the hypergraph, thus avoiding constructing the line graph. Our framework allows approximately answering vertex-to-vertex, vertex-to-hyperedge, and hyperedge-to-hyperedge s-distance queries for any value of s. A key observation at the basis of our framework is that, as s increases, the hypergraph becomes more fragmented. We show how this can be exploited to improve the placement of landmarks, by identifying the s-connected components of the hypergraph. For this task, we devise an efficient algorithm based on the union-find technique and a dynamic inverted index. We experimentally evaluate HypED on several real-world hypergraphs and prove its versatility in answering s-distance queries for different values of s. Our framework allows answering such queries in fractions of a millisecond, while allowing fine-grained control of the trade-off between index size and approximation error at creation time. Finally, we prove the usefulness of the s-distance oracle in two applications, namely, hypergraph-based recommendation and the approximation of the s-closeness centrality of vertices and hyperedges in the context of protein-to-protein interactions.

Submitted 19 March, 2024; v1 submitted 5 June, 2023; originally announced June 2023.

Comments: To appear in VLDBJ

arXiv:2303.14467 [pdf, other]

doi 10.1145/3653298

A Survey on the Densest Subgraph Problem and Its Variants

Authors: Tommaso Lanciano, Atsushi Miyauchi, Adriano Fazzone, Francesco Bonchi

Abstract: The Densest Subgraph Problem requires finding, in a given graph, a subset of vertices whose induced subgraph maximizes a measure of density. The problem has received a great deal of attention in the algorithmic literature since the early 1970s, with many variants proposed and many applications built on top of this basic definition. Recent years have witnessed a revival of research interest in this problem with several important contributions, including some groundbreaking results, published in 2022 and 2023. This survey provides a deep overview of the fundamental results and an exhaustive coverage of the many variants proposed in the literature, with special attention to the most recent results. The survey also presents a comprehensive overview of applications and discusses some interesting open problems for this evergreen research topic.

Submitted 18 April, 2024; v1 submitted 25 March, 2023; originally announced March 2023.

Comments: Accepted to ACM Computing Surveys

arXiv:2211.00980 [pdf, other]

doi 10.48786/edbt.2024.01

Balancing Utility and Fairness in Submodular Maximization (Technical Report)

Authors: Yanhao Wang, Yuchen Li, Francesco Bonchi, Ying Wang

Abstract: Submodular function maximization is a fundamental combinatorial optimization problem with plenty of applications -- including data summarization, influence maximization, and recommendation. In many of these problems, the goal is to find a solution that maximizes the average utility over all users, for each of whom the utility is defined by a monotone submodular function. However, when the population of users is composed of several demographic groups, another critical problem is whether the utility is fairly distributed across different groups. Although the \emph{utility} and \emph{fairness} objectives are both desirable, they might contradict each other, and, to the best of our knowledge, little attention has been paid to optimizing them jointly. To fill this gap, we propose a new problem called \emph{Bicriteria Submodular Maximization} (BSM) to balance utility and fairness. Specifically, it requires finding a fixed-size solution to maximize the utility function, subject to the value of the fairness function not being below a threshold. Since BSM is inapproximable within any constant factor, we focus on designing efficient instance-dependent approximation schemes. Our algorithmic proposal comprises two methods, with different approximation factors, obtained by converting a BSM instance into other submodular optimization problem instances. Using real-world and synthetic datasets, we showcase applications of our proposed methods in three submodular maximization problems: maximum coverage, influence maximization, and facility location.

Submitted 19 June, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

Comments: 14 pages, 11 figures; accepted to EDBT 2024

arXiv:2210.17234 [pdf, other]

doi 10.1038/s41598-022-21720-4

The language of opinion change on social media under the lens of communicative action

Authors: Corrado Monti, Luca Maria Aiello, Gianmarco De Francisci Morales, Francesco Bonchi

Abstract: Which messages are more effective at inducing a change of opinion in the listener? We approach this question within the frame of Habermas' theory of communicative action, which posits that the illocutionary intent of the message (its pragmatic meaning) is the key. Thanks to recent advances in natural language processing, we are able to operationalize this theory by extracting the latent social dimensions of a message, namely archetypes of social intent of language, that come from social exchange theory. We identify key ingredients to opinion change by looking at more than 46k posts and more than 3.5M comments on Reddit's r/ChangeMyView, a debate forum where people try to change each other's opinion and explicitly mark opinion-changing comments with a special flag called "delta". Comments that express no intent are about 77% less likely to change the mind of the recipient, compared to comments that convey at least one social dimension. Among the various social dimensions, the ones that are most likely to produce an opinion change are knowledge, similarity, and trust, which resonates with Habermas' theory of communicative action. We also find other new important dimensions, such as appeals to power or empathetic expressions of support. Finally, in line with theories of constructive conflict, yet contrary to the popular characterization of conflict as the bane of modern social media, our findings show that voicing conflict in the context of a structured public debate can promote integration, especially when it is used to counter another conflictive stance. By leveraging recent advances in natural language processing, our work provides an empirical framework for Habermas' theory, finds concrete examples of its effects in the wild, and suggests its possible extension with a more faceted understanding of intent interpreted as social dimensions of language.

Submitted 31 October, 2022; originally announced October 2022.

Comments: Main paper: 13 pages, 1 figure, 3 tables. Supplementary material: 9 pages, 6 figures, 8 tables

ACM Class: H.4.0; K.4.0

Journal ref: Nature Scientific Reports 12, 17920 (2022)

arXiv:2210.09950 [pdf, other]

doi 10.1145/3571257

Deconstructing the Calculus of Relations with Tape Diagrams

Authors: Filippo Bonchi, Alessandro Di Giorgio, Alessio Santamaria

Abstract: Rig categories with finite biproducts are categories with two monoidal products, where one is a biproduct and the other distributes over it. In this work we present tape diagrams, a sound and complete diagrammatic language for these categories, that can be intuitively thought of as string diagrams of string diagrams. We test the effectiveness of our approach against the positive fragment of Tarski's calculus of relations.

Submitted 18 October, 2022; originally announced October 2022.

Journal ref: Proceedings of the ACM on Programming Languages, Vol. 7, POPL 2023

arXiv:2208.14989 [pdf, other]

Learning Multiscale Non-stationary Causal Structures

Authors: Gabriele D'Acunto, Gianmarco De Francisci Morales, Paolo Bajardi, Francesco Bonchi

Abstract: This paper addresses a gap in the current state of the art by providing a solution for modeling causal relationships that evolve over time and occur at different time scales. Specifically, we introduce the multiscale non-stationary directed acyclic graph (MN-DAG), a framework for modeling multivariate time series data. Our contribution is twofold. Firstly, we expose a probabilistic generative model by leveraging results from spectral and causality theories. Our model allows sampling an MN-DAG according to user-specified priors on the time-dependence and multiscale properties of the causal graph. Secondly, we devise a Bayesian method named Multiscale Non-stationary Causal Structure Learner (MN-CASTLE) that uses stochastic variational inference to estimate MN-DAGs. The method also exploits information from the local partial correlation between time series over different time resolutions. The data generated from an MN-DAG reproduces well-known features of time series in different domains, such as volatility clustering and serial correlation. Additionally, we show the superior performance of MN-CASTLE on synthetic data with different multiscale and non-stationary properties compared to baseline models. Finally, we apply MN-CASTLE to identify the drivers of the natural gas prices in the US market. Causal relationships have strengthened during the COVID-19 outbreak and the Russian invasion of Ukraine, a fact that baseline methods fail to capture.
MN-CASTLE identifies the causal impact of critical economic drivers on natural gas prices, such as seasonal factors, economic uncertainty, oil prices, and gas storage deviations.

Submitted 17 November, 2023; v1 submitted 31 August, 2022; originally announced August 2022.

Journal ref: Transactions on Machine Learning Research, 2023, ISSN 2835-8856

arXiv:2208.04620 [pdf, other]

Cascade-based Echo Chamber Detection

Authors: Marco Minici, Federico Cinus, Corrado Monti, Francesco Bonchi, Giuseppe Manco

Abstract: Despite echo chambers in social media have been under considerable scrutiny, general models for their detection and analysis are missing. In this work, we aim to fill this gap by proposing a probabilistic generative model that explains social media footprints -- i.e., social network structure and propagations of information -- through a set of latent communities, characterized by a degree of echo-chamber behavior and by an opinion polarity. Specifically, echo chambers are modeled as communities that are permeable to pieces of information with similar ideological polarity, and impermeable to information of opposed leaning: this allows discriminating echo chambers from communities that lack a clear ideological alignment. To learn the model parameters we propose a scalable, stochastic adaptation of the Generalized Expectation Maximization algorithm, that optimizes the joint likelihood of observing social connections and information propagation. Experiments on synthetic data show that our algorithm is able to correctly reconstruct ground-truth latent communities with their degree of echo-chamber behavior and opinion polarity. Experiments on real-world data about polarized social and political debates, such as the Brexit referendum or the COVID-19 vaccine campaign, confirm the effectiveness of our proposal in detecting echo chambers. Finally, we show how our model can improve accuracy in auxiliary predictive tasks, such as stance detection and prediction of future propagations.

Submitted 9 August, 2022; originally announced August 2022.

Comments: Accepted for publication at ACM CIKM 2022

arXiv:2207.12196 [pdf, other]

On the Relation Between Opinion Change and Information Consumption on Reddit

Authors: Flavio Petruzzellis, Corrado Monti, Gianmarco De Francisci Morales, Francesco Bonchi

Abstract: While much attention has been devoted to the causes of opinion change, little is known about its consequences. Our study sheds a light on the relationship between one user's opinion change episode and subsequent behavioral change on an online social media, Reddit. In particular, we look at r/ChangeMyView, an online community dedicated to debating one's own opinions. Interestingly, this forum adopts a well-codified schema for explicitly self-reporting opinion change. Starting from this ground truth, we analyze changes in future online information consumption behavior that arise after a self-reported opinion change on sociopolitical topics; and in particular, operationalized in this work as the participation to sociopolitical subreddits. Such participation profile is important as it represents one's information diet, and is a reliable proxy for, e.g., political affiliation or health choices. We find that people who report an opinion change are significantly more likely to change their future participation in a specific subset of online communities. We characterize which communities are more likely to be abandoned after opinion change, and find a significant association (r=0.46) between propaganda-like language used in a community and the increase in chances of leaving it. We find comparable results (r=0.39) for the opposite direction, i.e., joining a community. This finding suggests how propagandistic communities act as a first gateway to internalize a shift in one's sociopolitical opinion.
Finally, we show that the textual content of the discussion associated with opinion change is indicative of which communities are going to be subject to a participation change. In fact, a predictive model based only on the opinion change post is able to pinpoint these communities with an AP@5 of 0.20, similar to what can be reached by using all the past history of participation in communities.

Submitted 25 July, 2022; originally announced July 2022.

Comments: To appear in Proceedings of the International AAAI Conference on Web and Social Media (ICWSM 2023)

ACM Class: J.4; K.4

arXiv:2205.05052 [pdf, other]

On learning agent-based models from data

Authors: Corrado Monti, Marco Pangallo, Gianmarco De Francisci Morales, Francesco Bonchi

Abstract: Agent-Based Models (ABMs) are used in several fields to study the evolution of complex systems from micro-level assumptions. However, ABMs typically can not estimate agent-specific (or "micro") variables: this is a major limitation which prevents ABMs from harnessing micro-level data availability and which greatly limits their predictive power. In this paper, we propose a protocol to learn the latent micro-variables of an ABM from data. The first step of our protocol is to reduce an ABM to a probabilistic model, characterized by a computationally tractable likelihood. This reduction follows two general design principles: balance of stochasticity and data availability, and replacement of unobservable discrete choices with differentiable approximations. Then, our protocol proceeds by maximizing the likelihood of the latent variables via a gradient-based expectation maximization algorithm. We demonstrate our protocol by applying it to an ABM of the housing market, in which agents with different incomes bid higher prices to live in high-income neighborhoods. We demonstrate that the obtained model allows accurate estimates of the latent variables, while preserving the general behavior of the ABM. We also show that our estimates can be used for out-of-sample forecasting. Our protocol can be seen as an alternative to black-box data assimilation methods, that forces the modeler to lay bare the assumptions of the model, to think about the inferential process, and to spot potential identification problems.

Submitted 23 November, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

arXiv:2202.08815 [pdf, other]

GRAPHSHAP: Explaining Identity-Aware Graph Classifiers Through the Language of Motifs

Authors: Alan Perotti, Paolo Bajardi, Francesco Bonchi, André Panisson

Abstract: Most methods for explaining black-box classifiers (e.g. on tabular data, images, or time series) rely on measuring the impact that removing/perturbing features has on the model output. This forces the explanation language to match the classifier's feature space. However, when dealing with graph data, in which the basic features correspond to the edges describing the graph structure, this matching between features space and explanation language might not be appropriate. Decoupling the feature space (edges) from a desired high-level explanation language (such as motifs) is thus a major challenge towards developing actionable explanations for graph classification tasks. In this paper we introduce GRAPHSHAP, a Shapley-based approach able to provide motif-based explanations for identity-aware graph classifiers, assuming no knowledge whatsoever about the model or its training data: the only requirement is that the classifier can be queried as a black-box at will. For the sake of computational efficiency we explore a progressive approximation strategy and show how a simple kernel can efficiently approximate explanation scores, thus allowing GRAPHSHAP to scale on scenarios with a large explanation space (i.e. large number of motifs). We showcase GRAPHSHAP on a real-world brain-network dataset consisting of patients affected by Autism Spectrum Disorder and a control group. Our experiments highlight how the classification provided by a black-box model can be effectively explained by few connectomics patterns.

Submitted 7 July, 2023; v1 submitted 17 February, 2022; originally announced February 2022.

Comments: Accepted by International Joint Conference on Neural Networks 2023 (IJCNN)

arXiv:2202.00640 [pdf, other]

doi 10.1145/3485447.3512143

Rewiring What-to-Watch-Next Recommendations to Reduce Radicalization Pathways

Authors: Francesco Fabbri, Yanhao Wang, Francesco Bonchi, Carlos Castillo, Michael Mathioudakis

Abstract: Recommender systems typically suggest to users content similar to what they consumed in the past. If a user happens to be exposed to strongly polarized content, she might subsequently receive recommendations which may steer her towards more and more radicalized content, eventually being trapped in what we call a "radicalization pathway". In this paper, we study the problem of mitigating radicalization pathways using a graph-based approach. Specifically, we model the set of recommendations of a "what-to-watch-next" recommender as a d-regular directed graph where nodes correspond to content items, links to recommendations, and paths to possible user sessions. We measure the "segregation" score of a node representing radicalized content as the expected length of a random walk from that node to any node representing non-radicalized content. High segregation scores are associated to larger chances to get users trapped in radicalization pathways. Hence, we define the problem of reducing the prevalence of radicalization pathways by selecting a small number of edges to "rewire", so to minimize the maximum of segregation scores among all radicalized nodes, while maintaining the relevance of the recommendations. We prove that the problem of finding the optimal set of recommendations to rewire is NP-hard and NP-hard to approximate within any factor. Therefore, we turn our attention to heuristics, and propose an efficient yet effective greedy algorithm based on the absorbing random walk theory.
Our experiments on real-world datasets in the context of video and news recommendations confirm the effectiveness of our proposal.

Submitted 1 February, 2022; originally announced February 2022.

Comments: To appear in the Web conference 2022 (WWW '22)

arXiv:2201.08005 [pdf]

doi 10.1145/3485447.3512191

FreSCo: Mining Frequent Patterns in Simplicial Complexes

Authors: Giulia Preti, Gianmarco De Francisci Morales, Francesco Bonchi

Abstract: Simplicial complexes are a generalization of graphs that model higher-order relations. In this paper, we introduce simplicial patterns -- that we call simplets -- and generalize the task of frequent pattern mining from the realm of graphs to that of simplicial complexes. Our task is particularly challenging due to the enormous search space and the need for higher-order isomorphism. We show that finding the occurrences of simplets in a complex can be reduced to a bipartite graph isomorphism problem, in linear time and at most quadratic space. We then propose an anti-monotonic frequency measure that allows us to start the exploration from small simplets and stop expanding a simplet as soon as its frequency falls below the minimum frequency threshold. Equipped with these ideas and a clever data structure, we develop a memory-conscious algorithm that, by carefully exploiting the relationships among the simplices in the complex and among the simplets, achieves efficiency and scalability for our complex mining task. Our algorithm, FreSCo, comes in two flavors: it can compute the exact frequency of the simplets or, more quickly, it can determine whether a simplet is frequent, without having to compute the exact frequency. Experimental results prove the ability of FreSCo to mine frequent simplets in complexes of various size and dimension, and the significance of the simplets with respect to the traditional graph patterns.

Submitted 26 January, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

Comments: To appear at The Web Conference 2022

arXiv:2112.15488 [pdf, ps, other]

doi 10.1145/3494561

Multi-relation Graph Summarization

Authors: Xiangyu Ke, Arijit Khan, Francesco Bonchi

Abstract: Graph summarization is beneficial in a wide range of applications, such as visualization, interactive and exploratory analysis, approximate query processing, reducing the on-disk storage footprint, and graph processing in modern hardware. However, the bulk of the literature on graph summarization surprisingly overlooks the possibility of having edges of different types. In this paper, we study the novel problem of producing summaries of multi-relation networks, i.e., graphs where multiple edges of different types may exist between any pair of nodes. Multi-relation graphs are an expressive model of real-world activities, in which a relation can be a topic in social networks, an interaction type in genetic networks, or a snapshot in temporal graphs. The first approach that we consider for multi-relation graph summarization is a two-step method based on summarizing each relation in isolation, and then aggregating the resulting summaries in some clever way to produce a final unique summary. In doing this, as a side contribution, we provide the first polynomial-time approximation algorithm based on the k-Median clustering for the classic problem of lossless single-relation graph summarization. Then, we demonstrate the shortcomings of these two-step methods, and propose holistic approaches, both approximate and heuristic algorithms, to compute a summary directly for multi-relation graphs.
In particular, we prove that the approximation bound of k-Median clustering for the single relation solution can be maintained in a multi-relation graph with proper aggregation operation over adjacency matrices corresponding to its multiple relations. Experimental results and case studies (on co-authorship networks and brain networks) validate the effectiveness and efficiency of the proposed algorithms.

Submitted 24 December, 2021; originally announced December 2021.

Comments: To appear, ACM TKDD

arXiv:2112.08237 [pdf, other]

Exposure Inequality in People Recommender Systems: The Long-Term Effects

Authors: Francesco Fabbri, Maria Luisa Croci, Francesco Bonchi, Carlos Castillo

Abstract: People recommender systems may affect the exposure that users receive in social networking platforms, influencing attention dynamics and potentially strengthening pre-existing inequalities that disproportionately affect certain groups. In this paper we introduce a model to simulate the feedback loop created by multiple rounds of interactions between users and a link recommender in a social network. This allows us to study the long-term consequences of those particular recommendation algorithms. Our model is equipped with several parameters to control (i) the level of homophily in the network, (ii) the relative size of the groups, (iii) the choice among several state-of-the-art link recommenders, and (iv) the choice among three different user behavior models, that decide which recommendations are accepted or rejected. Our extensive experimentation with the proposed model shows that a minority group, if homophilic enough, can get a disproportionate advantage in exposure from all link recommenders. Instead, when it is heterophilic, it gets under-exposed. Moreover, while the homophily level of the minority affects the speed of the growth of the disparate exposure, the relative size of the minority affects the magnitude of the effect. Finally, link recommenders strengthen exposure inequalities at the individual level, exacerbating the "rich-get-richer" effect: this happens for both the minority and the majority class and independently of their level of homophily.

Submitted 15 December, 2021; originally announced December 2021.

Comments: To appear in ICWSM 2022

arXiv:2112.03337 [pdf, other]

Dense and well-connected subgraph detection in dual networks

Authors: Tianyi Chen, Francesco Bonchi, David Garcia-Soriano, Atsushi Miyauchi, Charalampos E. Tsourakakis

Abstract: Dense subgraph discovery is a fundamental problem in graph mining with a wide range of applications \cite{gionis2015dense}. Despite a large number of applications ranging from computational neuroscience to social network analysis, that take as input a {\em dual} graph, namely a pair of graphs on the same set of nodes, dense subgraph discovery methods focus on a single graph input with few notable exceptions \cite{semertzidis2019finding,charikar2018finding,reinthal2016finding,jethava2015finding}. In this work, we focus the following problem: given a pair of graphs $G,H$ on the same set of nodes $V$, how do we find a subset of nodes $S \subseteq V$ that induces a well-connected subgraph in $G$ and a dense subgraph in $H$? Our formulation generalizes previous research on dual graphs \cite{Wu+15,WuZLFJZ16,Cui2018}, by enabling the {\em control} of the connectivity constraint on $G$. We propose a novel mathematical formulation based on $k$-edge connectivity, and prove that it is solvable exactly in polynomial time. We compare our method to state-of-the-art competitors; we find empirically that ranging the connectivity constraint enables the practitioner to obtain insightful information that is otherwise inaccessible. Finally, we show that our proposed mining tool can be used to better understand how users interact on Twitter, and connectivity aspects of human brain networks with and without Autism Spectrum Disorder (ASD).

Submitted 6 December, 2021; originally announced December 2021.

arXiv:2112.00626 [pdf, other]

The Effect of People Recommenders on Echo Chambers and Polarization

Authors: Federico Cinus, Marco Minici, Corrado Monti, Francesco Bonchi

Abstract: The effects of social media on critical issues, such as polarization and misinformation, are under scrutiny due to the disruptive consequences that these phenomena can have on our societies. Among the algorithms routinely used by social media platforms, people-recommender systems are of special interest, as they directly contribute to the evolution of the social network structure, affecting the information and the opinions users are exposed to. In this paper, we propose a framework to assess the effect of people recommenders on the evolution of opinions. Our proposal is based on Monte Carlo simulations combining link recommendation and opinion-dynamics models. In order to control initial conditions, we define a random network model to generate graphs with opinions, with tunable amounts of modularity and homophily. We join these elements into a methodology to study the effects of the recommender system on echo chambers and polarization. We also show how to use our framework to measure, by means of simulations, the impact of different intervention strategies. Our thorough experimentation shows that people recommenders can in fact lead to a significant increase in echo chambers. However, this happens only if there is considerable initial homophily in the network. Also, we find that if the network already contains echo chambers, the effect of the recommendation algorithm is negligible. Such findings are robust to two very different opinion dynamics models, a bounded confidence model and an epistemological model.

Submitted 1 December, 2021; originally announced December 2021.

Comments: To appear in: Proceedings of the International AAAI Conference on Web and Social Media, vol. 16 (ICWSM '22)

ACM Class: I.6; J.4

arXiv:2111.05072 [pdf, other]

doi 10.1145/3490354.3494370

The Evolving Causal Structure of Equity Risk Factors

Authors: Gabriele D'Acunto, Paolo Bajardi, Francesco Bonchi, Gianmarco De Francisci Morales

Abstract: In recent years, multi-factor strategies have gained increasing popularity in the financial industry, as they allow investors to have a better understanding of the risk drivers underlying their portfolios. Moreover, such strategies promise to promote diversification and thus limit losses in times of financial turmoil. However, recent studies have reported a significant level of redundancy between these factors, which might enhance risk contagion among multi-factor portfolios during financial crises. Therefore, it is of fundamental importance to better understand the relationships among factors. Empowered by recent advances in causal structure learning methods, this paper presents a study of the causal structure of financial risk factors and its evolution over time. In particular, the data we analyze covers 11 risk factors concerning the US equity market, spanning a period of 29 years at daily frequency. Our results show a statistically significant sparsifying trend of the underlying causal structure. However, this trend breaks down during periods of financial stress, in which we can observe a densification of the causal network driven by a growth of the out-degree of the market factor node. Finally, we present a comparison with the analysis of factors cross-correlations, which further confirms the importance of causal analysis for gaining deeper insights in the dynamics of the factor system, particularly during economic downturns. Our findings are especially significant from a risk-management perspective.
They link the evolution of the causal structure of equity risk factors with market volatility and a worsening macroeconomic environment, and show that, in times of financial crisis, exposure to different factors boils down to exposure to the market risk factor.

Submitted 9 November, 2021; originally announced November 2021.

Journal ref: ACM International Conference on AI in Finance, 2021

arXiv:2109.13589 [pdf, other]

doi 10.1145/3459637.3482444

Learning Ideological Embeddings from Information Cascades

Authors: Corrado Monti, Giuseppe Manco, Cigdem Aslay, Francesco Bonchi

Abstract: Modeling information cascades in a social network through the lenses of the ideological leaning of its users can help understanding phenomena such as misinformation propagation and confirmation bias, and devising techniques for mitigating their toxic effects. In this paper we propose a stochastic model to learn the ideological leaning of each user in a multidimensional ideological space, by analyzing the way politically salient content propagates. In particular, our model assumes that information propagates from one user to another if both users are interested in the topic and ideologically aligned with each other. To infer the parameters of our model, we devise a gradient-based optimization procedure maximizing the likelihood of an observed set of information cascades. Our experiments on real-world political discussions on Twitter and Reddit confirm that our model is able to learn the political stance of the social media users in a multidimensional ideological space.

Submitted 28 September, 2021; originally announced September 2021.

Comments: Published in CIKM 2021

ACM Class: J.4; G.3

Journal ref: Proceedings of the 30th ACM International Conference on Information and Knowledge Management (CIKM 2021)

arXiv:2109.06049 [pdf, other]

String Diagram Rewrite Theory III: Confluence with and without Frobenius

Authors: Filippo Bonchi, Fabio Gadducci, Aleks Kissinger, Paweł Sobociński, Fabio Zanasi

Abstract: In this paper we address the problem of proving confluence for string diagram rewriting, which was previously shown to be characterised combinatorically as double-pushout rewriting with interfaces (DPOI) on (labelled) hypergraphs. For standard DPO rewriting without interfaces, confluence for terminating rewrite systems is, in general, undecidable. Nevertheless, we show here that confluence for DPOI, and hence string diagram rewriting, is decidable. We apply this result to give effective procedures for deciding local confluence of symmetric monoidal theories with and without Frobenius structure by critical pair analysis. For the latter, we introduce the new notion of path joinability for critical pairs, which enables finitely many joins of a critical pair to be lifted to an arbitrary context in spite of the strong non-local constraints placed on rewriting in a generic symmetric monoidal theory.

Submitted 18 April, 2022; v1 submitted 13 September, 2021; originally announced September 2021.

Showing 1–50 of 120 results for author: Bonchi, F