Showing 1–9 of 9 results for author: Gamba, M

Searching in archive cs.
  1. arXiv:2512.24162  [pdf, ps, other]

    cs.CV

    Deep Probabilistic Supervision for Image Classification

    Authors: Anton Adelöw, Matteo Gamba, Atsuto Maki

    Abstract: Supervised training of deep neural networks for classification typically relies on hard targets, which promote overconfidence and can limit calibration, generalization, and robustness. Self-distillation methods aim to mitigate this by leveraging inter-class and sample-specific information present in the model's own predictions, but often remain dependent on hard targets without explicitly modeling…

    Submitted 5 February, 2026; v1 submitted 30 December, 2025; originally announced December 2025.

    Comments: 16 pages, 12 figures

  2. arXiv:2508.10490  [pdf, ps, other]

    cs.LG cs.AI cs.CV

    On the Complexity-Faithfulness Trade-off of Gradient-Based Explanations

    Authors: Amir Mehrpanah, Matteo Gamba, Kevin Smith, Hossein Azizpour

    Abstract: ReLU networks, while prevalent for visual data, have sharp transitions, sometimes relying on individual pixels for predictions, making vanilla gradient-based explanations noisy and difficult to interpret. Existing methods, such as GradCAM, smooth these explanations by producing surrogate models at the cost of faithfulness. We introduce a unifying spectral framework to systematically analyze and qu…

    Submitted 14 August, 2025; originally announced August 2025.

    Comments: 23 pages, 14 figures, to appear in the International Conference on Computer Vision 2025

  3. arXiv:2502.07783  [pdf, ps, other]

    cs.LG

    Curvature Tuning: Provable Training-free Model Steering From a Single Parameter

    Authors: Leyang Hu, Matteo Gamba, Randall Balestriero

    Abstract: The scaling of model and data sizes has reshaped the AI landscape, establishing the finetuning of pretrained models as the standard paradigm for solving downstream tasks. However, dominant finetuning methods typically rely on weight adaptation, often lack interpretability, and depend on heuristically chosen hyperparameters. In this paper, we take a different perspective and shift the focus from weights t…

    Submitted 15 January, 2026; v1 submitted 11 February, 2025; originally announced February 2025.

    Comments: Accepted at NeurIPS 2025

  4. arXiv:2301.12309  [pdf, ps, other]

    cs.LG

    On the Lipschitz Constant of Deep Networks and Double Descent

    Authors: Matteo Gamba, Hossein Azizpour, Mårten Björkman

    Abstract: Existing bounds on the generalization error of deep networks assume some form of smooth or bounded dependence on the input variable, falling short of investigating the mechanisms controlling such factors in practice. In this work, we present an extensive experimental study of the empirical Lipschitz constant of deep networks undergoing double descent, and highlight non-monotonic trends strongly co…

    Submitted 23 July, 2025; v1 submitted 28 January, 2023; originally announced January 2023.

  5. arXiv:2209.10080  [pdf, other]

    cs.LG stat.ML

    Deep Double Descent via Smooth Interpolation

    Authors: Matteo Gamba, Erik Englesson, Mårten Björkman, Hossein Azizpour

    Abstract: The ability of overparameterized deep networks to interpolate noisy data, while at the same time showing good generalization performance, has been recently characterized in terms of the double descent curve for the test error. Common intuition from polynomial regression suggests that overparameterized networks are able to sharply interpolate noisy data, without considerably deviating from the grou…

    Submitted 8 April, 2023; v1 submitted 20 September, 2022; originally announced September 2022.

  6. arXiv:2202.11749  [pdf, other]

    cs.LG

    Are All Linear Regions Created Equal?

    Authors: Matteo Gamba, Adrian Chmielewski-Anders, Josephine Sullivan, Hossein Azizpour, Mårten Björkman

    Abstract: The number of linear regions has been studied as a proxy of complexity for ReLU networks. However, the empirical success of network compression techniques like pruning and knowledge distillation suggests that in the overparameterized setting, linear region density might fail to capture the effective nonlinearity. In this work, we propose an efficient algorithm for discovering linear regions and u…

    Submitted 23 February, 2022; originally announced February 2022.

  7. arXiv:2003.07797  [pdf, other]

    cs.CV cs.LG

    Hyperplane Arrangements of Trained ConvNets Are Biased

    Authors: Matteo Gamba, Stefan Carlsson, Hossein Azizpour, Mårten Björkman

    Abstract: We investigate the geometric properties of the functions learned by trained ConvNets in the preactivation space of their convolutional layers, by performing an empirical study of hyperplane arrangements induced by a convolutional layer. We introduce statistics over the weights of a trained network to study local arrangements and relate them to the training dynamics. We observe that trained ConvNet…

    Submitted 14 April, 2023; v1 submitted 17 March, 2020; originally announced March 2020.

  8. arXiv:1512.09210  [pdf, other]

    math.NA cond-mat.mes-hall cond-mat.stat-mech cs.CE

    Galerkin Methods for Boltzmann-Poisson transport with reflection conditions on rough boundaries

    Authors: Jose A. Morales Escalante, Irene M. Gamba

    Abstract: We consider in this paper the mathematical and numerical modelling of reflective boundary conditions (BC) associated to Boltzmann - Poisson systems, including diffusive reflection in addition to specularity, in the context of electron transport in semiconductor device modelling at nano scales, and their implementation in Discontinuous Galerkin (DG) schemes. We study these BC on the physical bounda…

    Submitted 26 February, 2018; v1 submitted 30 December, 2015; originally announced December 2015.

    Comments: Paper accepted for publication in Journal of Computational Physics. Conclusions section expanded; title changed with respect to previous preprint version; new subsections added on simulations of a 2D double-gated MOSFET and on comparison of bulk silicon with collisionless plasma under reflective and periodic boundary conditions

    Journal ref: Journal of Computational Physics 363C (2018) pp. 302-328

  9. arXiv:1512.05403  [pdf, other]

    cs.CE cond-mat.mes-hall math.NA

    Discontinuous Galerkin Deterministic Solvers for a Boltzmann-Poisson Model of Hot Electron Transport by Averaged Empirical Pseudopotential Band Structures

    Authors: Jose Morales-Escalante, Irene M. Gamba, Yingda Cheng, Armando Majorana, Chi-Wang Shu, James Chelikowsky

    Abstract: The purpose of this work is to incorporate numerically, in a discontinuous Galerkin (DG) solver of a Boltzmann-Poisson model for hot electron transport, an electronic conduction band whose values are obtained by the spherical averaging of the full band structure given by a local empirical pseudopotential method (EPM) around a local minimum of the conduction band for silicon, as a midpoint between…

    Submitted 17 January, 2018; v1 submitted 16 December, 2015; originally announced December 2015.

    Comments: Submitted to CMAME (Computer Methods in Applied Mechanics and Engineering) as a reply to the reviewers in February 2017

    Journal ref: Computer Methods in Applied Mechanics and Engineering, Volume 321, 2017, Pages 209-234