Showing 1–39 of 39 results for author: Vella, F

Searching in archive cs.
  1. Communication-Avoiding SpGEMM via Trident Partitioning on Hierarchical GPU Interconnects

    Authors: Julian Bellavita, Lorenzo Pichetti, Thomas Pasquali, Flavio Vella, Giulia Guidi

    Abstract: The multiplication of two sparse matrices, known as SpGEMM, is a key kernel in scientific computing and large-scale data analytics, underpinning graph algorithms, machine learning, simulations, and computational biology, where sparsity is often highly unstructured. The unstructured sparsity makes achieving high performance challenging because it limits both memory efficiency and scalability. In di…

    Submitted 24 March, 2026; v1 submitted 22 March, 2026; originally announced March 2026.

    Journal ref: 2026 International Conference on Supercomputing (ICS '26), July 06–09, 2026, Belfast, United Kingdom

  2. arXiv:2603.16105

    cs.CL cs.AI

    Frequency Matters: Fast Model-Agnostic Data Curation for Pruning and Quantization

    Authors: Francesco Pio Monaco, Elia Cunegatti, Flavio Vella, Giovanni Iacca

    Abstract: Post-training model compression is essential for enhancing the portability of Large Language Models (LLMs) while preserving their performance. While several compression approaches have been proposed, less emphasis has been placed on selecting the most suitable set of data (the so-called calibration data) for finding the compressed model configuration. The choice of calibration data is a cri…

    Submitted 7 April, 2026; v1 submitted 17 March, 2026; originally announced March 2026.

    Comments: Added in appendix an expanded multilingual study. 19 pages

    ACM Class: I.2.7

  3. arXiv:2601.19413

    cs.NI cs.DC

    NET4EXA: Pioneering the Future of Interconnects for Supercomputing and AI

    Authors: Michele Martinelli, Roberto Ammendola, Andrea Biagioni, Carlotta Chiarini, Ottorino Frezza, Francesca Lo Cicero, Alessandro Lonardo, Pier Stanislao Paolucci, Elena Pastorelli, Pierpaolo Perticaroli, Luca Pontisso, Cristian Rossi, Francesco Simula, Piero Vicini, David Colin, Grégoire Pichon, Alexandre Louvet, John Gliksberg, Claire Chen, Matteo Turisini, Andrea Monterubbiano, Jean-Philippe Nominé, Denis Dutoit, Hugo Taboada, Lilia Zaourar, et al. (19 additional authors not shown)

    Abstract: NET4EXA aims to develop a next-generation high-performance interconnect for HPC and AI systems, addressing the increasing demands of large-scale infrastructures, such as those required for training Large Language Models. Building upon the proven BXI (Bull eXascale Interconnect) European technology used in TOP15 supercomputers, NET4EXA will deliver the new BXI release, BXIv3, a complete hardware an…

    Submitted 27 January, 2026; originally announced January 2026.

  4. arXiv:2601.17136

    cs.DC cs.LG

    Communication-Avoiding Linear Algebraic Kernel K-Means on GPUs

    Authors: Julian Bellavita, Matthew Rubino, Nakul Iyer, Andrew Chang, Aditya Devarakonda, Flavio Vella, Giulia Guidi

    Abstract: Clustering is an important tool in data analysis, with K-means being popular for its simplicity and versatility. However, it cannot handle non-linearly separable clusters. Kernel K-means addresses this limitation but requires a large kernel matrix, making it computationally and memory intensive. Prior work has accelerated Kernel K-means by formulating it using sparse linear algebra primitives and…

    Submitted 27 January, 2026; v1 submitted 23 January, 2026; originally announced January 2026.

    Journal ref: Proceedings of the 40th IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2026

  5. Architecture, Simulation and Software Stack to Support Post-CMOS Accelerators: The ARCHYTAS Project

    Authors: Giovanni Agosta, Stefano Cherubin, Derek Christ, Francesco Conti, Asbjørn Djupdal, Matthias Jung, Georgios Keramidas, Roberto Passerone, Paolo Rech, Elisa Ricci, Philippe Velha, Flavio Vella, Kasim Sinan Yildirim, Nils Wilbert

    Abstract: ARCHYTAS aims to design and evaluate non-conventional hardware accelerators, in particular, optoelectronic, volatile and non-volatile processing-in-memory, and neuromorphic, to tackle the power, efficiency, and scalability bottlenecks of AI with an emphasis on defense use cases (e.g., autonomous vehicles, surveillance drones, maritime and space platforms). In this paper, we present the system arch…

    Submitted 18 October, 2025; originally announced October 2025.

    Journal ref: 2025 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

  6. arXiv:2505.21404

    cs.LG math.OC

    Dual Natural Gradient Descent for Scalable Training of Physics-Informed Neural Networks

    Authors: Anas Jnini, Flavio Vella

    Abstract: Natural-gradient methods markedly accelerate the training of Physics-Informed Neural Networks (PINNs), yet their Gauss–Newton update must be solved in the parameter space, incurring a prohibitive $O(n^3)$ time complexity, where $n$ is the number of network trainable weights. We show that exactly the same step can instead be formulated in a generally smaller residual space of size…

    Submitted 8 October, 2025; v1 submitted 27 May, 2025; originally announced May 2025.

  7. arXiv:2503.11196

    physics.flu-dyn cs.LG

    Physics-constrained DeepONet for Surrogate CFD models: a curved backward-facing step case

    Authors: Anas Jnini, Harshinee Goordoyal, Sujal Dave, Flavio Vella, Katharine H. Fraser, Artem Korobenko

    Abstract: The Physics-Constrained DeepONet (PC-DeepONet), an architecture that incorporates fundamental physics knowledge into the data-driven DeepONet model, is presented in this study. This methodology is exemplified through surrogate modeling of fluid dynamics over a curved backward-facing step, a benchmark problem in computational fluid dynamics. The model was trained on computational fluid dynamics dat…

    Submitted 14 March, 2025; originally announced March 2025.

  8. arXiv:2503.00755

    cs.LG

    Riemann Tensor Neural Networks: Learning Conservative Systems with Physics-Constrained Networks

    Authors: Anas Jnini, Lorenzo Breschi, Flavio Vella

    Abstract: Divergence-free symmetric tensors (DFSTs) are fundamental in continuum mechanics, encoding conservation laws such as mass and momentum conservation. We introduce Riemann Tensor Neural Networks (RTNNs), a novel neural architecture that inherently satisfies the DFST condition to machine precision, providing a strong inductive bias for enforcing these conservation laws. We prove that RTNNs can approx…

    Submitted 16 June, 2025; v1 submitted 2 March, 2025; originally announced March 2025.

    Comments: To be published in the Proceedings of the Forty-Second International Conference on Machine Learning

  9. Popcorn: Accelerating Kernel K-means on GPUs through Sparse Linear Algebra

    Authors: Julian Bellavita, Thomas Pasquali, Laura Del Rio Martin, Flavio Vella, Giulia Guidi

    Abstract: K-means is a popular clustering algorithm with significant applications in numerous scientific and engineering areas. One drawback of K-means is its inability to identify non-linearly separable clusters, which may lead to inaccurate solutions in certain cases. Kernel K-means is a variant of classical K-means that can find non-linearly separable clusters. However, it scales quadratically with respe…

    Submitted 9 January, 2025; originally announced January 2025.

  10. The GECo algorithm for Graph Neural Networks Explanation

    Authors: Salvatore Calderaro, Domenico Amato, Giosuè Lo Bosco, Riccardo Rizzo, Filippo Vella

    Abstract: Graph Neural Networks (GNNs) are powerful models that can manage complex data sources and their interconnection links. One of GNNs' main drawbacks is their lack of interpretability, which limits their application in sensitive fields. In this paper, we introduce a new methodology involving graph communities to address the interpretability of graph classification problems. The proposed method, calle…

    Submitted 18 November, 2024; originally announced November 2024.

  11. arXiv:2409.09874

    cs.DC cs.ET cs.PF

    The Landscape of GPU-Centric Communication

    Authors: Didem Unat, Ilyas Turimbetov, Mohammed Kefah Taha Issa, Doğan Sağbili, Flavio Vella, Daniele De Sensi, Ismayil Ismayilov

    Abstract: In recent years, GPUs have become the preferred accelerators for HPC and ML applications due to their parallelism and fast memory bandwidth. While GPUs boost computation, inter-GPU communication can create scalability bottlenecks, especially as the number of GPUs per node and cluster grows. Traditionally, the CPU managed multi-GPU communication, but advancements in GPU-centric communication now ch…

    Submitted 22 February, 2026; v1 submitted 15 September, 2024; originally announced September 2024.

  12. arXiv:2408.14090

    cs.DC cs.AI cs.AR cs.NI cs.PF

    Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects

    Authors: Daniele De Sensi, Lorenzo Pichetti, Flavio Vella, Tiziano De Matteis, Zebin Ren, Luigi Fusco, Matteo Turisini, Daniele Cesarini, Kurt Lust, Animesh Trivedi, Duncan Roweth, Filippo Spiga, Salvatore Di Girolamo, Torsten Hoefler

    Abstract: Multi-GPU nodes are increasingly common in the rapidly evolving landscape of exascale supercomputers. On these systems, GPUs on the same node are connected through dedicated networks, with bandwidths up to a few terabits per second. However, gauging performance expectations and maximizing system efficiency is challenging due to different technologies, design options, and software layers. This pape…

    Submitted 15 November, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

    ACM Class: C.2.4; C.5.1; C.2.1; C.4

    Journal ref: Published in Proceedings of The International Conference for High Performance Computing Networking, Storage, and Analysis (SC '24) (2024)

  13. arXiv:2408.11551

    cs.DC

    High Performance Unstructured SpMM Computation Using Tensor Cores

    Authors: Patrik Okanovic, Grzegorz Kwasniewski, Paolo Sylos Labini, Maciej Besta, Flavio Vella, Torsten Hoefler

    Abstract: High-performance sparse matrix-matrix (SpMM) multiplication is paramount for science and industry, as the ever-increasing sizes of data prohibit using dense data structures. Yet, existing hardware, such as Tensor Cores (TC), is ill-suited for SpMM, as it imposes strict constraints on data structures that cannot be met by unstructured sparsity found in many applications. To address this, we introdu…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: Accepted by 2024 International Conference on High Performance Computing, Networking, Storage and Analysis, 2023 (SC'24)

  14. arXiv:2408.10484

    quant-ph cs.ET

    Dependable Classical-Quantum Computer Systems Engineering

    Authors: Edoardo Giusto, Santiago Nuñez-Corrales, Phuong Cao, Alessandro Cilardo, Ravishankar K. Iyer, Weiwen Jiang, Paolo Rech, Flavio Vella, Bartolomeo Montrucchio, Samudra Dasgupta, Travis S. Humble

    Abstract: Quantum Computing (QC) offers the potential to enhance traditional High-Performance Computing (HPC) workloads by leveraging the unique properties of quantum computers, leading to the emergence of a new paradigm: HPC-QC. While this integration presents new opportunities, it also brings novel challenges, particularly in ensuring the dependability of such hybrid systems. This paper aims to identify i…

    Submitted 19 August, 2024; originally announced August 2024.

  15. arXiv:2408.09229

    cs.DC

    cuVegas: Accelerate Multidimensional Monte Carlo Integration through a Parallelized CUDA-based Implementation of the VEGAS Enhanced Algorithm

    Authors: Emiliano Tolotti, Anas Jnini, Flavio Vella, Roberto Passerone

    Abstract: This paper introduces cuVegas, a CUDA-based implementation of the Vegas Enhanced Algorithm (VEGAS+), optimized for multi-dimensional integration in GPU environments. The VEGAS+ algorithm is an advanced form of Monte Carlo integration, recognized for its adaptability and effectiveness in handling complex, high-dimensional integrands. It employs a combination of variance reduction techniques, namely…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: 25 pages, 8 figures

  16. arXiv:2407.12024

    cs.HC cs.AI

    Leveraging Large Language Models for enhanced personalised user experience in Smart Homes

    Authors: Jordan Rey-Jouanchicot, André Bottaro, Eric Campo, Jean-Léon Bouraoui, Nadine Vigouroux, Frédéric Vella

    Abstract: Smart home automation systems aim to improve the comfort and convenience of users in their living environment. However, adapting automation to user needs remains a challenge. Indeed, many systems still rely on hand-crafted routines for each smart object. This paper presents an original smart home architecture leveraging Large Language Models (LLMs) and user preferences to push the boundaries of per…

    Submitted 28 June, 2024; originally announced July 2024.

  17. On the Efficacy of Surface Codes in Compensating for Radiation Events in Superconducting Devices

    Authors: Marzio Vallero, Gioele Casagranda, Flavio Vella, Paolo Rech

    Abstract: Reliability is fundamental for developing large-scale quantum computers. Since the benefit of technological advancements to the qubit's stability is saturating, algorithmic solutions, such as quantum error correction (QEC) codes, are needed to bridge the gap to reliable computation. Unfortunately, the deployment of the first quantum computers has identified faults induced by natural radiation as a…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 10 pages, 8 figures

  18. State of practice: evaluating GPU performance of state vector and tensor network methods

    Authors: Marzio Vallero, Flavio Vella, Paolo Rech

    Abstract: The frontier of quantum computing (QC) simulation on classical hardware is quickly reaching the hard scalability limits for computational feasibility. Nonetheless, there is still a need to simulate large quantum systems classically, as the Noisy Intermediate Scale Quantum (NISQ) devices are yet to be considered fault tolerant and performant enough in terms of operations per second. Each of the two…

    Submitted 3 February, 2025; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: 13 pages, 10 figures, 1 table

  19. arXiv:2311.13898

    cs.HC

    HandiMathKey-Device

    Authors: Frédéric Vella, Nathalie Dubus, Eloise Grolleau, Marjorie Deleau, Cécile Malet, Christine Gallard, Véronique Ades, Nadine Vigouroux

    Abstract: Typing mathematics is sometimes difficult with text editor functions for students with motor impairment and other associated impairments (visual, cognitive). Based on the HandiMathKey software keyboard, a user-centred design method involving the ecosystem of disabled students was applied to design the HMK-D physical keyboard for mathematical input. We opted for the Stream Deck device because of its…

    Submitted 23 November, 2023; originally announced November 2023.

    Comments: Universal Access in Human-Computer Interaction. HCII 2023, Jul 2023, Copenhagen (Virtual), Denmark

  20. arXiv:2311.13894

    cs.HC

    A first step towards an ecosystem meta-model for human-centered design in case of disabled users

    Authors: Christophe Kolski, Nadine Vigouroux, Yohan Guerrier, Frédéric Vella, Marine Guffroy

    Abstract: The involvement of the ecosystem or social environment of the disabled user is considered as very useful and even essential for the human-centered design of assistive technologies. In the era of model-based approaches, the modeling of the ecosystem is therefore to be considered. The first version of a metamodel of ecosystem is proposed. It is illustrated through a first case study. It concerns a p…

    Submitted 23 November, 2023; originally announced November 2023.

    Journal ref: Disab2023 Engineering Interactive Computing Systems for People with Disabilities, Jun 2023, Swansea, United Kingdom

  21. Design Recommendations Based on Speech Analysis for Disability-Friendly Interfaces for the Control of a Home Automation Environment

    Authors: Nadine Vigouroux, Frédéric Vella, Gaëlle Lepage, Éric Campo

    Abstract: The objective of this paper is to describe the study on speech interaction mode for home automation control of equipment by impaired people for an inclusive housing. The study is related to the HIP HOPE project concerning a building of 19 inclusive housing units. 7 participants with different types of disabilities were invited to carry out use cases using voice and touch control. Only the results…

    Submitted 22 November, 2023; originally announced November 2023.

    Journal ref: Universal Access in Human-Computer Interaction. HCII 2023, Jul 2023, Copenhagen (Virtual), Denmark. pp.197-211

  22. arXiv:2306.00606

    cs.SI cs.DS

    Scaling Expected Force: Efficient Identification of Key Nodes in Network-based Epidemic Models

    Authors: Paolo Sylos Labini, Andrej Jurco, Matteo Ceccarello, Stefano Guarino, Enrico Mastrostefano, Flavio Vella

    Abstract: Centrality measures are fundamental tools of network analysis as they highlight the key actors within the network. This study focuses on a newly proposed centrality measure, Expected Force (EF), and its use in identifying spreaders in network-based epidemic models. We found that EF effectively predicts the spreading power of nodes and identifies key nodes and immunization targets. However, its hig…

    Submitted 1 June, 2023; originally announced June 2023.

  23. Multi-GPU aggregation-based AMG preconditioner for iterative linear solvers

    Authors: Massimo Bernaschi, Alessandro Celestini, Pasqua D'Ambra, Flavio Vella

    Abstract: We present and release in open source format a sparse linear solver which efficiently exploits heterogeneous parallel computers. The solver can be easily integrated into scientific applications that need to solve large and sparse linear systems on modern parallel computers made of hybrid nodes hosting NVIDIA Graphics Processing Unit (GPU) accelerators. The work extends our previous efforts in th…

    Submitted 4 March, 2023; originally announced March 2023.

    Journal ref: IEEE Transactions on Parallel and Distributed Systems (2023)

  24. Towards a learning-based performance modeling for accelerating Deep Neural Networks

    Authors: Damiano Perri, Paolo Sylos Labini, Osvaldo Gervasi, Sergio Tasso, Flavio Vella

    Abstract: Emerging applications such as Deep Learning are often data-driven, thus traditional approaches based on auto-tuners are not performance effective across the wide range of inputs used in practice. In the present paper, we start an investigation of predictive models based on machine learning techniques in order to optimize Convolution Neural Networks (CNNs). As a use-case, we focus on the ARM Comput…

    Submitted 9 December, 2022; originally announced December 2022.

  25. arXiv:2211.13079

    cs.HC

    User Centred Method to Design a Platform to Design Augmentative and Alternative Communication Assistive Technologies

    Authors: Frédéric Vella, Flavien Clastres-Babou, Nadine Vigouroux, Philippe Truillet, Charline Calmels, Caroline Mercadier, Karine Gigaud, Margot Issanchou, Kristina Gourinovitch, Anne Garaix

    Abstract: We describe a co-design approach to design the online WebSoKeyTo used to design AAC. This co-design was carried out between a team of therapists and a team of human-computer interaction researchers. Our approach begins with the use and evaluation of an existing SoKeyTo AAC design application. This step was essential in the awareness and definition of the needs by the therapists and in the understa…

    Submitted 23 November, 2022; originally announced November 2022.

    Journal ref: HCI INTERNATIONAL 2022 24TH INTERNATIONAL CONFERENCE ON HUMAN-COMPUTER INTERACTION, Jun 2022, Virtual conference, France. pp.559-571, ⟨10.1007/978-3-031-17902-0_40⟩

  26. arXiv:2211.13078

    cs.HC

    Participation of Stakeholder in the Design of a Conception Application of Augmentative and Alternative Communication

    Authors: Frédéric Vella, Flavien Clastres-Babou, Frédéric Vella, Nadine Vigouroux, Philippe Truillet, Nadine Vigouroux, Charline Calmels, Caroline Mercadier, Karine Gigaud, Margot Issanchou, Kristina Gourinovitch, Anne Garaix

    Abstract: The objective of this paper is to describe the implication of an interdisciplinary team involved during a user-centered design methodology to design the platform (WebSoKeyTo) that meets the needs of therapists to design augmentative and alternative communication (AAC) aids for disabled users. We describe the processes of the design process and the role of the various actors (therapists and human c…

    Submitted 23 November, 2022; originally announced November 2022.

    Journal ref: ICCHP-AAATE 2022 Open Access Compendium "Assistive Technology, Accessibility and (e)Inclusion", Jul 2022, Lecco, Italy. ⟨10.35011/icchp-aaate22-p1-17⟩

  27. arXiv:2211.13058

    cs.HC cs.NI

    IDEALI: intuitively localising connected devices in order to support autonomy

    Authors: Frédéric Vella, Réjane Dalcé, Antonio Serpa, Thierry Val, Adrien van Den Bossche, Frédéric Vella, Nadine Vigouroux

    Abstract: The ability to localise a smart device is very useful to visually or cognitively impaired people. Localisation-capable technologies are becoming more readily available as off-the-shelf components. In this paper, we highlight the need for such a service in the field of health and autonomy, especially for disabled people. We introduce a model for Semantic Position Description (SPD) (e.g. "The pill o…

    Submitted 23 November, 2022; originally announced November 2022.

  28. arXiv:2211.13042

    cs.HC

    Usability Study of Tactile and Voice Interaction Modes by People with Disabilities for Home Automation Controls

    Authors: Nadine Vigouroux, Frédéric Vella, Gaëlle Lepage, Eric Campo

    Abstract: This paper presents a comparative usability study on tactile and vocal interaction modes for home automation control of equipment at home for different profiles of disabled people. The study is related to the HIP HOPE project concerning the construction of 19 inclusive housing in the Toulouse metropolitan area in France. The experimentation took place in a living lab with 7 different disabled peop…

    Submitted 23 November, 2022; originally announced November 2022.

    Journal ref: ICCHP-AAATE 2022 Open Access Compendium "Assistive Technology, Accessibility and (e)Inclusion", Jul 2022, Lecco, Italy. pp.139-147, ⟨10.1007/978-3-031-08645-8_17⟩

  29. arXiv:2208.11469

    cs.DC cs.DS

    ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations

    Authors: Maciej Besta, Cesare Miglioli, Paolo Sylos Labini, Jakub Tětek, Patrick Iff, Raghavendra Kanakagiri, Saleh Ashkboos, Kacper Janda, Michal Podstawski, Grzegorz Kwasniewski, Niels Gleinig, Flavio Vella, Onur Mutlu, Torsten Hoefler

    Abstract: Important graph mining problems such as Clustering are computationally demanding. To significantly accelerate these problems, we propose ProbGraph: a graph representation that enables simple and fast approximate parallel graph mining with strong theoretical guarantees on work, depth, and result accuracy. The key idea is to represent sets of vertices using probabilistic set representations such as…

    Submitted 21 November, 2022; v1 submitted 24 August, 2022; originally announced August 2022.

    Comments: Best Paper Award at ACM/IEEE Supercomputing'22 (SC22)

    Journal ref: Proceedings of the ACM/IEEE International Conference on High Performance Computing, Networking, Storage and Analysis, November 2022

  30. arXiv:2202.13976

    cs.DC

    Asynchronous Distributed-Memory Triangle Counting and LCC with RMA Caching

    Authors: András Strausz, Flavio Vella, Salvatore Di Girolamo, Maciej Besta, Torsten Hoefler

    Abstract: Triangle count and local clustering coefficient are two core metrics for graph analysis. They find broad application in analyses such as community detection and link recommendation. Current state-of-the-art solutions suffer from synchronization overheads or expensive pre-computations needed to distribute the graph, achieving limited scaling capabilities. We propose a fully asynchronous implementat…

    Submitted 1 March, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

    Comments: 11 pages, 10 figures, to be published at IPDPS'22

  31. arXiv:2202.05868

    cs.DC

    Blocking Techniques for Sparse Matrix Multiplication on Tensor Accelerators

    Authors: Paolo Sylos Labini, Massimo Bernaschi, Francesco Silvestri, Flavio Vella

    Abstract: Tensor accelerators have gained popularity because they provide a cheap and efficient solution for speeding up computational-expensive tasks in Deep Learning and, more recently, in other Scientific Computing applications. However, since their features are specifically designed for tensor algebra (typically dense matrix-product), it is commonly assumed that they are not suitable for applications wi…

    Submitted 11 February, 2022; originally announced February 2022.

    Comments: 12 pages, 14 images

  32. arXiv:1909.01786

    cs.AI cs.DC

    GPU-based parallelism for ASP-solving

    Authors: Agostino Dovier, Andrea Formisano, Flavio Vella

    Abstract: Answer Set Programming (ASP) has become the paradigm of choice in the field of logic programming and non-monotonic reasoning. Thanks to the availability of efficient solvers, ASP has been successfully employed in a large number of application domains. The term GPU-computing indicates a recent programming paradigm aimed at enabling the use of modern parallel Graphical Processing Units (GPUs) for g…

    Submitted 4 September, 2019; originally announced September 2019.

    Comments: Part of DECLARE 19 proceedings

  33. arXiv:1908.06649

    cs.DS cs.AR cs.DC cs.LG

    A Computational Model for Tensor Core Units

    Authors: Rezaul Chowdhury, Francesco Silvestri, Flavio Vella

    Abstract: To respond to the need of efficient training and inference of deep neural networks, a plethora of domain-specific hardware architectures have been introduced, such as Google Tensor Processing Units and NVIDIA Tensor Cores. A common feature of these architectures is a hardware circuit for efficiently computing a dense matrix multiplication of a given small size. In order to broaden the class of alg…

    Submitted 9 July, 2020; v1 submitted 19 August, 2019; originally announced August 2019.

  34. arXiv:1806.07060

    cs.PF cs.DC cs.MS cs.SE

    A model-driven approach for a new generation of adaptive libraries

    Authors: Marco Cianfriglia, Flavio Vella, Cedric Nugteren, Anton Lokhmotov, Grigori Fursin

    Abstract: Efficient high-performance libraries often expose multiple tunable parameters to provide highly optimized routines. These can range from simple loop unroll factors or vector sizes all the way to algorithmic changes, given that some implementations can be more suitable for certain devices by exploiting hardware characteristics such as local memories and vector units. Traditionally, such parameters…

    Submitted 19 June, 2018; originally announced June 2018.

    Comments: New detailed analysis will be provided

    Report number: Volume 18 Issue 1 Pages 1-24

    Journal ref: ACM Transactions on Architecture and Code Optimization 2021

  35. arXiv:1710.03647

    cs.DC

    Accelerating Energy Games Solvers on Modern Architectures

    Authors: Andrea Formisano, Raffaella Gentilini, Flavio Vella

    Abstract: Quantitative games, where quantitative objectives are defined on weighted game arenas, provide natural tools for designing faithful models of embedded controllers. Instances of these games that recently gained interest are the so-called Energy Games. The fastest known algorithm solves Energy Games in O(EVW) where W is the maximum weight. Starting from a sequential baseline implementation, we investig…

    Submitted 10 October, 2017; originally announced October 2017.

  36. arXiv:1707.01489

    cs.AI cs.RO

    Creative Robot Dance with Variational Encoder

    Authors: Agnese Augello, Emanuele Cipolla, Ignazio Infantino, Adriano Manfre, Giovanni Pilato, Filippo Vella

    Abstract: What we appreciate in dance is the ability of people to spontaneously improvise new movements and choreographies, surrendering to the music rhythm, being inspired by the current perceptions and sensations and by previous experiences, deeply stored in their memory. Like other human abilities, this, of course, is challenging to reproduce in an artificial entity such as a robot. Recent generati…

    Submitted 5 July, 2017; originally announced July 2017.

    Comments: This paper is an extended version of a paper published on the eighth International Conference on Computational Creativity (ICCC), held in Atlanta, GA, June 20th-June 22nd, 2017

  37. arXiv:1609.07008

    cs.DC cs.DM cs.MS

    Scaling betweenness centrality using communication-efficient sparse matrix multiplication

    Authors: Edgar Solomonik, Maciej Besta, Flavio Vella, Torsten Hoefler

    Abstract: Betweenness centrality (BC) is a crucial graph problem that measures the significance of a vertex by the number of shortest paths leading through it. We propose Maximal Frontier Betweenness Centrality (MFBC): a succinct BC algorithm based on novel sparse matrix multiplication routines that performs a factor of $p^{1/3}$ less communication on $p$ processors than the best known alternatives, for gra…

    Submitted 9 August, 2017; v1 submitted 22 September, 2016; originally announced September 2016.

    ACM Class: G.1.0; G.2.2

  38. arXiv:1602.00963

    cs.DC cs.DS cs.SI

    Algorithms and Heuristics for Scalable Betweenness Centrality Computation on Multi-GPU Systems

    Authors: Flavio Vella, Giancarlo Carbone, Massimo Bernaschi

    Abstract: Betweenness Centrality (BC) is steadily growing in popularity as a metric of the influence of a vertex in a graph. The BC score of a vertex is proportional to the number of all-pairs-shortest-paths passing through it. However, complete and exact BC computation for a large-scale graph is an extraordinary challenge that requires high performance computing techniques to provide results in a reasonab…

    Submitted 2 February, 2016; originally announced February 2016.

    Journal ref: Journal of Experimental Algorithmics (JEA) 2018

  39. arXiv:1601.00669

    cs.AI

    Artwork creation by a cognitive architecture integrating computational creativity and dual process approaches

    Authors: Agnese Augello, Ignazio Infantino, Antonio Lieto, Giovanni Pilato, Riccardo Rizzo, Filippo Vella

    Abstract: The paper proposes a novel cognitive architecture (CA) for computational creativity based on the Psi model and on the mechanisms inspired by dual process theories of reasoning and rationality. In recent years, many cognitive models have focused on dual process theories to better describe and implement complex cognitive skills in artificial agents, but creativity has been approached only at a descr…

    Submitted 4 January, 2016; originally announced January 2016.

    Comments: 30 pages, 8 figures, to appear in Biologically Inspired Cognitive Architectures 2016