Showing 1–50 of 473 results for author: Ali, A

Searching in archive cs.
  1. arXiv:2512.15923

    cs.LG

    A Unification of Discrete, Gaussian, and Simplicial Diffusion

    Authors: Nuria Alina Chandra, Yucen Lily Li, Alan N. Amin, Alex Ali, Joshua Rollins, Sebastian W. Ober, Aniruddh Raghu, Andrew Gordon Wilson

    Abstract: To model discrete sequences such as DNA, proteins, and language using diffusion, practitioners must choose between three major methods: diffusion in discrete space, Gaussian diffusion in Euclidean space, or diffusion on the simplex. Despite their shared goal, these models have disparate algorithms, theoretical structures, and tradeoffs: discrete diffusion has the most natural domain, Gaussian diff…

    Submitted 17 December, 2025; originally announced December 2025.

  2. arXiv:2512.12348

    cs.HC

    Understanding Trust Toward Human versus AI-generated Health Information through Behavioral and Physiological Sensing

    Authors: Xin Sun, Rongjun Ma, Shu Wei, Pablo Cesar, Jos A. Bosch, Abdallah El Ali

    Abstract: As AI-generated health information proliferates online and becomes increasingly indistinguishable from human-sourced information, it becomes critical to understand how people trust and label such content, especially when the information is inaccurate. We conducted two complementary studies: (1) a mixed-methods survey (N=142) employing a 2 (source: Human vs. LLM) $\times$ 2 (label: Human vs. AI)…

    Submitted 13 December, 2025; originally announced December 2025.

  3. arXiv:2512.12236

    eess.IV cs.CV

    Resolution-Independent Neural Operators for Multi-Rate Sparse-View CT

    Authors: Aujasvit Datta, Jiayun Wang, Asad Aali, Armeet Singh Jatyani, Anima Anandkumar

    Abstract: Sparse-view Computed Tomography (CT) reconstructs images from a limited number of X-ray projections to reduce radiation and scanning time, which makes reconstruction an ill-posed inverse problem. Deep learning methods achieve high-fidelity reconstructions but often overfit to a fixed acquisition setup, failing to generalize across sampling rates and image resolutions. For example, convolutional ne…

    Submitted 13 December, 2025; originally announced December 2025.

  4. arXiv:2512.12108

    cs.CV cs.LG

    A Novel Patch-Based TDA Approach for Computed Tomography

    Authors: Dashti A. Ali, Aras T. Asaad, Jacob J. Peoples, Mohammad Hamghalam, Alex Robins, Mane Piliposyan, Richard K. G. Do, Natalie Gangai, Yun S. Chun, Ahmad Bashir Barekzai, Jayasree Chakraborty, Hala Khasawneh, Camila Vilela, Natally Horvat, João Miranda, Alice C. Wei, Amber L. Simpson

    Abstract: The development of machine learning (ML) models based on computed tomography (CT) imaging modality has been a major focus of recent research in the medical imaging domain. Incorporating robust feature engineering approach can highly improve the performance of these models. Topological data analysis (TDA), a recent development based on the mathematical field of algebraic topology, mainly focuses on…

    Submitted 12 December, 2025; originally announced December 2025.

  5. arXiv:2512.10319

    cs.RO cs.CV eess.SY

    Design of a six wheel suspension and a three-axis linear actuation mechanism for a laser weeding robot

    Authors: Muhammad Usama, Muhammad Ibrahim Khan, Ahmad Hasan, Muhammad Shaaf Nadeem, Khawaja Fahad Iqbal, Jawad Aslam, Mian Ashfaq Ali, Asad Nisar Awan

    Abstract: Mobile robots are increasingly utilized in agriculture to automate labor-intensive tasks such as weeding, sowing, harvesting and soil analysis. Recently, agricultural robots have been developed to detect and remove weeds using mechanical tools or precise herbicide sprays. Mechanical weeding is inefficient over large fields, and herbicides harm the soil ecosystem. Laser weeding with mobile robots h…

    Submitted 11 December, 2025; originally announced December 2025.

    Comments: 15 Pages, 10 figures

  6. arXiv:2512.01160

    cs.LG q-bio.MN

    From Regression to Classification: Exploring the Benefits of Categorical Representations of Energy in MLIPs

    Authors: Ahmad Ali

    Abstract: Density Functional Theory (DFT) is a widely used computational method for estimating the energy and behavior of molecules. Machine Learning Interatomic Potentials (MLIPs) are models trained to approximate DFT-level energies and forces at dramatically lower computational cost. Many modern MLIPs rely on a scalar regression formulation; given information about a molecule, they predict a single energy…

    Submitted 30 November, 2025; originally announced December 2025.

    Comments: 11th Annual Conference on Vision and Intelligent Systems (CVIS 2025)

  7. arXiv:2511.23366

    cs.AI cs.MA

    Agentic AI Framework for Smart Inventory Replenishment

    Authors: Toqeer Ali Syed, Salman Jan, Gohar Ali, Ali Akarma, Ahmad Ali, Qurat-ul-Ain Mastoi

    Abstract: In contemporary retail, the variety of products available (e.g. clothing, groceries, cosmetics, frozen goods) make it difficult to predict the demand, prevent stockouts, and find high-potential products. We suggest an agentic AI model that will be used to monitor the inventory, initiate purchase attempts to the appropriate suppliers, and scan for trending or high-margin products to incorporate. Th…

    Submitted 28 November, 2025; originally announced November 2025.

    Comments: Presented at International Conference on Business and Digital Technology, Bahrain, Springer Nature, 27 November 2025

  8. arXiv:2511.22767

    cs.AI cs.MA

    Agentic AI Framework for Cloudburst Prediction and Coordinated Response

    Authors: Toqeer Ali Syed, Sohail Khan, Salman Jan, Gohar Ali, Muhammad Nauman, Ali Akarma, Ahmad Ali

    Abstract: The challenge is growing towards extreme and short-duration rainfall events like a cloudburst that are peculiar to the traditional forecasting systems, in which the predictions and the response are taken as two distinct processes. The paper outlines an agentic artificial intelligence system to study atmospheric water-cycle intelligence, which combines sensing, forecasting, downscaling, hydrologica…

    Submitted 27 November, 2025; originally announced November 2025.

    Comments: Presented at International Conference on Business and Digital Technology, Bahrain, Springer Nature, 27 November 2025

  9. arXiv:2511.22737

    cs.AI cs.HC

    Agentic AI Framework for Individuals with Disabilities and Neurodivergence: A Multi-Agent System for Healthy Eating, Daily Routines, and Inclusive Well-Being

    Authors: Salman Jan, Toqeer Ali Syed, Gohar Ali, Ali Akarma, Mohammad Riyaz Belgaum, Ahmad Ali

    Abstract: The paper presents a detailed Agentic Artificial Intelligence (AI) model that would enable people with disabilities and neurodivergence to lead healthier lives and have more regular days. The system will use a multi-layer structure; it will include an Application and Interface Layer, an Agents Layer, and a Data Source Layer to provide adaptive, transparent, and inclusive support. Fundamentally, a…

    Submitted 27 November, 2025; originally announced November 2025.

    Comments: Presented at International Conference on Business and Digital Technology, Bahrain, Springer Nature, 27 November 2025

  10. arXiv:2511.20836

    cs.CL cs.AI cs.LG

    Structured Prompting Enables More Robust Evaluation of Language Models

    Authors: Asad Aali, Muhammad Ahmed Mohsin, Vasiliki Bikia, Arnav Singhvi, Richard Gaus, Suhana Bedi, Hejie Cui, Miguel Fuentes, Alyssa Unell, Yifan Mai, Jordan Cahoon, Michael Pfeffer, Roxana Daneshjou, Sanmi Koyejo, Emily Alsentzer, Christopher Potts, Nigam H. Shah, Akshay S. Chaudhari

    Abstract: As language models (LMs) are increasingly adopted across domains, high-quality benchmarking frameworks that accurately estimate performance are essential for guiding deployment decisions. While frameworks such as Holistic Evaluation of Language Models (HELM) enable broad evaluation across tasks, they often rely on fixed prompts that fail to generalize across LMs, yielding unrepresentative performa…

    Submitted 27 November, 2025; v1 submitted 25 November, 2025; originally announced November 2025.

  11. arXiv:2511.20274

    cs.CV

    ScenarioCLIP: Pretrained Transferable Visual Language Models and Action-Genome Dataset for Natural Scene Analysis

    Authors: Advik Sinha, Saurabh Atreya, Aashutosh A V, Sk Aziz Ali, Abhijit Das

    Abstract: Until recently, the general corpus of CLIP-type fundamental models has widely explored either the retrieval of short descriptions or the classification of objects in the scene as SINGLE-object image classification task. The same holds for retrieving the image embedding (image retrieval task) given a text prompt. However, real-world scene images exhibit rich compositional structure involving multip…

    Submitted 25 November, 2025; originally announced November 2025.

  12. arXiv:2511.19248

    cs.CR cs.CV

    FedPoisonTTP: A Threat Model and Poisoning Attack for Federated Test-Time Personalization

    Authors: Md Akil Raihan Iftee, Syed Md. Ahnaf Hasan, Amin Ahsan Ali, AKM Mahbubur Rahman, Sajib Mistry, Aneesh Krishna

    Abstract: Test-time personalization in federated learning enables models at clients to adjust online to local domain shifts, enhancing robustness and personalization in deployment. Yet, existing federated learning work largely overlooks the security risks that arise when local adaptation occurs at test time. Heterogeneous domain arrivals, diverse adaptation algorithms, and limited cross-client visibility cr…

    Submitted 24 November, 2025; originally announced November 2025.

    Comments: 13 pages, 3 figures, 2 tables

  13. arXiv:2511.18468

    cs.LG cs.CV

    SloMo-Fast: Slow-Momentum and Fast-Adaptive Teachers for Source-Free Continual Test-Time Adaptation

    Authors: Md Akil Raihan Iftee, Mir Sazzat Hossain, Rakibul Hasan Rajib, Tariq Iqbal, Md Mofijul Islam, M Ashraful Amin, Amin Ahsan Ali, AKM Mahbubur Rahman

    Abstract: Continual Test-Time Adaptation (CTTA) is crucial for deploying models in real-world applications with unseen, evolving target domains. Existing CTTA methods, however, often rely on source data or prototypes, limiting their applicability in privacy-sensitive and resource-constrained settings. Additionally, these methods suffer from long-term forgetting, which degrades performance on previously enco…

    Submitted 23 November, 2025; originally announced November 2025.

    Comments: 38 pages, 38 tables, 16 figures

  14. arXiv:2511.18066

    cs.LG cs.CV

    pFedBBN: A Personalized Federated Test-Time Adaptation with Balanced Batch Normalization for Class-Imbalanced Data

    Authors: Md Akil Raihan Iftee, Syed Md. Ahnaf Hasan, Mir Sazzat Hossain, Rakibul Hasan Rajib, Amin Ahsan Ali, AKM Mahbubur Rahman, Sajib Mistry, Monowar Bhuyan

    Abstract: Test-time adaptation (TTA) in federated learning (FL) is crucial for handling unseen data distributions across clients, particularly when faced with domain shifts and skewed class distributions. Class Imbalance (CI) remains a fundamental challenge in FL, where rare but critical classes are often severely underrepresented in individual client datasets. Although prior work has addressed CI during tr…

    Submitted 22 November, 2025; originally announced November 2025.

    Comments: 25 pages, 7 tables, 21 figures

  15. arXiv:2511.13188

    cs.RO eess.SY

    Collision-Free Navigation of Mobile Robots via Quadtree-Based Model Predictive Control

    Authors: Osama Al Sheikh Ali, Sotiris Koutsoftas, Ze Zhang, Knut Akesson, Emmanuel Dean

    Abstract: This paper presents an integrated navigation framework for Autonomous Mobile Robots (AMRs) that unifies environment representation, trajectory generation, and Model Predictive Control (MPC). The proposed approach incorporates a quadtree-based method to generate structured, axis-aligned collision-free regions from occupancy maps. These regions serve as both a basis for developing safe corridors and…

    Submitted 17 November, 2025; originally announced November 2025.

    Comments: This paper has been accepted by IEEE SII 2026

  16. arXiv:2511.12220

    cs.CV cs.LG

    Suppressing VLM Hallucinations with Spectral Representation Filtering

    Authors: Ameen Ali, Tamim Zoabi, Lior Wolf

    Abstract: Vision-language models (VLMs) frequently produce hallucinations in the form of descriptions of objects, attributes, or relations that do not exist in the image due to over-reliance on language priors and imprecise cross-modal grounding. We introduce Spectral Representation Filtering (SRF), a lightweight, training-free method to suppress such hallucinations by analyzing and correcting the covarianc…

    Submitted 15 November, 2025; originally announced November 2025.

  17. arXiv:2511.11898

    cs.CV cs.AI

    Prompt Triage: Structured Optimization Enhances Vision-Language Model Performance on Medical Imaging Benchmarks

    Authors: Arnav Singhvi, Vasiliki Bikia, Asad Aali, Akshay Chaudhari, Roxana Daneshjou

    Abstract: Vision-language foundation models (VLMs) show promise for diverse imaging tasks but often underperform on medical benchmarks. Prior efforts to improve performance include model finetuning, which requires large domain-specific datasets and significant compute, or manual prompt engineering, which is hard to generalize and often inaccessible to medical institutions seeking to deploy these tools. Thes…

    Submitted 14 November, 2025; originally announced November 2025.

  18. arXiv:2511.04153

    cs.CL cs.AI cs.DB cs.MA

    BAPPA: Benchmarking Agents, Plans, and Pipelines for Automated Text-to-SQL Generation

    Authors: Fahim Ahmed, Md Mubtasim Ahasan, Jahir Sadik Monon, Muntasir Wahed, M Ashraful Amin, A K M Mahbubur Rahman, Amin Ahsan Ali

    Abstract: Text-to-SQL systems provide a natural language interface that can enable even laymen to access information stored in databases. However, existing Large Language Models (LLM) struggle with SQL generation from natural instructions due to large schema sizes and complex reasoning. Prior work often focuses on complex, somewhat impractical pipelines using flagship models, while smaller, efficient models…

    Submitted 6 November, 2025; originally announced November 2025.

  19. arXiv:2511.00062

    cs.CV cs.AI cs.LG cs.RO

    World Simulation with Video Foundation Models for Physical AI

    Authors: NVIDIA: Arslan Ali, Junjie Bai, Maciej Bala, Yogesh Balaji, Aaron Blakeman, Tiffany Cai, Jiaxin Cao, Tianshi Cao, Elizabeth Cha, Yu-Wei Chao, Prithvijit Chattopadhyay, Mike Chen, Yongxin Chen, Yu Chen, Shuai Cheng, Yin Cui, Jenna Diamond, Yifan Ding, Jiaojiao Fan, Linxi Fan, Liang Feng, Francesco Ferroni, Sanja Fidler, et al. (65 additional authors not shown)

    Abstract: We introduce [Cosmos-Predict2.5], the latest generation of the Cosmos World Foundation Models for Physical AI. Built on a flow-based architecture, [Cosmos-Predict2.5] unifies Text2World, Image2World, and Video2World generation in a single model and leverages [Cosmos-Reason1], a Physical AI vision-language model, to provide richer text grounding and finer control of world simulation. Trained on 200…

    Submitted 28 October, 2025; originally announced November 2025.

  20. Agentic AI: A Comprehensive Survey of Architectures, Applications, and Future Directions

    Authors: Mohamad Abou Ali, Fadi Dornaika

    Abstract: Agentic AI represents a transformative shift in artificial intelligence, but its rapid advancement has led to a fragmented understanding, often conflating modern neural systems with outdated symbolic models -- a practice known as conceptual retrofitting. This survey cuts through this confusion by introducing a novel dual-paradigm framework that categorizes agentic systems into two distinct lineage…

    Submitted 29 October, 2025; originally announced October 2025.

  21. arXiv:2510.22190

    astro-ph.IM astro-ph.CO cs.LG

    RGC: a radio AGN classifier based on deep learning. I. A semi-supervised model for the VLA images of bent radio AGNs

    Authors: M. S. Hossain, M. S. H. Shahal, A. Khan, K. M. B. Asad, P. Saikia, F. Akter, A. Ali, M. A. Amin, A. Momen, M. Hasan, A. K. M. M. Rahman

    Abstract: Wide-angle tail (WAT) and narrow-angle tail (NAT) radio active galactic nuclei (RAGNs) are key tracers of dense environments in galaxy groups and clusters, yet no machine-learning classifier of bent RAGNs has been trained using both unlabeled data and purely visually inspected labels. We release the RGC Python package, which includes two newly preprocessed labeled datasets of 639 WATs and NATs der…

    Submitted 25 October, 2025; originally announced October 2025.

    Comments: 12 pages, 7 pages appendix, 6 figures, submitted to A&A

  22. arXiv:2510.16075

    cs.LG cs.AI

    Optimization of the quantization of dense neural networks from an exact QUBO formulation

    Authors: Sergio Muñiz Subiñas, Manuel L. González, Jorge Ruiz Gómez, Alejandro Mata Ali, Jorge Martínez Martín, Miguel Franco Hernando, Ángel Miguel García-Vico

    Abstract: This work introduces a post-training quantization (PTQ) method for dense neural networks via a novel ADAROUND-based QUBO formulation. Using the Frobenius distance between the theoretical output and the dequantized output (before the activation function) as the objective, an explicit QUBO whose binary variables represent the rounding choice for each weight and bias is obtained. Additionally, by exp…

    Submitted 17 October, 2025; originally announced October 2025.

  23. arXiv:2510.13670

    cs.CV

    NTIRE 2025 Challenge on Low Light Image Enhancement: Methods and Results

    Authors: Xiaoning Liu, Zongwei Wu, Florin-Alexandru Vasluianu, Hailong Yan, Bin Ren, Yulun Zhang, Shuhang Gu, Le Zhang, Ce Zhu, Radu Timofte, Kangbiao Shi, Yixu Feng, Tao Hu, Yu Cao, Peng Wu, Yijin Liang, Yanning Zhang, Qingsen Yan, Han Zhou, Wei Dong, Yan Min, Mohab Kishawy, Jun Chen, Pengpeng Yu, Anjin Park, et al. (80 additional authors not shown)

    Abstract: This paper presents a comprehensive review of the NTIRE 2025 Low-Light Image Enhancement (LLIE) Challenge, highlighting the proposed solutions and final outcomes. The objective of the challenge is to identify effective networks capable of producing brighter, clearer, and visually compelling images under diverse and challenging conditions. A remarkable total of 762 participants registered for the c…

    Submitted 15 October, 2025; originally announced October 2025.

    Comments: CVPR NTIRE 2025 Workshop, please refer to https://openaccess.thecvf.com/CVPR2025_workshops/NTIRE

  24. arXiv:2510.13630   

    cs.CV

    AVAR-Net: A Lightweight Audio-Visual Anomaly Recognition Framework with a Benchmark Dataset

    Authors: Amjid Ali, Zulfiqar Ahmad Khan, Altaf Hussain, Muhammad Munsif, Adnan Hussain, Sung Wook Baik

    Abstract: Anomaly recognition plays a vital role in surveillance, transportation, healthcare, and public safety. However, most existing approaches rely solely on visual data, making them unreliable under challenging conditions such as occlusion, low illumination, and adverse weather. Moreover, the absence of large-scale synchronized audio-visual datasets has hindered progress in multimodal anomaly recogniti…

    Submitted 11 November, 2025; v1 submitted 15 October, 2025; originally announced October 2025.

    Comments: I would like to request the withdrawal of my paper . The reason for this request is that I am currently working on additional experiments and analyses, which will lead to updates in the results section. Once these updates are complete, I will resubmit the revised version. Thank you for your understanding

  25. Saudi Sign Language Translation Using T5

    Authors: Ali Alhejab, Tomas Zelezny, Lamya Alkanhal, Ivan Gruber, Yazeed Alharbi, Jakub Straka, Vaclav Javorek, Marek Hruz, Badriah Alkalifah, Ahmed Ali

    Abstract: This paper explores the application of T5 models for Saudi Sign Language (SSL) translation using a novel dataset. The SSL dataset includes three challenging testing protocols, enabling comprehensive evaluation across different scenarios. Additionally, it captures unique SSL characteristics, such as face coverings, which pose challenges for sign recognition and translation. In our experiments, we i…

    Submitted 13 October, 2025; originally announced October 2025.

    Comments: 11 pages, supplementary, SPECOM 2025

    Journal ref: Speech and Computer (SPECOM 2025), Lecture Notes in Computer Science, vol. 16188, pp. 331-343, Springer, Cham (2025)

  26. arXiv:2510.10263

    cs.HC cs.AI cs.CY cs.LG

    Unveiling Gamer Archetypes through Multi modal feature Correlations and Unsupervised Learning

    Authors: Moona Kanwal, Muhammad Sami Siddiqui, Syed Anael Ali

    Abstract: Profiling gamers provides critical insights for adaptive game design, behavioral understanding, and digital well-being. This study proposes an integrated, data-driven framework that combines psychological measures, behavioral analytics, and machine learning to reveal underlying gamer personas. A structured survey of 250 participants, including 113 active gamers, captured multidimensional behaviora…

    Submitted 11 October, 2025; originally announced October 2025.

    Comments: Submitted to Peer Review Journal

  27. arXiv:2510.10077

    cs.CL

    A-IPO: Adaptive Intent-driven Preference Optimization

    Authors: Wenqing Wang, Muhammad Asif Ali, Ali Shoker, Ruohan Yang, Junyang Chen, Ying Sha, Huan Wang

    Abstract: Human preferences are diverse and dynamic, shaped by regional, cultural, and social factors. Existing alignment methods like Direct Preference Optimization (DPO) and its variants often default to majority views, overlooking minority opinions and failing to capture latent user intentions in prompts. To address these limitations, we introduce Adaptive Inte…

    Submitted 11 October, 2025; originally announced October 2025.

  28. arXiv:2510.09731

    cs.CV

    Multi Camera Connected Vision System with Multi View Analytics: A Comprehensive Survey

    Authors: Muhammad Munsif, Waqas Ahmad, Amjid Ali, Mohib Ullah, Adnan Hussain, Sung Wook Baik

    Abstract: Connected Vision Systems (CVS) are transforming a variety of applications, including autonomous vehicles, smart cities, surveillance, and human-robot interaction. These systems harness multi-view multi-camera (MVMC) data to provide enhanced situational awareness through the integration of MVMC tracking, re-identification (Re-ID), and action understanding (AU). However, deploying CVS in real-world,…

    Submitted 10 October, 2025; originally announced October 2025.

  29. arXiv:2510.03788

    cs.CE cs.AI

    Lightweight and Data-Efficient MultivariateTime Series Forecasting using Residual-Stacked Gaussian (RS-GLinear) Architecture

    Authors: Abukar Ali

    Abstract: Following the success of Transformer architectures in language modeling, particularly their ability to capture long-range dependencies, researchers have explored how these architectures can be adapted for time-series forecasting. Transformer-based models have been proposed to handle both short- and long-term dependencies when predicting future values from historical data. However, studies such as…

    Submitted 4 October, 2025; originally announced October 2025.

  30. arXiv:2510.03769

    cs.CV eess.SP

    Efficiency vs. Efficacy: Assessing the Compression Ratio-Dice Score Relationship through a Simple Benchmarking Framework for Cerebrovascular 3D Segmentation

    Authors: Shimaa Elbana, Ahmad Kamal, Shahd Ahmed Ali, Ahmad Al-Kabbany

    Abstract: The increasing size and complexity of medical imaging datasets, particularly in 3D formats, present significant barriers to collaborative research and transferability. This study investigates whether the ZFP compression technique can mitigate these challenges without compromising the performance of automated cerebrovascular segmentation, a critical first step in intracranial aneurysm detection. We…

    Submitted 16 December, 2025; v1 submitted 4 October, 2025; originally announced October 2025.

  31. arXiv:2510.01439

    cs.LG

    Edge Artificial Intelligence: A Systematic Review of Evolution, Taxonomic Frameworks, and Future Horizons

    Authors: Mohamad Abou Ali, Fadi Dornaika

    Abstract: Edge Artificial Intelligence (Edge AI) embeds intelligence directly into devices at the network edge, enabling real-time processing with improved privacy and reduced latency by processing data close to its source. This review systematically examines the evolution, current landscape, and future directions of Edge AI through a multi-dimensional taxonomy including deployment location, processing capa…

    Submitted 1 October, 2025; originally announced October 2025.

  32. arXiv:2509.24595

    cs.CV

    Comprehensive Benchmarking of YOLOv11 Architectures for Scalable and Granular Peripheral Blood Cell Detection

    Authors: Mohamad Abou Ali, Mariam Abdulfattah, Baraah Al Hussein, Fadi Dornaika, Ali Cherry, Mohamad Hajj-Hassan, Lara Hamawy

    Abstract: Manual peripheral blood smear (PBS) analysis is labor intensive and subjective. While deep learning offers a promising alternative, a systematic evaluation of state of the art models such as YOLOv11 for fine grained PBS detection is still lacking. In this work, we make two key contributions. First, we curate a large scale annotated dataset for blood cell detection and classification, comprising 16…

    Submitted 29 September, 2025; originally announced September 2025.

  33. arXiv:2509.22075

    cs.CL cs.AI

    COSPADI: Compressing LLMs via Calibration-Guided Sparse Dictionary Learning

    Authors: Dmitriy Shopkhoev, Denis Makhov, Magauiya Zhussip, Ammar Ali, Stamatios Lefkimmiatis

    Abstract: Post-training compression of large language models (LLMs) largely relies on low-rank weight approximation, which represents each column of a weight matrix in a shared low-dimensional subspace. While this is a computationally efficient strategy, the imposed structural constraint is rigid and can lead to a noticeable model accuracy drop. In this work, we propose CoSpaDi (Compression via Sparse Dicti…

    Submitted 6 October, 2025; v1 submitted 26 September, 2025; originally announced September 2025.

  34. arXiv:2509.21531

    eess.IV cs.CV

    Patch-Based Diffusion for Data-Efficient, Radiologist-Preferred MRI Reconstruction

    Authors: Rohan Sanda, Asad Aali, Andrew Johnston, Eduardo Reis, Gordon Wetzstein, Sara Fridovich-Keil

    Abstract: Magnetic resonance imaging (MRI) requires long acquisition times, raising costs, reducing accessibility, and making scans more susceptible to motion artifacts. Diffusion probabilistic models that learn data-driven priors can potentially assist in reducing acquisition time. However, they typically require large training datasets that can be prohibitively expensive to collect. Patch-based diffusion…

    Submitted 15 December, 2025; v1 submitted 25 September, 2025; originally announced September 2025.

    Comments: Code is available at: https://github.com/voilalab/PaDIS-MRI

  35. arXiv:2509.21459

    cs.CL cs.AI cs.DB cs.LG

    A State-of-the-Art SQL Reasoning Model using RLVR

    Authors: Alnur Ali, Ashutosh Baheti, Jonathan Chang, Ta-Chung Chi, Brandon Cui, Andrew Drozdov, Jonathan Frankle, Abhay Gupta, Pallavi Koppol, Sean Kulinski, Jonathan Li, Dipendra Misra, Krista Opsahl-Ong, Jose Javier Gonzalez Ortiz, Matei Zaharia, Yue Zhang

    Abstract: Developing custom reasoning models via Reinforcement Learning (RL) that can incorporate organization-specific knowledge has great potential to address problems faced by enterprise customers. In many of these problems, the reward function is verifiable, a setting termed RL with Verifiable Rewards (RLVR). We apply RLVR to a popular data science benchmark called BIRD that measures the ability of an A…

    Submitted 25 September, 2025; originally announced September 2025.

  36. arXiv:2509.15182

    cs.DC

    Conditional Prior-based Non-stationary Channel Estimation Using Accelerated Diffusion Models

    Authors: Muhammad Ahmed Mohsin, Ahsan Bilal, Muhammad Umer, Asad Aali, Muhammad Ali Jamshed, Dean F. Hougen, John M. Cioffi

    Abstract: Wireless channels in motion-rich urban microcell (UMi) settings are non-stationary; mobility and scatterer dynamics shift the distribution over time, degrading classical and deep estimators. This work proposes conditional prior diffusion for channel estimation, which learns a history-conditioned score to denoise noisy channel snapshots. A temporal encoder with cross-time attention compresses a sho…

    Submitted 18 September, 2025; originally announced September 2025.

    Comments: ICASSP 2026

  37. arXiv:2509.14536

    cs.LG

    Predicting Case Suffixes With Activity Start and End Times: A Sweep-Line Based Approach

    Authors: Muhammad Awais Ali, Marlon Dumas, Fredrik Milani

    Abstract: Predictive process monitoring techniques support the operational decision making by predicting future states of ongoing cases of a business process. A subset of these techniques predict the remaining sequence of activities of an ongoing case (case suffix prediction). Existing approaches for case suffix prediction generate sequences of activities with a single timestamp (e.g. the end timestamp). Th…

    Submitted 17 September, 2025; originally announced September 2025.

  38. arXiv:2509.14161

    cs.CL cs.SD eess.AS

    CS-FLEURS: A Massively Multilingual and Code-Switched Speech Dataset

    Authors: Brian Yan, Injy Hamed, Shuichiro Shimizu, Vasista Lodagala, William Chen, Olga Iakovenko, Bashar Talafha, Amir Hussein, Alexander Polok, Kalvin Chang, Dominik Klement, Sara Althubaiti, Puyuan Peng, Matthew Wiesner, Thamar Solorio, Ahmed Ali, Sanjeev Khudanpur, Shinji Watanabe, Chih-Chen Chen, Zhen Wu, Karim Benharrak, Anuj Diwan, Samuele Cornell, Eunjung Yeo, Kwanghee Choi, et al. (2 additional authors not shown)

    Abstract: We present CS-FLEURS, a new dataset for developing and evaluating code-switched speech recognition and translation systems beyond high-resourced languages. CS-FLEURS consists of 4 test sets which cover in total 113 unique code-switched language pairs across 52 languages: 1) a 14 X-English language pair set with real voices reading synthetically generated code-switched sentences, 2) a 16 X-English…

    Submitted 17 September, 2025; originally announced September 2025.

  39. arXiv:2509.12459

    cs.CL

    Does Language Model Understand Language?

    Authors: Suvojit Acharjee, Utathya Aich, Asfak Ali

    Abstract: Despite advances in natural language generation and understanding, LM still struggle with fine grained linguistic phenomena such as tense, negation, voice, and modality which are the elements central to effective human communication. In the context of the United Nations SDG 4, where linguistic clarity is critical, the deployment of LMs in educational technologies demands careful scrutiny. As LMs a…

    Submitted 15 September, 2025; originally announced September 2025.

  40. arXiv:2509.11425  [pdf, ps, other]

    cs.SD cs.AI cs.CL eess.AS

    FuseCodec: Semantic-Contextual Fusion and Supervision for Neural Codecs

    Authors: Md Mubtasim Ahasan, Rafat Hasan Khan, Tasnim Mohiuddin, Aman Chadha, Tariq Iqbal, M Ashraful Amin, Amin Ahsan Ali, Md Mofijul Islam, A K M Mahbubur Rahman

    Abstract: Speech tokenization enables discrete representation and facilitates speech language modeling. However, existing neural codecs capture low-level acoustic features, overlooking the semantic and contextual cues inherent to human speech. While recent efforts introduced semantic representations from self-supervised speech models or incorporated contextual representations from pre-trained language model…

    Submitted 29 September, 2025; v1 submitted 14 September, 2025; originally announced September 2025.

  41. arXiv:2509.10652  [pdf, ps, other]

    cs.HC cs.AI cs.CY cs.ET

    Vibe Coding for UX Design: Understanding UX Professionals' Perceptions of AI-Assisted Design and Development

    Authors: Jie Li, Youyang Hou, Laura Lin, Ruihao Zhu, Hancheng Cao, Abdallah El Ali

    Abstract: Generative AI is reshaping UX design practices through "vibe coding," where UX professionals express intent in natural language and AI translates it into functional prototypes and code. Despite rapid adoption, little research has examined how vibe coding reconfigures UX workflows and collaboration. Drawing on interviews with 20 UX professionals across enterprises, startups, and academia, we show h…

    Submitted 12 September, 2025; originally announced September 2025.

  42. arXiv:2509.08803  [pdf, ps, other]

    cs.SI cs.AI cs.CL cs.CY

    Scaling Truth: The Confidence Paradox in AI Fact-Checking

    Authors: Ihsan A. Qazi, Zohaib Khan, Abdullah Ghani, Agha A. Raza, Zafar A. Qazi, Wassay Sajjad, Ayesha Ali, Asher Javaid, Muhammad Abdullah Sohail, Abdul H. Azeemi

    Abstract: The rise of misinformation underscores the need for scalable and reliable fact-checking solutions. Large language models (LLMs) hold promise in automating fact verification, yet their effectiveness across global contexts remains uncertain. We systematically evaluate nine established LLMs across multiple categories (open/closed-source, multiple sizes, diverse architectures, reasoning-based) using 5…

    Submitted 10 September, 2025; originally announced September 2025.

    Comments: 65 pages, 26 figures, 6 tables

  43. arXiv:2509.05878  [pdf, ps, other]

    cs.CL

    MedFactEval and MedAgentBrief: A Framework and Workflow for Generating and Evaluating Factual Clinical Summaries

    Authors: François Grolleau, Emily Alsentzer, Timothy Keyes, Philip Chung, Akshay Swaminathan, Asad Aali, Jason Hom, Tridu Huynh, Thomas Lew, April S. Liang, Weihan Chu, Natasha Z. Steele, Christina F. Lin, Jingkun Yang, Kameron C. Black, Stephen P. Ma, Fateme N. Haredasht, Nigam H. Shah, Kevin Schulman, Jonathan H. Chen

    Abstract: Evaluating factual accuracy in Large Language Model (LLM)-generated clinical text is a critical barrier to adoption, as expert review is unscalable for the continuous quality assurance these systems require. We address this challenge with two complementary contributions. First, we introduce MedFactEval, a framework for scalable, fact-grounded evaluation where clinicians define high-salience key fa…

    Submitted 6 September, 2025; originally announced September 2025.

  44. arXiv:2509.00367  [pdf, ps, other]

    cs.CV

    A Multimodal and Multi-centric Head and Neck Cancer Dataset for Segmentation, Diagnosis and Outcome Prediction

    Authors: Numan Saeed, Salma Hassan, Shahad Hardan, Ahmed Aly, Darya Taratynova, Umair Nawaz, Ufaq Khan, Muhammad Ridzuan, Vincent Andrearczyk, Adrien Depeursinge, Yutong Xie, Thomas Eugene, Raphaël Metz, Mélanie Dore, Gregory Delpon, Vijay Ram Kumar Papineni, Kareem Wahid, Cem Dede, Alaa Mohamed Shawky Ali, Carlos Sjogreen, Mohamed Naser, Clifton D. Fuller, Valentin Oreiller, Mario Jreige, John O. Prior, et al. (18 additional authors not shown)

    Abstract: We present a publicly available multimodal dataset for head and neck cancer research, comprising 1123 annotated Positron Emission Tomography/Computed Tomography (PET/CT) studies from patients with histologically confirmed disease, acquired from 10 international medical centers. All studies contain co-registered PET/CT scans with varying acquisition protocols, reflecting real-world clinical diversi…

    Submitted 20 September, 2025; v1 submitted 30 August, 2025; originally announced September 2025.

    Comments: 10 pages, 5 figures. Numan Saeed is the corresponding author. Numan Saeed, Salma Hassan and Shahad Hardan contributed equally to this work. Project page: https://hecktor25.grand-challenge.org/

  45. arXiv:2509.00108  [pdf]

    cs.CV

    Dual-Stage Global and Local Feature Framework for Image Dehazing

    Authors: Anas M. Ali, Anis Koubaa, Bilel Benjdira

    Abstract: Addressing the challenge of removing atmospheric fog or haze from digital images, known as image dehazing, has recently gained significant traction in the computer vision community. Although contemporary dehazing models have demonstrated promising performance, few have thoroughly investigated high-resolution imagery. In such scenarios, practitioners often resort to downsampling the input image or…

    Submitted 28 August, 2025; originally announced September 2025.

  46. arXiv:2508.18735  [pdf, ps, other]

    eess.SP cs.AI

    SkyTrust: Blockchain-Enhanced UAV Security for NTNs with Dynamic Trust and Energy-Aware Consensus

    Authors: Afan Ali, Irfanullah Khan

    Abstract: Non-Terrestrial Networks (NTNs) based on Unmanned Aerial Vehicles (UAVs) as base stations are extremely susceptible to security attacks due to their distributed and dynamic nature, which makes them vulnerable to rogue nodes. In this paper, a new Dynamic Trust Score Adjustment Mechanism with Energy-Aware Consensus (DTSAM-EAC) is proposed to enhance security in UAV-based NTNs. The proposed framework…

    Submitted 26 August, 2025; originally announced August 2025.

    Comments: 6 pages, 7 figures

  47. arXiv:2508.13710  [pdf]

    eess.IV cs.CR cs.LG cs.MM

    Optimizing Region of Interest Selection for Effective Embedding in Video Steganography Based on Genetic Algorithms

    Authors: Nizheen A. Ali, Ramadhan J. Mstafa

    Abstract: With the widespread use of the internet, there is an increasing need to ensure the security and privacy of transmitted data. This has led to an intensified focus on the study of video steganography, which is a technique that hides data within a video cover to avoid detection. The effectiveness of any steganography method depends on its ability to embed data without altering the original video qual…

    Submitted 19 August, 2025; originally announced August 2025.

    Comments: 19 Pages, 7 Figures, 4 Tables

    Journal ref: Computer Systems Science and Engineering 2023, 47(2), 1451-1469

  48. arXiv:2508.11095  [pdf, ps, other]

    cs.CR

    HEIR: A Universal Compiler for Homomorphic Encryption

    Authors: Asra Ali, Jaeho Choi, Bryant Gipson, Shruthi Gorantala, Jeremy Kun, Wouter Legiest, Lawrence Lim, Alexander Viand, Meron Zerihun Demissie, Hongren Zheng

    Abstract: This work presents Homomorphic Encryption Intermediate Representation (HEIR), a unified approach to building homomorphic encryption (HE) compilers. HEIR aims to support all mainstream techniques in homomorphic encryption, integrate with all major software libraries and hardware accelerators, and advance the field by providing a platform for research and benchmarking. Built on the MLIR compiler fra…

    Submitted 14 August, 2025; originally announced August 2025.

  49. arXiv:2508.04581  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Share Your Attention: Transformer Weight Sharing via Matrix-based Dictionary Learning

    Authors: Magauiya Zhussip, Dmitriy Shopkhoev, Ammar Ali, Stamatios Lefkimmiatis

    Abstract: Large language models (LLMs) have revolutionized AI applications, yet their high computational and memory demands hinder their widespread deployment. Existing compression techniques focus on intra-block optimizations (e.g. low-rank approximation, attention head pruning), while the repetitive layered structure of transformers implies significant inter-block redundancy - a dimension largely unexplor…

    Submitted 6 August, 2025; originally announced August 2025.

  50. arXiv:2508.01958  [pdf, other]

    cs.ET quant-ph

    Introduction to QUDO, Tensor QUDO and HOBO formulations: Qudits, Equivalences, Knapsack Problem, Traveling Salesman Problem and Combinatorial Games

    Authors: Alejandro Mata Ali

    Abstract: In this paper, we present a brief review and introduction to Quadratic Unconstrained D-ary Optimization (QUDO), Tensor Quadratic Unconstrained D-ary Optimization (T-QUDO) and Higher-Order Unconstrained Binary Optimization (HOBO) formulations for combinatorial optimization problems. We also show their equivalences. To help their understanding, we make some examples for the knapsack problem, traveli…

    Submitted 31 March, 2025; originally announced August 2025.

    Comments: 18 pages, 5 figures

    MSC Class: 90C27; 90C20; 81Q99 ACM Class: G.1.6; G.2.1