arXiv:2511.12869 [pdf, ps, other]

On the Fundamental Limits of LLMs at Scale

Authors: Muhammad Ahmed Mohsin, Muhammad Umer, Ahsan Bilal, Zeeshan Memon, Muhammad Ibtsaam Qadir, Sagnik Bhattacharya, Hassan Rizwan, Abhiram R. Gorle, Maahe Zehra Kazmi, Ayesha Mohsin, Muhammad Usman Rafique, Zihao He, Pulkit Mehta, Muhammad Ali Jamshed, John M. Cioffi

Abstract: Large Language Models (LLMs) have benefited enormously from scaling, yet these gains are bounded by five fundamental limitations: (1) hallucination, (2) context compression, (3) reasoning degradation, (4) retrieval fragility, and (5) multimodal misalignment. While existing surveys describe these phenomena empirically, they lack a rigorous theoretical synthesis connecting them to the foundational limits of computation, information, and learning. This work closes that gap by presenting a unified, proof-informed framework that formalizes the innate theoretical ceilings of LLM scaling. First, computability and uncomputability imply an irreducible residue of error: for any computably enumerable model family, diagonalization guarantees inputs on which some model must fail, and undecidable queries (e.g., halting-style tasks) induce infinite failure sets for all computable predictors. Second, information-theoretic and statistical constraints bound attainable accuracy even on decidable tasks, finite description length enforces compression error, and long-tail factual knowledge requires prohibitive sample complexity. Third, geometric and computational effects compress long contexts far below their nominal size due to positional under-training, encoding attenuation, and softmax crowding. We further show how likelihood-based training favors pattern completion over inference, how retrieval under token limits suffers from semantic drift and coupling noise, and how multimodal scaling inherits shallow cross-modal alignment. Across sections, we pair theorems and empirical evidence to outline where scaling helps, where it saturates, and where it cannot progress, providing both theoretical foundations and practical mitigation paths like bounded-oracle retrieval, positional curricula, and sparse or hierarchical attention.

Submitted 16 November, 2025; originally announced November 2025.

Comments: Submitted to TMLR 2025

arXiv:2211.15790 [pdf, other]

Handling Image and Label Resolution Mismatch in Remote Sensing

Authors: Scott Workman, Armin Hadzic, M. Usman Rafique

Abstract: Though semantic segmentation has been heavily explored in vision literature, unique challenges remain in the remote sensing domain. One such challenge is how to handle resolution mismatch between overhead imagery and ground-truth label sources, due to differences in ground sample distance. To illustrate this problem, we introduce a new dataset and use it to showcase weaknesses inherent in existing strategies that naively upsample the target label to match the image resolution. Instead, we present a method that is supervised using low-resolution labels (without upsampling), but takes advantage of an exemplar set of high-resolution labels to guide the learning process. Our method incorporates region aggregation, adversarial learning, and self-supervised pretraining to generate fine-grained predictions, without requiring high-resolution annotations. Extensive experiments demonstrate the real-world applicability of our approach.

Submitted 28 November, 2022; originally announced November 2022.

Comments: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023

arXiv:2204.01807 [pdf, other]

Revisiting Near/Remote Sensing with Geospatial Attention

Authors: Scott Workman, M. Usman Rafique, Hunter Blanton, Nathan Jacobs

Abstract: This work addresses the task of overhead image segmentation when auxiliary ground-level images are available. Recent work has shown that performing joint inference over these two modalities, often called near/remote sensing, can yield significant accuracy improvements. Extending this line of work, we introduce the concept of geospatial attention, a geometry-aware attention mechanism that explicitly considers the geospatial relationship between the pixels in a ground-level image and a geographic location. We propose an approach for computing geospatial attention that incorporates geometric features and the appearance of the overhead and ground-level imagery. We introduce a novel architecture for near/remote sensing that is based on geospatial attention and demonstrate its use for five segmentation tasks. The results demonstrate that our method significantly outperforms the previous state-of-the-art methods.

Submitted 4 April, 2022; originally announced April 2022.

Comments: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022

arXiv:2012.00119 [pdf, other]

doi 10.1007/978-3-030-66415-2_23

Dynamic Image for 3D MRI Image Alzheimer's Disease Classification

Authors: Xin Xing, Gongbo Liang, Hunter Blanton, Muhammad Usman Rafique, Chris Wang, Ai-Ling Lin, Nathan Jacobs

Abstract: We propose to apply a 2D CNN architecture to 3D MRI image Alzheimer's disease classification. Training a 3D convolutional neural network (CNN) is time-consuming and computationally expensive. We make use of approximate rank pooling to transform the 3D MRI image volume into a 2D image to use as input to a 2D CNN. We show our proposed CNN model achieves 9.5% better Alzheimer's disease classification accuracy than the baseline 3D models. We also show that our method allows for efficient training, requiring only 20% of the training time compared to 3D CNN models. The code is available online: https://github.com/UkyVision/alzheimer-project.

Submitted 30 November, 2020; originally announced December 2020.

Comments: Accepted to ECCV2020 Workshop on BioImage Computing

arXiv:2008.10000 [pdf]

Mobile Robot Path Planning in Static Environments using Particle Swarm Optimization

Authors: M. Shahab Alam, M. Usman Rafique, M. Umer Khan

Abstract: Motion planning is a key element of robotics since it empowers a robot to navigate autonomously. Particle Swarm Optimization is a simple yet very powerful optimization technique which has been effectively used in many complex multi-dimensional optimization problems. This paper proposes a path planning algorithm based on particle swarm optimization for computing a shortest collision-free path for a mobile robot in environments populated with static convex obstacles. The proposed algorithm finds the optimal path by performing random sampling on grid lines generated between the robot start and goal positions. Functionality of the proposed algorithm is illustrated via simulation results for different scenarios.

Submitted 23 August, 2020; originally announced August 2020.

arXiv:2007.15144 [pdf, other]

Single Image Cloud Detection via Multi-Image Fusion

Authors: Scott Workman, M. Usman Rafique, Hunter Blanton, Connor Greenwell, Nathan Jacobs

Abstract: Artifacts in imagery captured by remote sensing, such as clouds, snow, and shadows, present challenges for various tasks, including semantic segmentation and object detection. A primary challenge in developing algorithms for identifying such artifacts is the cost of collecting annotated training data. In this work, we explore how recent advances in multi-image fusion can be leveraged to bootstrap single image cloud detection. We demonstrate that a network optimized to estimate image quality also implicitly learns to detect clouds. To support the training and evaluation of our approach, we collect a large dataset of Sentinel-2 images along with a per-pixel semantic labelling for land cover. Through various experiments, we demonstrate that our method reduces the need for annotated training data and improves cloud detection performance.

Submitted 29 July, 2020; originally announced July 2020.

Comments: IEEE International Geoscience and Remote Sensing Symposium (IGARSS) 2020

arXiv:1810.09528 [pdf, other]

doi 10.1145/3274895.3274934

A Weakly Supervised Approach for Estimating Spatial Density Functions from High-Resolution Satellite Imagery

Authors: Nathan Jacobs, Adam Kraft, Muhammad Usman Rafique, Ranti Dev Sharma

Abstract: We propose a neural network component, the regional aggregation layer, that makes it possible to train a pixel-level density estimator using only coarse-grained density aggregates, which reflect the number of objects in an image region. Our approach is simple to use and does not require domain-specific assumptions about the nature of the density function. We evaluate our approach on several synthetic datasets. In addition, we use this approach to learn to estimate high-resolution population and housing density from satellite imagery. In all cases, we find that our approach results in better density estimates than a commonly used baseline. We also show how our housing density estimator can be used to classify buildings as residential or non-residential.

Submitted 22 October, 2018; originally announced October 2018.

Comments: 10 pages, 8 figures. ACM SIGSPATIAL 2018, Seattle, USA

Journal ref: 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL 18), 2018, Seattle, WA, USA

Showing 1–7 of 7 results for author: Rafique, M U