Computer Science > Computer Vision and Pattern Recognition

arXiv:2603.25565 (cs)

[Submitted on 26 Mar 2026]

Title: GeoHeight-Bench: Towards Height-Aware Multimodal Reasoning in Remote Sensing

Authors: Xuran Hu, Zhitong Xiong, Zhongcheng Hong, Yifang Ban, Xiaoxiang Zhu, Wufan Zhao

Abstract: Current Large Multimodal Models (LMMs) in Earth Observation typically neglect the critical "vertical" dimension, limiting their reasoning capabilities in complex remote sensing geometries and disaster scenarios where physical spatial structures often outweigh planar visual textures. To bridge this gap, we introduce a comprehensive evaluation framework dedicated to height-aware remote sensing understanding. First, to overcome the severe scarcity of annotated data, we develop a scalable, VLM-driven data generation pipeline utilizing systematic prompt engineering and metadata extraction. This pipeline constructs two complementary benchmarks: GeoHeight-Bench for relative height analysis, and a more challenging GeoHeight-Bench+ for holistic, terrain-aware reasoning. Furthermore, to validate the necessity of height perception, we propose GeoHeightChat, the first height-aware remote sensing LMM baseline. Serving as a strong proof of concept, our baseline demonstrates that synergizing visual semantics with implicitly injected height geometric features effectively mitigates the "vertical blind spot", successfully unlocking a new paradigm of interactive height reasoning in existing optical models.

Comments:	18 pages, 4 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
ACM classes:	I.2.10
Cite as:	arXiv:2603.25565 [cs.CV]
	(or arXiv:2603.25565v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2603.25565

Submission history

From: Xuran Hu
[v1] Thu, 26 Mar 2026 15:38:02 UTC (1,013 KB)
