Computer Science > Computer Vision and Pattern Recognition

arXiv:2502.07221 (cs)

[Submitted on 11 Feb 2025 (v1), last revised 21 Dec 2025 (this version, v3)]

Title: HOMIE: Histopathology Omni-modal Embedding for Pathology Composed Retrieval

Authors: Qifeng Zhou, Wenliang Zhong, Thao M. Dang, Hehuan Ma, Saiyang Na, Yuzhi Guo, Junzhou Huang

Abstract: The integration of Artificial Intelligence (AI) into pathology faces a fundamental challenge: black-box predictive models lack transparency, while generative approaches risk clinical hallucination. A case-based retrieval paradigm offers a more interpretable alternative for clinical adoption. However, current state-of-the-art (SOTA) models are constrained by dual-encoder architectures that cannot process the composed modality of real-world clinical queries. We formally define the task of Pathology Composed Retrieval (PCR). Progress on this newly defined task is blocked by two critical challenges: (1) Multimodal Large Language Models (MLLMs) offer the necessary deep-fusion architecture but suffer from both a task mismatch and a domain mismatch; (2) no benchmark exists to evaluate such compositional queries. To address these challenges, we propose HOMIE, a systematic framework that transforms a general MLLM into a specialized retrieval expert. HOMIE resolves the dual mismatch via a two-stage process: a retrieval-adaptation stage to solve the task mismatch, and a pathology-specific tuning stage, featuring a progressive knowledge curriculum and pathology-specific stain and native-resolution processing, to solve the domain mismatch. We also introduce the PCR Benchmark, designed to evaluate composed retrieval in pathology. Experiments show that HOMIE, trained only on public data, matches SOTA performance on traditional retrieval tasks and outperforms all baselines on the newly defined PCR task.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2502.07221 [cs.CV]
	(or arXiv:2502.07221v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2502.07221

Submission history

From: Qifeng Zhou
[v1] Tue, 11 Feb 2025 03:28:55 UTC (2,243 KB)
[v2] Sun, 16 Mar 2025 20:05:51 UTC (1,110 KB)
[v3] Sun, 21 Dec 2025 21:43:28 UTC (1,208 KB)
