CLEAR-WSI: Foundation Model Empowered Whole Slide Image Retrieval

Abstract

The rapid growth of digital pathology has produced vast repositories of hematoxylin and eosin-stained whole slide images, yet most of them remain unindexed or unlabelled, which limits their utility for computational analysis. Reverse image search provides a scalable way to organize and access these archives by retrieving visually similar images. Currently deployed retrieval systems rely on manual configuration, which strongly affects their performance. We therefore propose CLEAR-WSI (Constant Length Embedding & Automatic Retrieval), a fully automated pathology reverse image search engine that combines Vision Transformer foundation models for histopathology with attention-based multiple instance learning (AttentionMIL). The AttentionMIL framework jointly identifies diagnostically relevant whole slide images and predicts slide-level diagnoses. To further improve performance, we introduce a self-reviewing classifier filtering mechanism: retrieved candidates are filtered according to their predicted labels, an approach that in most cases outperforms class-informed filters. Across two public datasets, CAMELYON16 (lymph node metastases) and BRACS (breast cancer subtypes), our method establishes new state-of-the-art results, improving from 77.49% to 89.92% on CAMELYON16, from 54.12% to 75.86% on BRACS level-1, and from 36.47% to 51.72% on BRACS level-2. Our general-purpose, annotation-free, dataset-agnostic search engine, which scales across diverse data sources, is openly available: https://github.com/youssefwally/CLEAR-WSI.
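
The two ideas named in the abstract, attention pooling to a constant-length slide embedding and filtering retrieved candidates by predicted label, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The function names `attention_pool`, `cosine` and `retrieve`, the weights `v` and `w`, and the filtering rule (keep only candidates whose predicted label equals the query's predicted label) are all assumptions made for illustration.

```python
import math

def attention_pool(patch_feats, v, w):
    """Collapse any number of patch features into one fixed-length vector.

    patch_feats: list of d-dimensional lists (one per patch).
    v (d x h matrix) and w (h-vector) are hypothetical learned weights.
    """
    scores = []
    for x in patch_feats:
        hidden = [math.tanh(sum(xi * v[i][j] for i, xi in enumerate(x)))
                  for j in range(len(w))]
        scores.append(sum(hj * wj for hj, wj in zip(hidden, w)))
    # Softmax over patches (shifted by the max for numerical stability).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    # Attention-weighted sum: output length is d, independent of patch count.
    dim = len(patch_feats[0])
    emb = [sum(a * x[i] for a, x in zip(attn, patch_feats)) for i in range(dim)]
    return emb, attn

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_emb, query_pred, db_embs, db_preds, k=5):
    """Return indices of the top-k database slides by cosine similarity.

    Assumed filtering rule: only slides whose predicted label matches the
    query's predicted label are eligible.
    """
    candidates = [i for i, p in enumerate(db_preds) if p == query_pred]
    candidates.sort(key=lambda i: cosine(query_emb, db_embs[i]), reverse=True)
    return candidates[:k]
```

In this sketch no ground-truth labels are consulted at search time: both the query label and the database labels come from a classifier, which is the sense in which such a filter would be annotation-free.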


Preprint

Nov. 2025

Authors

Y. Wally • J. Liu • E. Wetzer • P. J. Schüffler

Links

URL GitHub

Research Area

 C1 | Medicine

BibTeXKey: WLW+25