A whole-slide foundation model for digital pathology, pre-trained on real-world data to analyze tissue slides at tile and slide levels.
Prov-GigaPath is a foundation model for digital pathology that processes whole-slide images (WSIs) to extract features at both tile (patch) and slide levels. It is pre-trained on a large dataset of real-world pathology slides to provide a robust backbone for various computational pathology tasks. The model helps researchers accelerate AI development in pathology by offering pre-trained encoders that can be fine-tuned for specific diagnostic or analytical applications.
Prov-GigaPath is aimed at AI researchers and computational pathologists working on digital pathology, whole-slide image analysis, and medical imaging foundation models. It also suits academics and industry professionals who care about reproducibility and want to build on state-of-the-art pathology AI research.
Developers choose Prov-GigaPath because it is one of the few open-source foundation models specifically designed for whole-slide pathology images, pre-trained on extensive real-world data. Its dual encoder architecture allows flexible use for both tile-level and slide-level tasks, and it comes with ready-to-use fine-tuning examples, making it a practical starting point for pathology AI projects.
Prov-GigaPath: A whole-slide foundation model for digital pathology from real-world data
Pre-trained on a large-scale dataset of de-identified pathology slides, providing robust feature extraction for digital pathology tasks, as highlighted in the key features.
Includes separate tile and slide encoders for both patch-level and whole-slide analysis, enabling flexible use in various pathology AI pipelines, as shown in the model overview.
Offers scripts and pre-extracted embeddings for datasets like PCam and PANDA, accelerating research with reproducible fine-tuning workflows, detailed in the fine-tuning section.
Provides notebooks for dimensionality reduction and embedding visualization, aiding interpretability and model analysis, as showcased in the news section with a PCA visualization notebook.
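The split between a tile encoder and a slide encoder described above can be pictured as a two-stage data flow: cut the slide into tiles, turn each tile into a vector, then combine the tile vectors into one slide-level vector. The sketch below illustrates only that flow. The function names, the 256-pixel tile size, and the mean pooling are assumptions made for illustration; they are not the Prov-GigaPath API, and the real encoders are learned neural networks.

```python
from statistics import fmean

TILE_SIZE = 256  # assumed tile edge in pixels; not stated in this summary


def tile_grid(width, height, tile=TILE_SIZE):
    """Top-left (x, y) coordinates of every full tile that fits in the slide."""
    return [
        (x, y)
        for y in range(0, height - tile + 1, tile)
        for x in range(0, width - tile + 1, tile)
    ]


def encode_tile(coord):
    """Hypothetical stand-in for a tile encoder: one small vector per tile."""
    x, y = coord
    return [float(x), float(y), 1.0]


def encode_slide(tile_vectors):
    """Hypothetical stand-in for a slide encoder.

    Mean pooling is used only to show that many tile vectors become one
    slide vector; it is not how the real slide encoder works.
    """
    return [fmean(column) for column in zip(*tile_vectors)]


coords = tile_grid(1024, 512)                    # stage 0: tiling
tile_vectors = [encode_tile(c) for c in coords]  # stage 1: tile level
slide_vector = encode_slide(tile_vectors)        # stage 2: slide level
print(len(coords), slide_vector)                 # 8 [384.0, 128.0, 1.0]
```

Tile-level tasks would consume `tile_vectors` directly, while slide-level tasks would consume `slide_vector`, which is why the two encoders can be used separately or together.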
Explicitly not intended for clinical or deployed use, limiting applications to research only, as stated in the out-of-scope use and usage notices sections.
Requires NVIDIA A100 GPUs and involves large files (e.g., 32GB of embeddings for PANDA), which can put it out of reach for teams with limited hardware or storage.
Setup involves several steps, including HuggingFace token setup, conda environment configuration, and version constraints (e.g., timm>=1.0.3), which can be cumbersome for new users.
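The token and version requirements mentioned above can be checked up front, before any model download is attempted. The following is a minimal sketch using only the Python standard library. It assumes the access token is exposed through the `HF_TOKEN` environment variable, and the helper names `parse_version` and `check_environment` are hypothetical, not part of the project.

```python
import os
import re
from importlib import metadata

MIN_TIMM = (1, 0, 3)  # taken from the "timm>=1.0.3" constraint noted above


def parse_version(text):
    """Turn '1.0.3' or '1.0.3.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def check_environment(env=None, timm_version=None):
    """Return a list of problems found; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("HF_TOKEN"):  # assumed variable name for the access token
        problems.append("HF_TOKEN is not set")
    if timm_version is None:
        try:
            timm_version = metadata.version("timm")
        except metadata.PackageNotFoundError:
            problems.append("timm is not installed")
            return problems
    if parse_version(timm_version) < MIN_TIMM:
        problems.append(f"timm {timm_version} is older than 1.0.3")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("setup issue:", problem)
```

The version parser is deliberately simple and treats pre-release suffixes such as `rc1` as equal to the release; a project that needs exact ordering of pre-releases would want a dedicated version-parsing library instead.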