A whole-slide foundation model for digital pathology, pre-trained on real-world data to analyze tissue slides at tile and slide levels.
Prov-GigaPath is a foundation model for digital pathology that processes whole-slide images (WSIs) to extract features at both tile (patch) and slide levels. It is pre-trained on a large dataset of real-world pathology slides to provide a robust backbone for various computational pathology tasks. The model helps researchers accelerate AI development in pathology by offering pre-trained encoders that can be fine-tuned for specific diagnostic or analytical applications.
Prov-GigaPath is aimed at AI researchers and computational pathologists working on digital pathology, whole-slide image analysis, and medical imaging foundation models. It is also suitable for academics and industry professionals who care about reproducibility and want to build on state-of-the-art pathology AI research.
Developers choose Prov-GigaPath because it is one of the few open-source foundation models specifically designed for whole-slide pathology images, pre-trained on extensive real-world data. Its dual encoder architecture allows flexible use for both tile-level and slide-level tasks, and it comes with ready-to-use fine-tuning examples, making it a practical starting point for pathology AI projects.
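The tile-then-slide flow described above can be sketched in a few lines. This is a toy illustration only: `toy_tile_encoder`, `toy_slide_encoder`, and `embed_slide` are hypothetical names, the "tiles" are tiny grids of numbers, and mean pooling is a swapped-in placeholder rather than the model's actual slide-level aggregation. It is meant to show the shape of the data flow, where each tile becomes one vector and all tile vectors of a slide are combined into one slide vector.

```python
from statistics import fmean
from typing import List, Sequence, Tuple

Tile = List[List[float]]   # toy grayscale tile: rows of pixel intensities
Embedding = List[float]


def toy_tile_encoder(tile: Tile) -> Embedding:
    """Hypothetical stand-in for a tile encoder: one tile in, one small vector out."""
    pixels = [p for row in tile for p in row]
    return [fmean(pixels), max(pixels), min(pixels)]


def toy_slide_encoder(tile_embeddings: Sequence[Embedding]) -> Embedding:
    """Hypothetical stand-in for a slide encoder: mean-pools the tile vectors.

    A real slide encoder would model relationships between tiles; averaging
    is used here only to keep the sketch short.
    """
    if not tile_embeddings:
        raise ValueError("a slide needs at least one tile")
    return [fmean(dim) for dim in zip(*tile_embeddings)]


def embed_slide(tiles: Sequence[Tile]) -> Tuple[List[Embedding], Embedding]:
    """Run stage 1 (per-tile features), then stage 2 (one slide-level vector)."""
    tile_embeddings = [toy_tile_encoder(t) for t in tiles]
    return tile_embeddings, toy_slide_encoder(tile_embeddings)


if __name__ == "__main__":
    slide = [
        [[0.0, 1.0], [1.0, 0.0]],
        [[1.0, 1.0], [1.0, 1.0]],
    ]
    per_tile, per_slide = embed_slide(slide)
    print("tile-level features:", per_tile)
    print("slide-level feature:", per_slide)
```

Returning both outputs mirrors the flexibility the description mentions: tile-level tasks can consume the per-tile vectors directly, while slide-level tasks use the aggregated vector.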
Prov-GigaPath: A whole-slide foundation model for digital pathology from real-world data
Pre-trained on a large-scale dataset of de-identified pathology slides, providing robust feature extraction for digital pathology tasks.
Includes separate tile and slide encoders for patch-level and whole-slide analysis, so it fits into a variety of pathology AI pipelines.
Offers scripts and pre-extracted embeddings for datasets such as PCam and PANDA, supporting reproducible fine-tuning workflows.
Provides notebooks for dimensionality reduction and embedding visualization, including a PCA visualization notebook, which aid interpretability and model analysis.
Explicitly not intended for clinical or deployed use; applications are limited to research, as stated in the project's out-of-scope use and usage notices.
Requires NVIDIA A100 GPUs and substantial storage (e.g., 32GB of embeddings for PANDA), which puts it out of reach for teams with limited hardware or storage.
Setup involves several steps, including HuggingFace token setup, conda environment configuration, and version constraints (e.g., timm>=1.0.3), which can be cumbersome for new users.
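Some of the setup friction noted above can be caught early with a small preflight check. The sketch below is a minimal example under stated assumptions: the helper names (`parse_version`, `meets_minimum`, `installed_version`, `preflight`) are hypothetical, and it assumes the HuggingFace token is supplied through the `HF_TOKEN` environment variable, which the text above does not specify. The version parser is deliberately simple and only compares the leading numeric components.

```python
import os
from importlib import metadata
from typing import List, Mapping, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.0.3' into (1, 0, 3), keeping only the leading digits of each part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first component with no leading digits
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two dotted versions numerically, padding the shorter with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def preflight(timm_version: Optional[str],
              env: Mapping[str, str] = os.environ,
              minimum: str = "1.0.3") -> List[str]:
    """Collect human-readable setup problems; an empty list means no issue found."""
    problems = []
    if timm_version is None:
        problems.append("timm is not installed")
    elif not meets_minimum(timm_version, minimum):
        problems.append(f"timm {timm_version} is older than {minimum}")
    if not env.get("HF_TOKEN"):  # assumption: token is passed via HF_TOKEN
        problems.append("HF_TOKEN is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight(installed_version("timm")):
        print("setup issue:", problem)
```

Passing the version and environment in as arguments keeps the check easy to exercise without installing anything, for example `preflight("0.9.16", {})` reports both an outdated timm and a missing token.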
Deep probabilistic analysis of single-cell and spatial omics data
scGPT is a foundation model designed for single-cell multi-omics data analysis using generative AI. It leverages transformer architecture pretrained on millions of single-cell profiles to enable a wide range of downstream biological tasks, advancing computational biology by providing a powerful, unified model for cellular data.

## Key Features

- **Pretrained Model Zoo**: Offers multiple organ-specific and whole-human models trained on millions of cells for various applications.
- **Zero-Shot Applications**: Supports tasks like cell embedding and reference mapping without task-specific training.
- **Reference Mapping**: Enables fast similarity search across millions of cells using efficient indexing with faiss.
- **Multi-Task Fine-Tuning**: Can be adapted for scRNA-seq integration, cell type annotation, perturbation prediction, and GRN inference.
- **Online Tools**: Provides accessible web applications for reference mapping, cell annotation, and GRN inference via cloud GPUs.

## Philosophy

scGPT aims to build a foundational AI model for single-cell biology, democratizing access to advanced computational methods and accelerating discoveries in multi-omics research through open-source collaboration.
Pathology Foundation Model - Nature Medicine
Vision-Language Pathology Foundation Model - Nature Medicine