How do I install Pycytominer using pip?

Run 'pip install pycytominer' for the base package. For additional features like collate or cell locations, use extras like 'pip install pycytominer[collate]', as detailed in the Installation section.

Pycytominer vs CytoTable: what's the difference?

Pycytominer processes and normalizes feature profiles for analysis, while CytoTable converts and harmonizes raw image analysis outputs into a format ready for Pycytominer. They are complementary tools in the cytomining ecosystem.

How do I normalize data with Pycytominer in Python?

Call pycytominer.normalize() with your profiles as a DataFrame, choose a method such as 'standardize', and use the samples parameter to select the control group that serves as the reference, as shown in the Usage example with real-world data from CellProfiler.
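The idea behind normalizing against a control group can be sketched in plain Python. This is a concept illustration only, not Pycytominer's implementation and not its API: the helper standardize_to_reference, the toy column names, and the use of the sample standard deviation are all assumptions made for the example.

```python
from statistics import mean, stdev

# Toy well-level profiles. Column names mimic the CellProfiler-style
# convention (Metadata_ prefix for annotations) and are illustrative only.
profiles = [
    {"Metadata_treatment": "control", "Cells_AreaShape_Area": 100.0},
    {"Metadata_treatment": "control", "Cells_AreaShape_Area": 110.0},
    {"Metadata_treatment": "control", "Cells_AreaShape_Area": 90.0},
    {"Metadata_treatment": "drug_a", "Cells_AreaShape_Area": 130.0},
]


def standardize_to_reference(rows, feature, is_reference):
    """Z-score `feature` in every row using statistics from reference rows only.

    Hypothetical helper for illustration; not part of Pycytominer.
    """
    reference = [row[feature] for row in rows if is_reference(row)]
    center = mean(reference)
    scale = stdev(reference)  # sample standard deviation (a choice of this sketch)
    return [{**row, feature: (row[feature] - center) / scale} for row in rows]


normalized = standardize_to_reference(
    profiles,
    "Cells_AreaShape_Area",
    is_reference=lambda row: row["Metadata_treatment"] == "control",
)
```

Every row is rescaled, but only the control rows define the center and scale, so treated wells end up expressed as deviations from the controls.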

Does Pycytominer work with deep learning-based feature extraction?

Yes. It supports DeepProfiler outputs through pycytominer.cyto_utils, though data preparation takes more steps than with CellProfiler and the integration is less seamless out of the box.

What should I do if Pycytominer runs out of memory?

For large SQLite files, consider using CytoTable to convert to parquet format or process in chunks, as the README notes resource limitations and recommends this for better performance.
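Processing in chunks can look roughly like the following when reading straight from SQLite with Python's standard library. This is a generic sketch, not a Pycytominer or CytoTable feature: the cells table, its columns, and the iter_chunks helper are hypothetical, and an in-memory database stands in for a large single-cell file.

```python
import sqlite3


def iter_chunks(connection, query, chunk_size):
    """Yield lists of at most `chunk_size` rows instead of loading all rows."""
    cursor = connection.execute(query)
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield rows


# In-memory stand-in for a large single-cell SQLite file (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cells (well TEXT, area REAL)")
conn.executemany(
    "INSERT INTO cells VALUES (?, ?)",
    [("A01", float(i)) for i in range(10)],
)

# Running totals keep memory use bounded by the chunk size.
total, count = 0.0, 0
for chunk in iter_chunks(conn, "SELECT well, area FROM cells", chunk_size=4):
    total += sum(area for _, area in chunk)
    count += len(chunk)

mean_area = total / count
conn.close()
```

Only statistics that can be accumulated incrementally (sums, counts, minima, maxima) fit this pattern directly; operations that need all rows at once, such as an exact median, need a different strategy.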

Can I use Pycytominer with In Carta software?

Yes, but you need to pre-harmonize data using CytoTable, as recommended in the Handling inputs section, and manually specify features in functions like normalize to avoid issues.

Pycytominer

BSD-3-Clause · Python · v1.6.1

A Python package for processing and normalizing high-dimensional morphological feature data from high-throughput cell imaging experiments.

144 stars · 41 forks · 0 contributors

What is Pycytominer?

Pycytominer is a Python package that processes and normalizes high-dimensional morphological feature data extracted from high-throughput cell imaging experiments. It transforms raw single-cell readouts into reproducible, analysis-ready profiles suitable for downstream machine learning and statistical analysis. The tool is specifically designed to handle data from platforms like CellProfiler and DeepProfiler.

Target Audience

Bioinformaticians, computational biologists, and researchers working with high-throughput cell imaging data who need to standardize and prepare morphological feature profiles for analysis.

Value Proposition

Developers choose Pycytominer for its consistent, simple API tailored to image-based profiling workflows, its integration with popular tools like CellProfiler and CytoTable, and its focus on reproducibility in processing high-dimensional cell morphology data.

Overview

Python package for image-based profiling

Use Cases

Best For

Processing morphological features from CellProfiler or DeepProfiler outputs
Normalizing and standardizing high-dimensional cell imaging data
Aggregating single-cell profiles to well or population-level summaries
Selecting informative features for downstream machine learning analysis
Creating consensus signatures from replicate experiments
Preparing data for visualization tools like Morpheus heatmaps
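The aggregation use case listed above boils down to collapsing many single-cell rows into one summary per group. A minimal plain-Python sketch follows, assuming a median summary and a Metadata_Well grouping column; the helper aggregate_by and the toy data are hypothetical and do not reproduce Pycytominer's own aggregate function.

```python
from collections import defaultdict
from statistics import median

# Toy single-cell measurements; column names are illustrative only.
single_cells = [
    {"Metadata_Well": "A01", "Cells_AreaShape_Area": 100.0},
    {"Metadata_Well": "A01", "Cells_AreaShape_Area": 120.0},
    {"Metadata_Well": "A01", "Cells_AreaShape_Area": 500.0},
    {"Metadata_Well": "A02", "Cells_AreaShape_Area": 80.0},
    {"Metadata_Well": "A02", "Cells_AreaShape_Area": 90.0},
]


def aggregate_by(rows, group_key, feature, operation=median):
    """Collapse `feature` to one value per group (hypothetical helper)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[group_key]].append(row[feature])
    return {group: operation(values) for group, values in grouped.items()}


well_profiles = aggregate_by(single_cells, "Metadata_Well", "Cells_AreaShape_Area")
```

The median is used here because it is less sensitive to outlier cells (the 500.0 value in well A01) than the mean; passing a different function as operation swaps the summary statistic.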

Not Ideal For

Projects using image analysis tools other than CellProfiler or DeepProfiler without CytoTable for data harmonization
Real-time processing workflows where data needs to be analyzed as it's generated from microscopes
Environments with severe memory limitations that cannot handle large single-cell SQLite files

Pros & Cons

Pros

Consistent Processing API

All core functions like aggregate and normalize use a uniform interface with consistent arguments, making the workflow predictable and easy to script, as highlighted in the API section.

Specialized for CellProfiler

Defaults to CellProfiler's data structure with Metadata_ prefixes and compartment names, reducing setup time for standard experiments, as described in the CellProfiler support section.
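A prefix convention like this makes it possible to separate annotation columns from measurement columns automatically. The sketch below shows the general idea in plain Python; the compartment names (Cells, Cytoplasm, Nuclei), the split_columns helper, and the example columns are assumptions for illustration, not Pycytominer's actual inference code.

```python
# Assumed compartment names for this sketch.
COMPARTMENTS = ("Cells", "Cytoplasm", "Nuclei")


def split_columns(columns, metadata_prefix="Metadata_", compartments=COMPARTMENTS):
    """Return (metadata, features) column lists based on name prefixes.

    Hypothetical helper: columns matching neither convention are ignored.
    """
    feature_prefixes = tuple(f"{name}_" for name in compartments)
    metadata = [c for c in columns if c.startswith(metadata_prefix)]
    features = [c for c in columns if c.startswith(feature_prefixes)]
    return metadata, features


columns = [
    "Metadata_Plate",
    "Metadata_Well",
    "Cells_AreaShape_Area",
    "Nuclei_Intensity_MeanIntensity_DNA",
    "ObjectNumber",
]
metadata, features = split_columns(columns)
```

Data from tools that do not follow these prefixes would match nothing here, which is why feature lists have to be passed explicitly in that case.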

Flexible Data Formats

Supports modern formats like parquet, compressed CSV, and AnnData, enabling efficient storage and integration with other bioinformatics tools, mentioned in the Frameworks section.

Ecosystem Integration

Works with CytoTable for data preparation and with other cytomining projects like Profiling-recipe, which improves reproducibility and pipeline compatibility.

Cons

Pipeline Orchestration Not Included

Users must rely on external frameworks like Profiling-recipe or CytoSnake for end-to-end workflows, as Pycytominer only provides standalone functions, adding complexity for full automation.

Manual Feature Specification for Non-Standard Data

When using tools other than CellProfiler, features must be manually defined in functions like normalize, increasing error risk and setup time, as noted in the Handling inputs section.

Resource Intensive for Large Files

Processing SQLite files is limited by available memory and CPU, which the README acknowledges can be a bottleneck for large datasets.

Related Projects

SimpleITK

SimpleITK: a layer built on top of the Insight Toolkit (ITK), intended to simplify and facilitate ITK's use in rapid prototyping, education and interpreted languages.

ZeroCostDL4Mic: A Google Colab based no-cost toolbox to explore Deep-Learning in Microscopy

Stars: 648

Forks: 143

Last commit: 3 months ago

Neurite

Neural networks toolbox focused on medical image analysis

Stars: 376

Forks: 77

Last commit: 14 days ago

AICSImageIO

Image Reading, Metadata Conversion, and Image Writing for Microscopy Images in Python

Stars: 222

Forks: 51

Last commit: 7 months ago
