A Python package for processing and normalizing high-dimensional morphological feature data from high-throughput cell imaging experiments.
Pycytominer is a Python package that processes and normalizes high-dimensional morphological feature data extracted from high-throughput cell imaging experiments. It transforms raw single-cell readouts into reproducible, analysis-ready profiles suitable for downstream machine learning and statistical analysis. The tool is specifically designed to handle data from platforms like CellProfiler and DeepProfiler.
Pycytominer is aimed at bioinformaticians, computational biologists, and researchers who work with high-throughput cell imaging data and need to standardize and prepare morphological feature profiles for analysis.
Developers choose Pycytominer for its consistent, simple API tailored to image-based profiling workflows, its integration with popular tools like CellProfiler and CytoTable, and its focus on reproducibility in processing high-dimensional cell morphology data.
Python package for image-based profiling bioinformatics
Core functions such as aggregate and normalize share a uniform interface with consistent arguments, which makes workflows predictable and easy to script, as highlighted in the README's API section.
Defaults to CellProfiler's data structure with Metadata_ prefixes and compartment names, reducing setup time for standard experiments, as described in the CellProfiler support section.
Supports modern formats such as Parquet, compressed CSV, and AnnData, enabling efficient storage and integration with other bioinformatics tools, as mentioned in the README's Frameworks section.
Works with CytoTable for data preparation and with other cytomining projects such as Profiling-recipe, which improves reproducibility and pipeline compatibility.
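To make the CellProfiler naming convention concrete, here is a minimal sketch of how columns might be split into metadata and features by prefix alone. It is an illustration, not Pycytominer's implementation: the helper name split_columns, the compartment list (Cells, Cytoplasm, Nuclei), and the sample column names are all assumptions made for the example.

```python
# Illustrative sketch only; this is NOT Pycytominer's own code.
# Assumption: CellProfiler-style compartments are named Cells, Cytoplasm, Nuclei.
COMPARTMENTS = ("Cells", "Cytoplasm", "Nuclei")


def split_columns(columns, metadata_prefix="Metadata_", compartments=COMPARTMENTS):
    """Separate metadata columns from feature columns by naming convention."""
    feature_prefixes = tuple(f"{name}_" for name in compartments)
    metadata = [c for c in columns if c.startswith(metadata_prefix)]
    features = [c for c in columns if c.startswith(feature_prefixes)]
    return metadata, features


# Hypothetical column names in the CellProfiler style.
columns = [
    "Metadata_Plate",
    "Metadata_Well",
    "Cells_AreaShape_Area",
    "Nuclei_Intensity_MeanIntensity_DNA",
    "ObjectNumber",  # matches neither convention, so it is left out of both lists
]

metadata, features = split_columns(columns)
print("metadata:", metadata)
print("features:", features)
```

Because the split depends only on column names, data that follows the convention needs no per-experiment configuration, which is the setup saving described above.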
Users must rely on external frameworks like Profiling-recipe or CytoSnake for end-to-end workflows, as Pycytominer only provides standalone functions, adding complexity for full automation.
When using tools other than CellProfiler, features must be manually defined in functions like normalize, increasing error risk and setup time, as noted in the Handling inputs section.
Processing SQLite files is limited by available memory and CPU, and the README acknowledges that this can become a bottleneck for large datasets.
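The manual-feature limitation can be illustrated with a small standard-library sketch: when column names do not follow the CellProfiler convention, the caller has to list the feature columns explicitly before aggregating single cells per well and standardizing the resulting profiles. The functions aggregate_median and standardize below, and the column names Well and area_px, are hypothetical stand-ins for this example; they do not reproduce Pycytominer's aggregate or normalize signatures.

```python
# Illustrative sketch only; these helpers are hypothetical and do not
# reproduce Pycytominer's aggregate() or normalize() functions.
from collections import defaultdict
from statistics import mean, median, pstdev


def aggregate_median(rows, strata, features):
    """Collapse single-cell rows into one median profile per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[key] for key in strata)].append(row)
    profiles = []
    for key, members in groups.items():
        profile = dict(zip(strata, key))
        for feature in features:
            profile[feature] = median(m[feature] for m in members)
        profiles.append(profile)
    return profiles


def standardize(profiles, features):
    """Z-score each listed feature across profiles (population std. dev.)."""
    scaled = [dict(p) for p in profiles]
    for feature in features:
        values = [p[feature] for p in profiles]
        mu, sigma = mean(values), pstdev(values)
        for p in scaled:
            p[feature] = (p[feature] - mu) / sigma if sigma else 0.0
    return scaled


# Hypothetical single-cell rows from a tool that does not use CellProfiler names.
cells = [
    {"Well": "A01", "area_px": 8.0},
    {"Well": "A01", "area_px": 10.0},
    {"Well": "A01", "area_px": 12.0},
    {"Well": "A02", "area_px": 20.0},
    {"Well": "A02", "area_px": 20.0},
    {"Well": "A03", "area_px": 30.0},
]

features = ["area_px"]  # must be listed by hand; nothing can be inferred from the name
aggregated = aggregate_median(cells, strata=["Well"], features=features)
normalized = standardize(aggregated, features)
print(aggregated)
print(normalized)
```

A misspelled or omitted entry in the feature list would silently drop or mis-scale a column, which is the error risk noted above.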