A medical text mining and information extraction framework built on spaCy for rapid prototyping and training of predictive NLP models.
MedaCy is a medical text mining and information extraction framework built on spaCy. It enables rapid prototyping, training, and application of highly predictive NLP models for clinical and biomedical text, streamlining researcher workflows and ensuring replicability.
MedaCy is aimed at researchers, data scientists, and developers working on medical NLP projects, particularly clinical note analysis, information extraction for systematic reviews, and biomedical text mining.
Developers choose MedaCy for its out-of-the-box, highly predictive models, customizable pipelines, and emphasis on replicability, making it ideal for accelerating medical NLP research and deployment.
🏥 Medical Text Mining and Information Extraction with spaCy
Offers out-of-the-box trained models for medical named entity recognition that the project describes as dominating shared tasks; the README example extracts drug, dosage, form, and duration entities from a clinical note.
Provides detailed development instructions and documentation for building tailored NLP systems, enabling researchers to adapt to specific medical text mining needs.
Designed to ensure system replicability for reproducing results and distributing models while maintaining privacy, supporting reproducible medical NLP workflows.
Maintained by NLP@VCU with ongoing development, and issues are actively addressed through the project's GitHub repository, as noted in the README.
Installs via pip directly from the GitHub repository rather than from formally versioned PyPI releases, which can lead to instability, breaking changes, and dependency management challenges.
Primarily optimized for medical text, making it less effective for general NLP tasks without extensive customization, as indicated by its emphasis on clinical and biomedical applications.
Licensed under GNU GPL, which imposes copyleft requirements that may limit use in proprietary or commercial software projects seeking more permissive licensing.