Open-Awesome

© 2026 Open-Awesome. Curated for the developer elite.


postagga

MIT · Clojure

A Clojure/ClojureScript library for building self-contained natural language parsers using part-of-speech tagging and semantic rules.

162 stars · 16 forks · 0 contributors

What is postagga?

Postagga is a natural language processing library written in pure Clojure and ClojureScript. It allows developers to build custom parsers that convert free-form text into structured data using part-of-speech tagging and semantic rules. The library solves the problem of understanding unstructured user input in applications like chatbots or command-line interfaces without relying on external NLP services.

Target Audience

Postagga targets Clojure and ClojureScript developers building chatbots, command interpreters, or other applications that need natural language understanding. It is particularly useful when a lightweight, embeddable parser has to run on both servers and browsers.

Value Proposition

Developers choose Postagga because it offers a pure Clojure/ClojureScript solution with no external dependencies, enabling self-contained parsers that are portable and easy to deploy. Its rule-based approach provides fine-grained control over language understanding, unlike black-box NLP APIs.

Overview

A Library to parse natural language in pure Clojure and ClojureScript

Use Cases

Best For

  • Building chatbots that understand free-form user messages in Clojure-based systems
  • Creating command-line tools with natural language interfaces
  • Processing user input in web applications using ClojureScript
  • Extracting structured data from unstructured text in multilingual contexts
  • Developing educational tools for natural language processing experiments
  • Implementing lightweight parsers for domain-specific languages

Not Ideal For

  • Projects requiring cutting-edge deep learning NLP models like transformers for tasks such as sentiment analysis or language generation
  • Teams working in non-Clojure ecosystems (e.g., Python, JavaScript) who need out-of-the-box NLP libraries
  • Applications that demand real-time processing of high-volume text streams without custom optimization

Pros & Cons

Pros

Cross-Platform Self-Containment

Postagga produces parsers with no external dependencies that run on both JVM Clojure and browser ClojureScript, in line with the README's emphasis on portability.

Customizable Rule-Based Parsing

Developers define semantic rules as state machines that extract structured data from tagged sentences, which gives fine-grained control over language understanding. The README's parser section walks through detailed rule examples.
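To make the idea of a rule as a state machine over tagged tokens concrete, here is a minimal, language-neutral sketch in Python. It is not postagga's rule syntax or API: the `match_rule` function, the step dictionaries, and the Penn-Treebank-style tag names are all assumptions invented for illustration.

```python
def match_rule(tagged, steps):
    """Walk the steps in order over (word, tag) pairs.

    Returns a dict of captured values if the whole sentence is consumed,
    otherwise None. Hypothetical helper, not part of postagga.
    """
    result = {}
    i = 0
    for step in steps:
        start = i
        # Consume consecutive tokens whose tag this step accepts.
        while i < len(tagged) and tagged[i][1] in step["tags"]:
            i += 1
        if i == start and not step.get("optional", False):
            return None  # a required step matched nothing
        if "capture" in step and i > start:
            result[step["capture"]] = " ".join(w for w, _ in tagged[start:i])
    # The rule only succeeds if every token was consumed.
    return result if i == len(tagged) else None


# A sentence that has already been part-of-speech tagged (tags assumed).
tagged = [("send", "VB"), ("flowers", "NNS"), ("to", "TO"), ("Alice", "NNP")]

# One rule: verb, then object noun(s), then a preposition, then a proper noun.
steps = [
    {"tags": {"VB"}, "capture": "action"},
    {"tags": {"NN", "NNS"}, "capture": "object"},
    {"tags": {"TO", "IN"}},
    {"tags": {"NNP"}, "capture": "recipient"},
]

parsed = match_rule(tagged, steps)
# parsed == {"action": "send", "object": "flowers", "recipient": "Alice"}
```

Each step plays the role of a state that accepts certain tags, and only the steps marked with `capture` contribute to the structured output.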

Pre-Trained Multilingual Models

Includes ready-to-use models for English and French derived from annotated corpora like Framenet and Free French Treebank, accessible via namespaces for easy embedding in ClojureScript projects.
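The project's tags mention hidden Markov models and the Viterbi algorithm. As a rough picture of how a tagger of that family chooses tags from a trained model, here is a minimal Python sketch of Viterbi decoding. The `viterbi` function, the two-tag model, and every probability below are invented for illustration; none of them come from postagga's models or API.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most likely tag sequence for words under a toy HMM.

    `unk` is an assumed small emission probability for unseen words.
    """
    # Each column maps tag -> (best log-probability so far, previous tag).
    best = [{
        t: (math.log(start_p[t]) + math.log(emit_p[t].get(words[0], unk)), None)
        for t in tags
    }]
    for w in words[1:]:
        prev = best[-1]
        col = {}
        for t in tags:
            # Pick the best predecessor tag for t.
            score, back = max(
                (prev[p][0] + math.log(trans_p[p][t]), p) for p in tags
            )
            col[t] = (score + math.log(emit_p[t].get(w, unk)), back)
        best.append(col)
    # Backtrack from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for col in reversed(best[1:]):
        tag = col[tag][1]
        path.append(tag)
    return list(reversed(path))


# Toy model with made-up numbers: N = noun, V = verb.
tags = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit_p = {"N": {"dogs": 0.5, "bark": 0.1}, "V": {"dogs": 0.05, "bark": 0.6}}

decoded = viterbi(["dogs", "bark"], tags, start_p, trans_p, emit_p)
# decoded == ["N", "V"]
```

Working in log space keeps the products of small probabilities from underflowing on longer sentences.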

Dictionary Patching for Accuracy

Custom dictionaries (for example, of proper nouns) can patch words the tagger does not know, which improves tagging reliability. The README describes this in its patching workflow section.
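The general idea of patching can be sketched as follows, assuming a toy tagger that falls back to a default tag for unknown words. This Python sketch only illustrates dictionary overrides in general; `tag_words`, `patch_lexicon`, and the tag names are hypothetical and are not postagga's patching functions.

```python
def tag_words(words, lexicon, default_tag="NN"):
    """Toy tagger: look each word up in a lexicon, else use a default tag."""
    return [(w, lexicon.get(w.lower(), default_tag)) for w in words]


def patch_lexicon(lexicon, entries):
    """Return a new lexicon with custom entries added or overriding old ones."""
    patched = dict(lexicon)
    patched.update({w.lower(): t for w, t in entries.items()})
    return patched


base = {"fly": "VB", "me": "PRP", "to": "TO"}
words = ["Fly", "me", "to", "Paris"]

# Without the patch, the unknown word falls back to the default tag.
before = tag_words(words, base)
# before[-1] == ("Paris", "NN")

# With a custom dictionary of proper nouns, it is tagged as intended.
patched = patch_lexicon(base, {"Paris": "NNP"})
after = tag_words(words, patched)
# after[-1] == ("Paris", "NNP")
```

Returning a new lexicon instead of mutating the base one keeps the original model reusable with different dictionaries.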

Cons

Convoluted Rule Syntax

Defining parser rules involves complex state-machine concepts with steps, states, and keywords like :get-value and :!OR!, which the README admits can be confusing and error-prone for developers.

Limited Model Support and Licensing

Only provides pre-trained models for English and French from specific corpora with licensing restrictions, and training new models requires annotated data, limiting out-of-the-box usability for other languages.

Performance and Memory Concerns

Models are large data structures held in variables, and realizing one in full (for example by printing it) risks memory problems, as the README warns ('avoid realizing all of them like printing in your REPL!!!'). The tokenizers may also lack optimization for production-scale text processing.

Quick Stats

Stars: 162
Forks: 16
Contributors: 0
Open issues: 11
Last commit: 5 years ago
Created: 2017

Tags

#part-of-speech-tagging #bots #clojurescript #chatbots #viterbi-algorithm #natural-language-processing #hidden-markov-models #clojure #parser-generator #parser

Built With

Clojure
ClojureScript

Included in

Clojure (2.8k)

Related Projects

lmgrep

Grep-like utility based on Lucene Monitor compiled with GraalVM native-image

Stars: 200
Forks: 5
Last commit: 1 year ago