Open-Awesome

© 2026 Open-Awesome. Curated for the developer elite.


kagome

MIT · Go · v2.11.0

A self-contained Japanese morphological analyzer written in pure Go, tokenizing text into words and analyzing parts of speech.

965 stars · 59 forks · 0 contributors

What is kagome?

Kagome is an open-source Japanese morphological analyzer written in pure Go. It segments Japanese text into words and performs part-of-speech tagging, providing accurate linguistic analysis essential for natural language processing tasks like search indexing, text analysis, and language learning tools.

Target Audience

Developers and researchers working on Japanese natural language processing (NLP) applications, such as search engines, text analysis tools, language learning platforms, and linguistic research software.

Value Proposition

Developers choose Kagome for its self-contained binaries, which embed their dictionaries and so need no external dependencies, and for its range of deployment options: a RESTful API server, WebAssembly for browsers, and FFI support for integration with languages like Python and PHP.

Overview

Self-contained Japanese Morphological Analyzer written in pure Go

Use Cases

Best For

  • Building Japanese search engines that require optimized tokenization with search or extended segmentation modes.
  • Developing language learning tools that need client-side Japanese text analysis via WebAssembly in web browsers.
  • Creating scalable microservices for Japanese NLP using its production-ready RESTful API server mode.
  • Integrating Japanese morphological analysis into Python or PHP applications via its C library FFI.
  • Deploying Japanese tokenization services in containerized environments using its lightweight Docker images.
  • Conducting linguistic research or text analysis that requires accurate part-of-speech tagging and word segmentation.

Not Ideal For

  • Projects requiring the latest Japanese slang or neologisms via frequently updated dictionaries like mecab-ipadic-NEologd, which Kagome only supports experimentally.
  • Teams developing in languages not covered by FFI, such as Ruby or .NET, without resources to create custom bindings.
  • Applications needing advanced NLP features beyond tokenization and POS tagging, like named entity recognition or dependency parsing.
  • Environments where browser-based tokenization must avoid any external dependencies, as the WebAssembly demo requires Graphviz for visualization.

Pros & Cons

Pros

Self-Contained Binaries

Embeds dictionaries like MeCab-IPADIC and UniDic directly in the binary, eliminating external dependencies for easy deployment across platforms.

Multiple Segmentation Modes

Offers normal, search, and extended modes to tailor tokenization for specific use cases, such as search optimization with heuristic-based splitting.

Versatile Deployment Options

Includes a RESTful API server for scalable microservices, WebAssembly for client-side browser use, and FFI for Python/PHP integration, supporting diverse architectures.
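As a rough illustration of the REST option, the following sketch shows a small Go client that sends a sentence to a running Kagome server and selects a segmentation mode. It uses only the standard library. The endpoint path (`/tokenize`), the `PUT` method, the JSON field names (`sentence`, `mode`, `status`, `tokens`, `surface`, `pos`), the `kagome server` command, and port 6060 are assumptions drawn from the project's README and may differ between versions; the Go identifiers (`Token`, `newTokenizeRequest`, `decodeTokens`, `Tokenize`) are hypothetical names chosen for this example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// tokenizeRequest is the JSON body assumed for the server's tokenize endpoint.
type tokenizeRequest struct {
	Sentence string `json:"sentence"`
	Mode     string `json:"mode"` // assumed values: "normal", "search", "extended"
}

// Token holds the subset of response fields this sketch reads (assumed names).
type Token struct {
	Surface string   `json:"surface"`
	POS     []string `json:"pos"`
}

// tokenizeResponse is the assumed envelope returned by the server.
type tokenizeResponse struct {
	Status bool    `json:"status"`
	Tokens []Token `json:"tokens"`
}

// newTokenizeRequest builds the HTTP request without sending it.
func newTokenizeRequest(baseURL, sentence, mode string) (*http.Request, error) {
	body, err := json.Marshal(tokenizeRequest{Sentence: sentence, Mode: mode})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPut, baseURL+"/tokenize", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// decodeTokens parses a response body and returns its tokens.
func decodeTokens(data []byte) ([]Token, error) {
	var out tokenizeResponse
	if err := json.Unmarshal(data, &out); err != nil {
		return nil, err
	}
	if !out.Status {
		return nil, fmt.Errorf("tokenize: server reported failure")
	}
	return out.Tokens, nil
}

// Tokenize sends sentence to the server at baseURL using the given mode.
func Tokenize(client *http.Client, baseURL, sentence, mode string) ([]Token, error) {
	req, err := newTokenizeRequest(baseURL, sentence, mode)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("tokenize: unexpected status %s", resp.Status)
	}
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return decodeTokens(data)
}

func main() {
	// Assumes a server is already listening locally (for example, one
	// started with `kagome server`) on port 6060.
	tokens, err := Tokenize(http.DefaultClient, "http://localhost:6060", "関西国際空港", "search")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range tokens {
		fmt.Printf("%s\t%v\n", t.Surface, t.POS)
	}
}
```

Keeping request construction and response decoding in separate functions lets both be checked without a live server. Switching the `mode` argument between the assumed values is how a caller would compare the normal, search, and extended segmentations of the same sentence.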

Comprehensive Documentation

Provides extensive examples, a Japanese reference manual, and a community wiki, aiding rapid onboarding and troubleshooting.

Cons

Limited Dictionary Variety

The default dictionaries are static and dated, and support for NEologd and Korean is still experimental, so contemporary or specialized vocabulary may be missed.

Narrow FFI Language Support

The C library API is tested only with Python and PHP, leaving integration with other popular languages such as JavaScript/Node.js or Rust to community effort or workarounds.

WebAssembly Dependencies

The browser demo requires Graphviz for lattice visualization, adding setup complexity compared to purely self-contained client-side solutions.

Quick Stats

Stars: 965
Forks: 59
Contributors: 0
Open issues: 1
Last commit: 5 days ago
Created: 2014

Tags

#part-of-speech-tagging #nlp-library #hacktoberfest #webassembly #pos-tagging #rest-api #ffi #go-library #tokenization #self-contained #docker #tokenizer #segmentation

Built With

Go
WebAssembly
Docker

Included in

Go · 169.1k

Related Projects

spaGO

Self-contained Machine Learning and Natural Language Processing library in Go

Stars: 1,850
Forks: 89
Last commit: 1 year ago