Detect the language of text with support for up to 419 languages, more than any other library.
Franc is a natural language detection library that identifies the language of a given text string. It automatically determines the language of user-generated or document content, which is essential for applications like translation services, content filtering, and data categorization. It supports an exceptionally wide range of languages (up to 419), which the project describes as more than any other library.
Developers and data scientists building applications that process or analyze multilingual text, such as translation tools, content management systems, chatbots, or research platforms dealing with diverse linguistic data.
Developers choose Franc for its broad language coverage, configurable detection options, and permissive MIT licensing. A simple API that can also return a ranked list of candidate languages with relative scores makes it a versatile choice for global applications.
Natural language detection
Supports up to 419 languages in its largest build (franc-all), the most of any library according to the project, making it well suited to global applications like translation tools or content moderation with diverse data.
Lets callers restrict detection to a set of languages, exclude specific languages, and adjust the minimum text length (options named only, ignore, and minLength), providing flexibility for different use cases.
Returns ranked lists of possible matches with scores via francAll, enabling analysis beyond a single guess. The scores are relative to the best match (which scores 1) rather than true probabilities, so they are best used to compare candidates.
Works in Node.js, Deno, and modern browsers, and a CLI is also available, so it integrates easily into various workflows and environments.
Uses the MIT license. Franc is derived from earlier language-guessing projects whose authors gave permission for it to be distributed under MIT, so it can be used and modified without restrictive terms.
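The ranked output and filtering options described above can be combined to reject ambiguous guesses. The following is a minimal sketch, not part of franc: pickConfident is a hypothetical helper, the 0.1 margin is an arbitrary assumption, and the sample scores are made up for illustration. It assumes only that the input is an array of [code, score] pairs sorted best-first, which is the shape francAll returns.

```javascript
// Sketch: choose a language from a ranked list shaped like francAll output,
// i.e. an array of [iso6393Code, score] tuples sorted best-first.
// In a real project the list would come from franc itself, for example:
//   import {francAll} from 'franc'
//   const ranked = francAll(text, {only: ['por', 'spa', 'glg'], minLength: 20})

/**
 * Hypothetical helper (not part of franc).
 * Returns the top code only if it leads the runner-up by at least
 * `minMargin`; otherwise returns 'und' (undetermined).
 */
function pickConfident(ranked, minMargin = 0.1) {
  if (ranked.length === 0) return 'und'
  const [topCode, topScore] = ranked[0]
  // Input that is too short comes back as 'und'; pass that through.
  if (topCode === 'und') return 'und'
  // With a single candidate there is nothing to compare against.
  if (ranked.length === 1) return topCode
  const runnerUpScore = ranked[1][1]
  return topScore - runnerUpScore >= minMargin ? topCode : 'und'
}

// Illustrative data only: these scores are invented, not real franc output.
const clearCase = [['por', 1], ['glg', 0.77], ['spa', 0.74]]
const closeCase = [['nob', 1], ['dan', 0.97], ['nno', 0.95]]

console.log(pickConfident(clearCase)) // por
console.log(pickConfident(closeCase)) // und
```

Treating a narrow margin as undetermined is one possible policy. An application could instead keep the top few candidates and resolve them with other signals, such as a user's locale.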
Admits in the README that it's easily confused on small samples and recommends longer documents for reliable detection. Input below the minimum length (10 characters by default) is not classified at all, which limits use cases like social media monitoring.
Returns only three-letter ISO 639-3 codes, not the more common ISO 639-1 codes, necessitating extra steps for systems that expect different standards, as noted in the documentation.
The extensive language support can lead to slower detection times and higher memory usage compared to libraries optimized for speed with fewer languages.
Available only as ES modules, with no default export or CommonJS support, which may complicate integration in older Node.js projects or specific build setups.
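The extra step for systems that expect two-letter codes can be as small as a lookup table. The sketch below is an assumption about how an application might handle it, not something franc provides: toIso6391 is a hypothetical helper, and the table is deliberately partial. Many ISO 639-3 languages have no ISO 639-1 equivalent, so the helper returns undefined for anything it cannot map, including franc's 'und' result.

```javascript
// Partial ISO 639-3 -> ISO 639-1 lookup. Only a handful of languages are
// listed here; a real project would need a complete table or a dedicated
// package for the mapping.
const ISO_639_3_TO_1 = new Map([
  ['eng', 'en'],
  ['fra', 'fr'],
  ['deu', 'de'],
  ['spa', 'es'],
  ['por', 'pt'],
  ['ita', 'it'],
  ['nld', 'nl'],
  ['rus', 'ru'],
  ['jpn', 'ja'],
])

/**
 * Hypothetical helper (not part of franc).
 * Converts a three-letter ISO 639-3 code to its two-letter ISO 639-1 code,
 * or returns undefined when the table has no mapping.
 */
function toIso6391(code) {
  return ISO_639_3_TO_1.get(code)
}

console.log(toIso6391('por')) // pt
console.log(toIso6391('sco')) // undefined (Scots has no two-letter code)
console.log(toIso6391('und')) // undefined (franc could not determine a language)
```

Callers should decide explicitly what to do with undefined, for example falling back to a default locale or keeping the three-letter code alongside the two-letter one.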