A machine learning library for Clojure built on top of Weka, providing filters, classifiers, regression, and clustering algorithms.
clj-ml is a machine learning library for Clojure that provides a functional wrapper around the Weka toolkit. It allows developers to perform tasks like classification, regression, clustering, and data preprocessing using Clojure's expressive syntax and data structures, making advanced ML algorithms accessible within the Clojure ecosystem.
clj-ml is aimed at Clojure developers and data scientists who need to integrate machine learning into their applications without leaving the Clojure environment, and at Weka users who want a more functional interface.
It bridges Clojure and Weka with an idiomatic Clojure API over a wide range of established ML algorithms, so users do not have to write Java interop code directly, which speeds up experimentation and integration.
A machine learning library for Clojure built on top of Weka and friends
Wraps Weka's wide range of filters, classifiers, regression models, and clusterers, including decision trees, SVMs, and k-Means, as listed in the README's supported algorithms section.
Provides functional interfaces using Clojure data structures like maps and vectors, as shown in examples like dataset manipulation and instance conversion, making it natural for Clojure developers.
Supports loading and saving datasets in ARFF and CSV formats from local and remote files, demonstrated in the I/O examples with load-instances and save-instances functions.
Allows serialization of trained classifiers to disk and reloading via serialize-to-file and deserialize-from-file, enabling reuse of models without retraining.
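The I/O and serialization features above can be sketched as a short round trip. This is a minimal sketch, not the library's documented workflow: the function names `load-instances`, `save-instances`, `serialize-to-file`, and `deserialize-from-file` come from the description above, while the namespaces, the `make-dataset` attribute syntax, the `:decision-tree :c45` classifier keys, and the `/tmp` paths are assumptions recalled from the README and should be checked against the clj-ml version in use.

```clojure
;; Assumed namespaces; verify them against your clj-ml version.
(ns example.roundtrip
  (:require [clj-ml.data :refer [make-dataset dataset-set-class]]
            [clj-ml.io :refer [load-instances save-instances]]
            [clj-ml.classifiers :refer [make-classifier classifier-train]]
            [clj-ml.utils :refer [serialize-to-file deserialize-from-file]]))

;; Build a tiny in-memory dataset: two numeric attributes and one
;; nominal attribute (:kind) that will serve as the class.
(def ds
  (make-dataset "toy"
                [:length :width {:kind [:good :bad]}]
                [[12 34 :good]
                 [24 53 :bad]
                 [14 30 :good]
                 [26 55 :bad]]))

;; Save the dataset as ARFF and load it back (hypothetical file URL).
(save-instances :arff "file:///tmp/toy.arff" ds)
(def reloaded (load-instances :arff "file:///tmp/toy.arff"))

;; Mark the third attribute (index 2) as the class, then train.
;; Called for its side effect, since the return value differs between forks.
(dataset-set-class reloaded 2)
(def classifier (make-classifier :decision-tree :c45))
(classifier-train classifier reloaded)

;; Persist the trained model and restore it later without retraining.
(serialize-to-file classifier "/tmp/toy-classifier.bin")
(def restored (deserialize-from-file "/tmp/toy-classifier.bin"))
```

Reloading the dataset before training is only there to exercise both I/O functions; in practice the in-memory dataset could be trained on directly.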
Inherits Weka's weaknesses, such as the lack of modern deep learning algorithms and potential performance bottlenecks with large in-memory datasets, because it is a wrapper rather than a native implementation.
The README notes a pitfall in text classification: the word attributes extracted from the training and testing sets must stay consistent, so both sets need careful handling to avoid mismatches in feature extraction.
Requires Java 1.7+ and depends on Weka, which may lead to compatibility issues with other JVM libraries or require specific JVM configurations, adding setup complexity.
API documentation is linked but minimal; advanced usage relies on Weka's docs, and the README examples are basic, potentially hindering learning and troubleshooting for complex tasks.