Open-Awesome

© 2026 Open-Awesome. Curated for the developer elite.

pyFM

Python

A Python implementation of Factorization Machines for recommendation and classification tasks using stochastic gradient descent with adaptive regularization.

GitHub: 925 stars · 304 forks · 0 contributors

What is pyFM?

pyFM is a Python library that implements Factorization Machines, a model class used for supervised learning tasks like recommendation systems and classification. It estimates interactions between categorical variables in high-dimensional sparse data by combining feature engineering with factorization techniques. The library uses stochastic gradient descent with adaptive regularization as its learning method.
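As a rough illustration of what "estimating interactions" means, the sketch below scores one sparse example with the standard second-order factorization machine formula: a bias, a linear term, and pairwise terms weighted by dot products of latent vectors. It is not taken from pyFM's source; the function names `fm_predict` and `fm_predict_naive` and all parameter values are made up for this example.

```python
def fm_predict(x, w0, w, V):
    """Score one sparse example with a second-order factorization machine.

    x  : dict mapping feature index -> value (non-zero entries only)
    w0 : global bias
    w  : list of linear weights, one per feature
    V  : list of latent vectors, one per feature, each of length k
    """
    k = len(V[0])
    linear = sum(w[i] * xi for i, xi in x.items())
    pairwise = 0.0
    for f in range(k):
        # Identity: sum_{i<j} v_if * v_jf * x_i * x_j
        #         = 0.5 * ((sum_i v_if * x_i)^2 - sum_i (v_if * x_i)^2)
        s = sum(V[i][f] * xi for i, xi in x.items())
        s_sq = sum((V[i][f] * xi) ** 2 for i, xi in x.items())
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_predict_naive(x, w0, w, V):
    """Same model, written as an explicit loop over feature pairs."""
    idx = sorted(x)
    linear = sum(w[i] * x[i] for i in idx)
    pairwise = 0.0
    for a, i in enumerate(idx):
        for j in idx[a + 1:]:
            dot = sum(vi * vj for vi, vj in zip(V[i], V[j]))
            pairwise += dot * x[i] * x[j]
    return w0 + linear + pairwise


# Hypothetical parameters: 6 features, k = 2 latent factors.
w0 = 0.1
w = [0.2, 0.0, 0.0, -0.1, 0.0, 0.4]
V = [[0.1, 0.2], [0.0, 0.0], [0.0, 0.0], [0.3, -0.1], [0.0, 0.0], [0.5, 0.5]]
x = {0: 1.0, 3: 1.0, 5: 0.5}  # e.g. one user column, one item column, one real value

fast = fm_predict(x, w0, w, V)
slow = fm_predict_naive(x, w0, w, V)
```

Both functions compute the same score. The first one avoids the pair loop, so its cost grows linearly with the number of non-zero features, which is why the model suits sparse data.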

Target Audience

Data scientists and machine learning engineers building recommendation systems, click-through rate prediction models, or any application requiring modeling of feature interactions in sparse datasets.

Value Proposition

Developers choose pyFM for its straightforward implementation of Factorization Machines in Python, its fit with scikit-learn workflows, and adaptive regularization that adjusts the regularization strength during training rather than requiring it to be tuned by hand.

Overview

Factorization machines in Python

Use Cases

Best For

  • Building recommendation systems with implicit or explicit user feedback
  • Click-through rate (CTR) prediction in online advertising
  • Modeling feature interactions in high-dimensional categorical data
  • Academic research or prototyping with Factorization Machines
  • Extending scikit-learn pipelines with factorization-based models
  • Handling cold-start problems in collaborative filtering

Not Ideal For

  • Projects with dense, low-dimensional tabular data where linear models suffice without feature interactions
  • Real-time inference systems requiring sub-millisecond prediction latency
  • Teams needing deep learning models for complex non-linear patterns beyond factorization
  • Applications where full model interpretability and feature importance scores are critical

Pros & Cons

Pros

Adaptive Regularization Automation

Implements stochastic gradient descent with adaptive regularization, which adjusts the regularization automatically during training and reduces the need to hand-tune it. The training logs in the README show MSE decreasing across iterations, although a falling training error does not by itself demonstrate that overfitting is avoided.

Seamless scikit-learn Integration

Designed to work with scikit-learn's DictVectorizer for easy feature encoding from dictionary data, simplifying preprocessing and fitting into existing Python machine learning workflows.

Flexible Task Support

Supports both regression and classification tasks with configurable parameters, demonstrated in the README examples for rating prediction and binary classification.

Efficient Sparse Data Handling

Accepts categorical and real-valued features transformed into sparse matrices via DictVectorizer, mimicking libFM's approach for high-dimensional sparse data common in recommendation systems.
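The encoding step described above can be sketched with scikit-learn's `DictVectorizer`. The sample records are illustrative, in the spirit of a user/item/age toy dataset. String values are one-hot encoded into `name=value` columns, while numeric values pass through as a single real-valued column.

```python
from sklearn.feature_extraction import DictVectorizer

# Illustrative records: string values are treated as categorical,
# numeric values are kept as real-valued features.
train = [
    {"user": "1", "item": "5", "age": 19},
    {"user": "2", "item": "43", "age": 33},
    {"user": "3", "item": "20", "age": 55},
]

vectorizer = DictVectorizer()
X = vectorizer.fit_transform(train)  # SciPy sparse matrix, one row per record

# Column names are sorted: "age", then "item=...", then "user=...".
feature_names = vectorizer.feature_names_
print(X.shape)
print(feature_names)
```

The README examples pass a sparse matrix built this way, together with a target vector, to the model for fitting. New data should be encoded with `vectorizer.transform(...)` so that its columns line up with the training matrix.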

Cons

Limited Performance Optimizations

No mention of GPU support, multi-threading, or advanced optimizations, making it potentially slower for large-scale datasets compared to C++ libraries like libFM.

Basic Documentation and Examples

The README provides only toy and basic real-world examples; advanced usage, hyperparameter tuning guidance, and production deployment tips are lacking.

Dependency on Specific Data Format

The documented workflow converts data to dictionary format for DictVectorizer, which adds a preprocessing step if the data already lives in NumPy arrays or pandas DataFrames.
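The extra preprocessing step is usually small. A minimal sketch, assuming tabular rows with known column names: the helper `rows_to_dicts` and the sample data are hypothetical. ID columns are cast to strings so that a dictionary-based encoder such as DictVectorizer treats them as categories instead of magnitudes.

```python
def rows_to_dicts(rows, columns, categorical):
    """Turn tabular rows into the list-of-dicts form used for encoding.

    rows        : iterable of row tuples
    columns     : column names, in row order
    categorical : names of columns holding IDs or categories
    """
    records = []
    for row in rows:
        record = {}
        for name, value in zip(columns, row):
            # Strings are one-hot encoded downstream; floats stay numeric.
            record[name] = str(value) if name in categorical else float(value)
        records.append(record)
    return records


columns = ["user", "item", "age"]
rows = [(1, 5, 19), (2, 43, 33), (3, 20, 55)]
records = rows_to_dicts(rows, columns, categorical={"user", "item"})
```

For a pandas DataFrame, `df.to_dict(orient="records")` produces the same list-of-dicts shape once the ID columns have been cast to strings.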

Quick Stats

Stars: 925
Forks: 304
Contributors: 0
Open Issues: 31
Last commit: 5 years ago
Created: since 2012

Tags

#python-library #collaborative-filtering #sgd #recommendation-system #factorization-machines #machine-learning

Built With

  • scikit-learn
  • Python
  • NumPy

Included in

Data Science · 3.4k
Auto-fetched 1 hour ago

Related Projects

ThunderSVM

ThunderSVM: A Fast SVM Library on GPUs and CPUs

Stars: 1,624
Forks: 222
Last commit: 2 years ago

fastFM

fastFM: A Library for Factorization Machines

Stars: 1,087
Forks: 205
Last commit: 3 years ago

tffm

TensorFlow implementation of an arbitrary order Factorization Machine

Stars: 778
Forks: 173
Last commit: 4 years ago

scikit-rvm

Relevance Vector Machine implementation using the scikit-learn API.

Stars: 237
Forks: 75
Last commit: 10 months ago