Fast Python Collaborative Filtering for Implicit Datasets
This project provides fast Python implementations of several popular recommendation algorithms for implicit feedback datasets:
- Alternating Least Squares as described in the papers Collaborative Filtering for Implicit Feedback Datasets and Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering.
- Bayesian Personalized Ranking
- Item-Item Nearest Neighbour models, using Cosine, TFIDF or BM25 as a distance metric
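The implicit-feedback ALS model referenced above alternates between solving for user and item factors in closed form, weighting observed interactions by a confidence term. A minimal dense NumPy sketch of that idea (a toy illustration of the Hu/Koren/Volinsky formulation, not this library's optimized Cython/CUDA implementation; all names and hyperparameter values here are illustrative):

```python
import numpy as np

# Toy implicit-feedback matrix (users x items); values are interaction counts.
R = np.array([
    [3, 0, 1, 0],
    [0, 2, 0, 4],
    [1, 0, 5, 0],
], dtype=float)

def als_implicit(R, factors=2, alpha=40.0, reg=0.1, iterations=15, seed=0):
    """Dense ALS for implicit feedback: confidence c = 1 + alpha * r,
    binary preference p = (r > 0), L2 regularization reg."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    X = rng.normal(scale=0.1, size=(n_users, factors))  # user factors
    Y = rng.normal(scale=0.1, size=(n_items, factors))  # item factors
    C = 1.0 + alpha * R          # confidence weights
    P = (R > 0).astype(float)    # binary preferences
    I = reg * np.eye(factors)
    for _ in range(iterations):
        # Solve each user's least-squares problem holding item factors fixed.
        for u in range(n_users):
            Cu = np.diag(C[u])
            X[u] = np.linalg.solve(Y.T @ Cu @ Y + I, Y.T @ Cu @ P[u])
        # Then solve each item's problem holding user factors fixed.
        for i in range(n_items):
            Ci = np.diag(C[:, i])
            Y[i] = np.linalg.solve(X.T @ Ci @ X + I, X.T @ Ci @ P[:, i])
    return X, Y

X, Y = als_implicit(R)
scores = X @ Y.T  # predicted preference scores, higher = stronger recommendation
```

The dense per-row solves above are cubic in the factor count; the conjugate gradient variant cited in the paper list avoids forming and inverting these systems explicitly, which is what makes large-scale fitting practical.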
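The item-item nearest neighbour models score pairs of items by a similarity measure over the interaction matrix. A minimal NumPy sketch of the cosine case (illustrative only; the library's implementations are sparse and also support TFIDF and BM25 weighting):

```python
import numpy as np

# Toy interaction matrix with items as rows and users as columns.
item_user = np.array([
    [3.0, 0.0, 1.0],
    [0.0, 2.0, 0.0],
    [3.0, 0.0, 2.0],
])

# Cosine similarity: normalize each item vector, then take dot products.
norms = np.linalg.norm(item_user, axis=1, keepdims=True)
normalized = item_user / norms
similarity = normalized @ normalized.T

# Most similar item to item 0, excluding item 0 itself.
most_similar = np.argsort(similarity[0])[::-1][1]
```

Here items 0 and 2 were interacted with by the same users, so item 2 comes out as the nearest neighbour of item 0.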
All models have multi-threaded training routines, using Cython and OpenMP to fit the models in parallel across all available CPU cores. In addition, the ALS and BPR models both have custom CUDA kernels, enabling fitting on compatible GPUs. This library also supports using approximate nearest neighbour libraries such as Annoy, NMSLIB and Faiss to speed up making recommendations.
```shell
pip install implicit
```
```python
import implicit

# initialize a model
model = implicit.als.AlternatingLeastSquares(factors=50)

# train the model on a sparse matrix of item/user/confidence weights
model.fit(item_user_data)

# recommend items for a user
user_items = item_user_data.T.tocsr()
recommendations = model.recommend(userid, user_items)

# find related items
related = model.similar_items(itemid)
```
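The `fit` call above expects a sparse matrix with items as rows and users as columns. A minimal sketch of building one from raw (user, item, weight) triples with SciPy, which this library already requires (the example ids and weights are made up):

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical raw interaction triples: user id, item id, confidence weight.
users = np.array([0, 0, 1, 2])
items = np.array([0, 2, 1, 3])
weights = np.array([3.0, 1.0, 2.0, 5.0])

# Items as rows, users as columns, as fit() expects.
item_user_data = coo_matrix((weights, (items, users))).tocsr()

# Transposed user/item view for recommend(), as in the snippet above.
user_items = item_user_data.T.tocsr()
```

CSR is the natural format here: both training and recommendation iterate over the nonzero entries of one row at a time.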
Articles about Implicit
These blog posts describe the algorithms that power this library:
There are also several other blog posts about using Implicit to build recommendation systems:
This library requires SciPy version 0.16 or later. Running on OSX requires an OpenMP compiler, which can be installed with Homebrew:

```shell
brew install gcc
```