Approximate Alternating Least Squares

This library can use several approximate nearest neighbours (ANN) libraries (NMSLib, Annoy and Faiss) to speed up the recommend and similar_items methods of the AlternatingLeastSquares model.

The speedup from these methods can be significant, at the risk of missing some relevant results:

[Figure: _images/recommendperf.png]

See this post comparing the different ANN libraries for more details.
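
One common way to quantify the risk of missing relevant results is recall: the fraction of the exact top-N results that the approximate search also returns. The sketch below is illustrative only. The recall_at_n helper and the id lists are hypothetical and are not part of this library.

```python
def recall_at_n(exact_ids, approx_ids):
    """Fraction of the exact top-N ids that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        return 1.0
    return len(exact & set(approx_ids)) / len(exact)


# Hypothetical result lists for one query (ids are made up for illustration).
exact_top5 = [12, 7, 33, 5, 21]
approx_top5 = [12, 33, 7, 40, 21]

# Four of the five exact results were found by the approximate search.
recall = recall_at_n(exact_top5, approx_top5)
print(recall)  # 0.8
```

Averaging this value over a sample of users or items gives a rough picture of how much accuracy a given index configuration trades for speed.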

NMSLibAlternatingLeastSquares

class implicit.approximate_als.NMSLibAlternatingLeastSquares(approximate_similar_items=True, approximate_recommend=True, method='hnsw', index_params=None, query_params=None, random_state=None, *args, **kwargs)

Bases: implicit.als.AlternatingLeastSquares

Speeds up the base AlternatingLeastSquares model by using NMSLib to create approximate nearest neighbours indices of the latent factors.

Parameters:
  • method (str, optional) – The NMSLib method to use
  • index_params (dict, optional) – Optional params to send to the createIndex call in NMSLib
  • query_params (dict, optional) – Optional query time params for the NMSLib ‘setQueryTimeParams’ call
  • approximate_similar_items (bool, optional) – Whether or not to build an NMSLib index for computing similar_items
  • approximate_recommend (bool, optional) – Whether or not to build an NMSLib index for the recommend call
  • random_state (int, RandomState or None, optional) – The random state for seeding the initial item and user factors. Default is None.
similar_items_index

NMSLib index for looking up similar items in the cosine space formed by the latent item_factors

Type: nmslib.FloatIndex
recommend_index

NMSLib index for looking up similar items in the inner product space formed by the latent item_factors

Type: nmslib.FloatIndex
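
The two index attributes above search different spaces, and the difference matters: cosine similarity ignores vector length, while the inner product rewards it. A minimal sketch with made-up 2-d factors shows the same query producing two different rankings. The helper names and values are assumptions for illustration, not library code.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy item factors (made-up values, not taken from a trained model).
item_factors = {
    "a": [1.0, 0.0],
    "b": [3.0, 3.0],   # long vector pointing in a different direction
    "c": [0.9, 0.1],   # short vector pointing almost the same way as "a"
}
query = item_factors["a"]

# Inner product favours the long vector "b".
by_inner = sorted(item_factors, key=lambda i: dot(query, item_factors[i]), reverse=True)
# Cosine favours direction, so "c" ranks above "b".
by_cosine = sorted(item_factors, key=lambda i: cosine(query, item_factors[i]), reverse=True)

print(by_inner)   # ['b', 'a', 'c']
print(by_cosine)  # ['a', 'c', 'b']
```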
fit(Ciu, show_progress=True)

Factorizes the item_users matrix.

After calling this method, the members ‘user_factors’ and ‘item_factors’ will be initialized with a latent factor model of the input data.

The item_users matrix does double duty here. It defines which items are liked by which users (P_iu in the original paper), as well as how much confidence we have that the user liked the item (C_iu).

The negative items are implicitly defined: this code assumes that a positive entry in the item_users matrix means that the user liked the item. Negatives are left unset in this sparse matrix, and the library treats unset entries as P_iu = 0 and C_iu = 1. Negative items can also be given a higher confidence by passing a negative value, which indicates that the user disliked the item.
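
The mapping from matrix entries to preference and confidence described above can be sketched as follows. This is a reading of the text rather than the library's code: preference_and_confidence is a hypothetical helper, and using the magnitude of a negative entry as its confidence is an assumption.

```python
def preference_and_confidence(value):
    """Map one item_users entry to (P_iu, C_iu) as the text describes.

    Assumption: a negative entry uses its absolute value as the confidence.
    """
    if value > 0:
        return 1, value      # liked, confidence taken from the entry
    if value < 0:
        return 0, -value     # disliked, confidence = magnitude (assumed)
    return 0, 1              # unset entry: P_iu = 0, C_iu = 1


# One item's row for users 0..3; 0.0 stands in for an unset sparse entry.
row = [5.0, 0.0, -3.0, 1.0]
pairs = [preference_and_confidence(v) for v in row]
print(pairs)  # [(1, 5.0), (0, 1), (0, 3.0), (1, 1.0)]
```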

Parameters:
  • item_users (csr_matrix) – Matrix of confidences for the liked items, passed as the Ciu argument in the signature above. This should be a csr_matrix where the rows are the items, the columns are the users that liked each item, and the values are the confidence that the user liked the item.
  • show_progress (bool, optional) – Whether to show a progress bar during fitting
recommend(userid, user_items, N=10, filter_already_liked_items=True, filter_items=None, recalculate_user=False)

Recommends items for a user

Calculates the N best recommendations for a user, and returns a list of (itemid, score) tuples.

Parameters:
  • userid (int) – The userid to calculate recommendations for
  • user_items (csr_matrix) – A sparse matrix of shape (number_users, number_items). This lets us look up the liked items and their weights for the user. It is used to filter already-liked items out of the output and, potentially, to calculate the best items for this user.
  • N (int, optional) – The number of results to return
  • filter_already_liked_items (bool, optional) – When true, don’t return items present in the training set that were rated by the specified user.
  • filter_items (sequence of ints, optional) – List of extra item ids to filter out from the output
  • recalculate_user (bool, optional) – When true, don’t rely on stored user state and instead recalculate from the passed in user_items
Returns:

List of (itemid, score) tuples

Return type:

list
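
For reference, the exact computation that an approximate index stands in for can be sketched as a brute-force scan over all item factors. This is a simplified illustration, not the library's implementation: recommend_exact is a hypothetical helper, plain lists stand in for the factor arrays, and the liked argument stands in for the user's row of user_items.

```python
import heapq


def recommend_exact(user_factor, item_factors, liked, N=10,
                    filter_already_liked_items=True, filter_items=None):
    """Brute-force top-N by inner product, scanning every item."""
    excluded = set(filter_items or [])
    if filter_already_liked_items:
        excluded |= set(liked)
    scored = (
        (itemid, sum(u * v for u, v in zip(user_factor, factor)))
        for itemid, factor in enumerate(item_factors)
        if itemid not in excluded
    )
    # Keep the N highest-scoring (itemid, score) tuples.
    return heapq.nlargest(N, scored, key=lambda pair: pair[1])


# Made-up factors for four items and one user.
item_factors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
user_factor = [2.0, 1.0]

# Item 2 is already liked and item 3 is explicitly filtered out.
recs = recommend_exact(user_factor, item_factors, liked=[2], N=2, filter_items=[3])
print(recs)  # [(0, 2.0), (1, 1.0)]
```

The scan costs time proportional to the number of items for every call, which is the cost the approximate indices are meant to avoid.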

similar_items(itemid, N=10)

Calculates a list of similar items

Parameters:
  • itemid (int) – The row id of the item to retrieve similar items for
  • N (int, optional) – The number of similar items to return
Returns:

List of (itemid, score) tuples

Return type:

list

AnnoyAlternatingLeastSquares

class implicit.approximate_als.AnnoyAlternatingLeastSquares(approximate_similar_items=True, approximate_recommend=True, n_trees=50, search_k=-1, random_state=None, *args, **kwargs)

Bases: implicit.als.AlternatingLeastSquares

A version of the AlternatingLeastSquares model that uses an Annoy index to calculate similar items and recommend items.

Parameters:
  • n_trees (int, optional) – The number of trees to use when building the Annoy index. More trees give higher precision when querying.
  • search_k (int, optional) – Provides a way to search more of the trees at query time, trading slower queries for more accurate results.
  • approximate_similar_items (bool, optional) – Whether or not to build an Annoy index for computing similar_items
  • approximate_recommend (bool, optional) – Whether or not to build an Annoy index for the recommend call
  • random_state (int, RandomState or None, optional) – The random state for seeding the initial item and user factors. Default is None.
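
To give some intuition for why more trees give higher precision, here is a heavily simplified sketch of a random-projection forest in plain Python. It is not Annoy's implementation: the splitting rule, the leaf size, and every name below are assumptions for illustration, and the sketch has no equivalent of search_k because it visits exactly one leaf per tree.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def build_tree(ids, vectors, rng, leaf_size=4):
    """Recursively split ids by the sign of a random projection."""
    if len(ids) <= leaf_size:
        return ids                      # leaf: a plain list of ids
    normal = [rng.gauss(0.0, 1.0) for _ in vectors[ids[0]]]
    left = [i for i in ids if dot(vectors[i], normal) < 0]
    right = [i for i in ids if dot(vectors[i], normal) >= 0]
    if not left or not right:           # degenerate split: stop here
        return ids
    return (normal,
            build_tree(left, vectors, rng, leaf_size),
            build_tree(right, vectors, rng, leaf_size))


def build_forest(vectors, n_trees, seed=0):
    rng = random.Random(seed)
    all_ids = list(range(len(vectors)))
    return [build_tree(all_ids, vectors, rng) for _ in range(n_trees)]


def leaf_for(tree, query):
    """Walk from the root to the leaf on the query's side of each split."""
    while isinstance(tree, tuple):
        normal, left, right = tree
        tree = left if dot(query, normal) < 0 else right
    return tree


def candidates(forest, query):
    """Union of the ids found in the query's leaf of every tree."""
    found = set()
    for tree in forest:
        found.update(leaf_for(tree, query))
    return found


# Made-up data: 200 random 8-d vectors standing in for item factors.
data_rng = random.Random(42)
vectors = [[data_rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
query = vectors[0]

few = candidates(build_forest(vectors, n_trees=1), query)
many = candidates(build_forest(vectors, n_trees=10), query)
print(len(few), len(many))
```

Each extra tree contributes another leaf of candidates, so a larger forest is more likely to contain the true nearest neighbours, at the cost of a larger index and more work per query. The candidates would then be ranked exactly by the chosen similarity.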
similar_items_index

Annoy index for looking up similar items in the cosine space formed by the latent item_factors

Type: annoy.AnnoyIndex
recommend_index

Annoy index for looking up similar items in the inner product space formed by the latent item_factors

Type: annoy.AnnoyIndex
fit(Ciu, show_progress=True)

Factorizes the item_users matrix.

After calling this method, the members ‘user_factors’ and ‘item_factors’ will be initialized with a latent factor model of the input data.

The item_users matrix does double duty here. It defines which items are liked by which users (P_iu in the original paper), as well as how much confidence we have that the user liked the item (C_iu).

The negative items are implicitly defined: this code assumes that a positive entry in the item_users matrix means that the user liked the item. Negatives are left unset in this sparse matrix, and the library treats unset entries as P_iu = 0 and C_iu = 1. Negative items can also be given a higher confidence by passing a negative value, which indicates that the user disliked the item.

Parameters:
  • item_users (csr_matrix) – Matrix of confidences for the liked items, passed as the Ciu argument in the signature above. This should be a csr_matrix where the rows are the items, the columns are the users that liked each item, and the values are the confidence that the user liked the item.
  • show_progress (bool, optional) – Whether to show a progress bar during fitting
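
The row and column layout described for item_users is easy to get backwards, since recommend expects the transposed (users by items) layout. The sketch below builds both layouts from the same observations using plain Python lists for the three CSR arrays. In practice a scipy.sparse.csr_matrix is what the library expects; the to_csr helper and the observations here are hypothetical.

```python
def to_csr(triples, n_rows):
    """Build CSR arrays (indptr, indices, data) from (row, col, value) triples."""
    rows = [[] for _ in range(n_rows)]
    for r, c, v in triples:
        rows[r].append((c, v))
    indptr, indices, data = [0], [], []
    for entries in rows:
        for c, v in sorted(entries):
            indices.append(c)
            data.append(v)
        indptr.append(len(indices))   # end offset of this row
    return indptr, indices, data


# Hypothetical (item, user, confidence) observations.
observations = [(0, 1, 3.0), (0, 2, 1.0), (2, 0, 5.0)]

# Rows are items, columns are users: the layout fit expects.
item_users = to_csr(observations, n_rows=3)
# Rows are users, columns are items: the layout recommend expects.
user_items = to_csr([(u, i, c) for i, u, c in observations], n_rows=3)

print(item_users)  # ([0, 2, 2, 3], [1, 2, 0], [3.0, 1.0, 5.0])
print(user_items)  # ([0, 1, 2, 3], [2, 0, 0], [5.0, 3.0, 1.0])
```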
recommend(userid, user_items, N=10, filter_already_liked_items=True, filter_items=None, recalculate_user=False)

Recommends items for a user

Calculates the N best recommendations for a user, and returns a list of (itemid, score) tuples.

Parameters:
  • userid (int) – The userid to calculate recommendations for
  • user_items (csr_matrix) – A sparse matrix of shape (number_users, number_items). This lets us look up the liked items and their weights for the user. It is used to filter already-liked items out of the output and, potentially, to calculate the best items for this user.
  • N (int, optional) – The number of results to return
  • filter_already_liked_items (bool, optional) – When true, don’t return items present in the training set that were rated by the specified user.
  • filter_items (sequence of ints, optional) – List of extra item ids to filter out from the output
  • recalculate_user (bool, optional) – When true, don’t rely on stored user state and instead recalculate from the passed in user_items
Returns:

List of (itemid, score) tuples

Return type:

list

similar_items(itemid, N=10)

Calculates a list of similar items

Parameters:
  • itemid (int) – The row id of the item to retrieve similar items for
  • N (int, optional) – The number of similar items to return
Returns:

List of (itemid, score) tuples

Return type:

list

FaissAlternatingLeastSquares

class implicit.approximate_als.FaissAlternatingLeastSquares(approximate_similar_items=True, approximate_recommend=True, nlist=400, nprobe=20, use_gpu=False, random_state=None, *args, **kwargs)

Bases: implicit.als.AlternatingLeastSquares

Speeds up the base AlternatingLeastSquares model by using Faiss to create approximate nearest neighbours indices of the latent factors.

Parameters:
  • nlist (int, optional) – The number of cells to use when building the Faiss index.
  • nprobe (int, optional) – The number of cells to visit to perform a search.
  • use_gpu (bool, optional) – Whether or not to run Faiss on the GPU. Requires faiss to have been built with GPU support.
  • approximate_similar_items (bool, optional) – Whether or not to build a Faiss index for computing similar_items
  • approximate_recommend (bool, optional) – Whether or not to build a Faiss index for the recommend call
  • random_state (int, RandomState or None, optional) – The random state for seeding the initial item and user factors. Default is None.
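
The roles of nlist and nprobe can be illustrated with a toy inverted-file index in plain Python. This is a sketch of the general idea only, not Faiss itself: it picks random vectors as centroids instead of training them, uses squared Euclidean distance rather than the inner product or cosine spaces used by the indices here, and all names are hypothetical.

```python
import random


def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, nlist, seed=0):
    """Pick nlist vectors as centroids and bucket every id by nearest centroid."""
    rng = random.Random(seed)
    centroids = [vectors[i] for i in rng.sample(range(len(vectors)), nlist)]
    cells = [[] for _ in range(nlist)]
    for i, v in enumerate(vectors):
        nearest = min(range(nlist), key=lambda c: sqdist(v, centroids[c]))
        cells[nearest].append(i)
    return centroids, cells


def search_ivf(index, vectors, query, N, nprobe):
    """Scan only the nprobe cells whose centroids are closest to the query."""
    centroids, cells = index
    order = sorted(range(len(centroids)), key=lambda c: sqdist(query, centroids[c]))
    found = [i for c in order[:nprobe] for i in cells[c]]
    # Rank the candidates exactly; ties are broken by id.
    return sorted(found, key=lambda i: (sqdist(query, vectors[i]), i))[:N]


# Made-up data: 300 random 4-d vectors standing in for item factors.
data_rng = random.Random(1)
vectors = [[data_rng.random() for _ in range(4)] for _ in range(300)]
index = build_ivf(vectors, nlist=10)
query = vectors[0]

exact = sorted(range(len(vectors)), key=lambda i: (sqdist(query, vectors[i]), i))[:5]
approx = search_ivf(index, vectors, query, N=5, nprobe=2)    # scans 2 of 10 cells
full = search_ivf(index, vectors, query, N=5, nprobe=10)     # scans every cell
print(approx, exact)
```

With nprobe equal to nlist every cell is scanned and the result matches the exact search. Smaller nprobe values scan fewer vectors and may miss neighbours that fall in unvisited cells.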
similar_items_index

Faiss index for looking up similar items in the cosine space formed by the latent item_factors

Type: faiss.IndexIVFFlat
recommend_index

Faiss index for looking up similar items in the inner product space formed by the latent item_factors

Type: faiss.IndexIVFFlat
fit(Ciu, show_progress=True)

Factorizes the item_users matrix.

After calling this method, the members ‘user_factors’ and ‘item_factors’ will be initialized with a latent factor model of the input data.

The item_users matrix does double duty here. It defines which items are liked by which users (P_iu in the original paper), as well as how much confidence we have that the user liked the item (C_iu).

The negative items are implicitly defined: this code assumes that a positive entry in the item_users matrix means that the user liked the item. Negatives are left unset in this sparse matrix, and the library treats unset entries as P_iu = 0 and C_iu = 1. Negative items can also be given a higher confidence by passing a negative value, which indicates that the user disliked the item.

Parameters:
  • item_users (csr_matrix) – Matrix of confidences for the liked items, passed as the Ciu argument in the signature above. This should be a csr_matrix where the rows are the items, the columns are the users that liked each item, and the values are the confidence that the user liked the item.
  • show_progress (bool, optional) – Whether to show a progress bar during fitting
recommend(userid, user_items, N=10, filter_already_liked_items=True, filter_items=None, recalculate_user=False)

Recommends items for a user

Calculates the N best recommendations for a user, and returns a list of (itemid, score) tuples.

Parameters:
  • userid (int) – The userid to calculate recommendations for
  • user_items (csr_matrix) – A sparse matrix of shape (number_users, number_items). This lets us look up the liked items and their weights for the user. It is used to filter already-liked items out of the output and, potentially, to calculate the best items for this user.
  • N (int, optional) – The number of results to return
  • filter_already_liked_items (bool, optional) – When true, don’t return items present in the training set that were rated by the specified user.
  • filter_items (sequence of ints, optional) – List of extra item ids to filter out from the output
  • recalculate_user (bool, optional) – When true, don’t rely on stored user state and instead recalculate from the passed in user_items
Returns:

List of (itemid, score) tuples

Return type:

list

similar_items(itemid, N=10)

Calculates a list of similar items

Parameters:
  • itemid (int) – The row id of the item to retrieve similar items for
  • N (int, optional) – The number of similar items to return
Returns:

List of (itemid, score) tuples

Return type:

list