LogisticMatrixFactorization

class implicit.lmf.LogisticMatrixFactorization

Logistic Matrix Factorization

A collaborative filtering recommender model that learns a probability distribution over whether a user likes an item. The algorithm is described in Logistic Matrix Factorization for Implicit Feedback Data <https://web.stanford.edu/~rezab/nips2014workshop/submits/logmat.pdf>
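As a rough illustration of the model described in the referenced paper, the probability that a user interacts with an item is the logistic function applied to the dot product of their latent factors plus per-user and per-item bias terms. The sketch below is plain Python, not the library's implementation; the function name `like_probability` and the example vectors are hypothetical.

```python
import math

def like_probability(user_vec, item_vec, user_bias=0.0, item_bias=0.0):
    """Logistic probability that a user interacts with an item (illustrative only)."""
    # Dot product of the latent factors plus the two bias terms
    logit = sum(u * v for u, v in zip(user_vec, item_vec)) + user_bias + item_bias
    # Logistic (sigmoid) function maps the score into (0, 1)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical two-factor vectors for one user and one item
p = like_probability([0.5, -0.2], [1.0, 0.4], user_bias=0.1, item_bias=-0.3)
print(round(p, 4))
```

A score of zero maps to a probability of 0.5; larger dot products push the probability toward 1.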

Parameters:
  • factors (int, optional) – The number of latent factors to compute
  • learning_rate (float, optional) – The learning rate to apply for updates during training
  • regularization (float, optional) – The regularization factor to use
  • dtype (data-type, optional) – Specifies whether to generate 64 bit or 32 bit floating point factors
  • iterations (int, optional) – The number of training epochs to use when fitting the data
  • neg_prop (int, optional) – The proportion of negative samples. For example, “neg_prop = 30” means that if a user has seen 5 items, then 5 * 30 = 150 negative samples are used for training.
  • use_gpu (bool, optional) – Fit on the GPU if available
  • num_threads (int, optional) – The number of threads to use for fitting the model. This only applies for the native extensions. Specifying 0 means to default to the number of cores on the machine.
  • random_state (int, RandomState or None, optional) – The random state for seeding the initial item and user factors. Default is None.
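The neg_prop arithmetic above can be sketched in a few lines. This is only an illustration of the sample count: drawing negatives uniformly from all item ids, without excluding seen items, is an assumption made here for simplicity, and `sample_negatives` is a hypothetical helper rather than part of the library.

```python
import random

def sample_negatives(seen_items, num_items, neg_prop=30, seed=0):
    """Draw len(seen_items) * neg_prop item ids to use as negative samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_samples = len(seen_items) * neg_prop
    # Assumption: negatives are drawn uniformly over all item ids
    return [rng.randrange(num_items) for _ in range(n_samples)]

# A user who has seen 5 items, with neg_prop = 30
negatives = sample_negatives(seen_items=[1, 4, 7, 8, 9], num_items=1000, neg_prop=30)
print(len(negatives))  # 5 * 30 = 150
```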
item_factors

Array of latent factors for each item in the training set

Type: ndarray
user_factors

Array of latent factors for each user in the training set

Type: ndarray
fit()

Factorizes the item_users matrix

Parameters:
  • item_users (coo_matrix) – Matrix of confidences for the liked items. This matrix should be a coo_matrix where the rows of the matrix are the items and the columns are the users that liked each item. BPR ignores the weight value of the matrix right now; it treats nonzero entries as a binary signal that the user liked the item.
  • show_progress (bool, optional) – Whether to show a progress bar
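The orientation of item_users (items as rows, users as columns) is easy to get backwards. The sketch below arranges hypothetical (user, item, confidence) triples into COO-style parallel lists in that orientation, using plain Python lists as a stand-in for a sparse matrix.

```python
# Hypothetical (user, item, confidence) interaction triples
interactions = [(0, 2, 1.0), (0, 3, 4.0), (1, 2, 2.0)]
num_users, num_items = 2, 4

# COO-style parallel lists: row index = item, column index = user
rows = [item for _, item, _ in interactions]
cols = [user for user, _, _ in interactions]
data = [conf for _, _, conf in interactions]
shape = (num_items, num_users)

print(rows, cols, data, shape)
```

With SciPy installed, the same lists can be passed as `scipy.sparse.coo_matrix((data, (rows, cols)), shape=shape)` to build the actual sparse matrix.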
rank_items()

Ranks the given items for a user and returns a sorted item list.

Parameters:
  • userid (int) – The userid to calculate recommendations for
  • user_items (csr_matrix) – A sparse matrix of shape (number_users, number_items). This lets us look up the liked items and their weights for the user. This is used to filter out items that have already been liked from the output, and to also potentially calculate the best items for this user.
  • selected_items (List of itemids) – The items to rank for the user
  • recalculate_user (bool, optional) – When true, don’t rely on stored user state and instead recalculate from the passed in user_items
Returns:

List of (itemid, score) tuples. It contains only items that appear in the selected_items input parameter.

Return type:

list
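Conceptually, ranking a selected subset amounts to scoring only those items against the user's factors and sorting by score. The following is a minimal sketch under that assumption, using a plain dot product as the score; `rank_selected` and the toy factors are hypothetical and do not reproduce the library's internals.

```python
def rank_selected(user_vec, item_factors, selected_items):
    """Score only the selected items and return (itemid, score) pairs, best first."""
    scored = [
        (itemid, sum(u * v for u, v in zip(user_vec, item_factors[itemid])))
        for itemid in selected_items
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy factors for four items, two latent factors each
item_factors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]]
ranked = rank_selected([2.0, 1.0], item_factors, selected_items=[1, 2, 3])
print(ranked)
```

Item 0 never appears in the result because it was not part of selected_items.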

recommend()

Recommends items for a user

Calculates the N best recommendations for a user, and returns a list of (itemid, score) tuples.

Parameters:
  • userid (int) – The userid to calculate recommendations for
  • user_items (csr_matrix) – A sparse matrix of shape (number_users, number_items). This lets us look up the liked items and their weights for the user. This is used to filter out items that have already been liked from the output, and to also potentially calculate the best items for this user.
  • N (int, optional) – The number of results to return
  • filter_already_liked_items (bool, optional) – When true, don’t return items present in the training set that were rated by the specified user.
  • filter_items (sequence of ints, optional) – List of extra item ids to filter out from the output
  • recalculate_user (bool, optional) – When true, don’t rely on stored user state and instead recalculate from the passed in user_items
Returns:

List of (itemid, score) tuples

Return type:

list
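The interplay of N, filter_already_liked_items, and filter_items can be sketched as follows: exclude the unwanted item ids, score the rest, and keep the N highest. This is an illustrative approximation with a plain dot-product score; `recommend_sketch` is a hypothetical helper, not the library's method.

```python
import heapq

def recommend_sketch(user_vec, item_factors, liked_items, N=10,
                     filter_already_liked_items=True, filter_items=None):
    """Return the N best (itemid, score) pairs after applying both filters."""
    excluded = set(filter_items or [])
    if filter_already_liked_items:
        excluded |= set(liked_items)
    scored = (
        (itemid, sum(u * v for u, v in zip(user_vec, vec)))
        for itemid, vec in enumerate(item_factors)
        if itemid not in excluded
    )
    # nlargest returns the pairs sorted from highest to lowest score
    return heapq.nlargest(N, scored, key=lambda pair: pair[1])

item_factors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]]
top = recommend_sketch([2.0, 1.0], item_factors, liked_items=[0],
                       N=2, filter_items=[3])
print(top)
```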

recommend_all()

Recommends items for all users

Calculates the N best recommendations for all users, and returns a numpy ndarray of shape (number_users, N) with item ids in descending probability order

Parameters:
  • self (implicit.als.AlternatingLeastSquares) – The fitted recommendation model
  • user_items (csr_matrix) – A sparse matrix of shape (number_users, number_items). This lets us look up the liked items and their weights for the user. This is used to filter out items that have already been liked from the output, and to also potentially calculate the best items for this user.
  • N (int, optional) – The number of results to return
  • recalculate_user (bool, optional) – When true, don’t rely on stored user state and instead recalculate from the passed in user_items
  • filter_already_liked_items (bool, optional) – When true, filters out items that have already been liked in user_items
  • filter_items (list, optional) – List of item ids to exclude from recommendations for all users
  • num_threads (int, optional) – The number of threads to use for sorting scores in parallel by users. Default is number of cores on machine
  • show_progress (bool, optional) – Whether to show a progress bar
  • batch_size (int, optional) – To optimise memory usage during matrix multiplication, users are split into groups and scored iteratively. By default batch_size == num_threads * 100
  • users_items_offset (int, optional) – Allows passing a slice of the user_items matrix to split calculations
Returns:

Array of (number_users, N) with item’s ids in descending probability order

Return type:

numpy ndarray
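The batching described for batch_size can be pictured as splitting the user range into consecutive slices. In the sketch below, `user_batches` is a hypothetical helper, and treating batch_size=0 as "use the default of num_threads * 100" is an assumption made for illustration.

```python
def user_batches(num_users, num_threads, batch_size=0):
    """Yield (start, end) user index ranges; end is exclusive."""
    if batch_size <= 0:
        # Default stated in the parameter description: num_threads * 100
        batch_size = num_threads * 100
    for start in range(0, num_users, batch_size):
        yield start, min(start + batch_size, num_users)

batches = list(user_batches(num_users=1000, num_threads=4))
print(batches)
```

Each slice is scored on its own, so peak memory scales with the batch size rather than with the total number of users.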

similar_items()

Calculates a list of similar items

Parameters:
  • itemid (int) – The row id of the item to retrieve similar items for
  • N (int, optional) – The number of similar items to return
Returns:

List of (itemid, score) tuples

Return type:

list
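One common way to compare items in a factor model is cosine similarity between their latent vectors. The sketch below assumes that measure, which this page does not state explicitly; `similar_items_sketch` and the toy factors are hypothetical.

```python
import math

def similar_items_sketch(item_factors, itemid, N=10):
    """Return the N items most similar to itemid by cosine similarity."""
    norms = [math.sqrt(sum(v * v for v in vec)) for vec in item_factors]
    target = item_factors[itemid]
    scored = []
    for other, vec in enumerate(item_factors):
        denom = norms[other] * norms[itemid]
        dot = sum(a * b for a, b in zip(target, vec))
        # Guard against zero-length vectors
        scored.append((other, dot / denom if denom else 0.0))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:N]

item_factors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]]
similar = similar_items_sketch(item_factors, itemid=0, N=2)
print(similar)
```

Under this measure the queried item itself ranks first with a score of 1.0.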

similar_users()

Calculates a list of similar users

Parameters:
  • userid (int) – The row id of the user to retrieve similar users for
  • N (int, optional) – The number of similar users to return
Returns:

List of (userid, score) tuples

Return type:

list