AlternatingLeastSquares

class implicit.als.AlternatingLeastSquares(factors=100, regularization=0.01, dtype=numpy.float32, use_native=True, use_cg=True, use_gpu=False, iterations=15, calculate_training_loss=False, num_threads=0, random_state=None)
Alternating Least Squares
A recommendation model based on the algorithms described in the paper ‘Collaborative Filtering for Implicit Feedback Datasets’, with the performance optimizations described in ‘Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering’.
Parameters:  factors (int, optional) – The number of latent factors to compute
 regularization (float, optional) – The regularization factor to use
 dtype (datatype, optional) – Specifies whether to generate 64 bit or 32 bit floating point factors
 use_native (bool, optional) – Use native extensions to speed up model fitting
 use_cg (bool, optional) – Use a faster Conjugate Gradient solver to calculate factors
 use_gpu (bool, optional) – Fit on the GPU if available. By default the GPU is used only when one is available.
 iterations (int, optional) – The number of ALS iterations to use when fitting data
 calculate_training_loss (bool, optional) – Whether to log the training loss at each iteration
 num_threads (int, optional) – The number of threads to use for fitting the model. This only applies to the native extensions. Specifying 0 defaults to the number of cores on the machine.
 random_state (int, RandomState or None, optional) – The random state for seeding the initial item and user factors. Default is None.
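The per-user update at the heart of the referenced paper can be sketched in a few lines of NumPy. This is an illustrative sketch, not the library's implementation: the function name `solve_user_factors` and the toy data are assumptions, and a direct linear solve stands in for the native and Conjugate Gradient solvers selected by use_native and use_cg.

```python
import numpy as np

def solve_user_factors(item_factors, confidence, preference, regularization):
    """Closed-form update for one user: (Y^T C_u Y + lambda*I) x_u = Y^T C_u p_u."""
    Y = item_factors                           # shape (n_items, factors)
    CY = Y * confidence[:, None]               # scale each item row by its confidence
    A = Y.T @ CY + regularization * np.eye(Y.shape[1])
    b = CY.T @ preference                      # Y^T C_u p_u
    return np.linalg.solve(A, b)

# Toy data (hypothetical): 6 items, 3 latent factors.
rng = np.random.default_rng(0)
Y = rng.normal(size=(6, 3))
confidence = np.array([1.0, 5.0, 1.0, 3.0, 1.0, 1.0])   # 1 means "unobserved"
preference = (confidence > 1).astype(float)             # liked items get p = 1
x_u = solve_user_factors(Y, confidence, preference, regularization=0.01)
```

A full ALS iteration repeats this solve for every user with the item factors held fixed, then for every item with the user factors held fixed.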

item_factors
Array of latent factors for each item in the training set
Type: ndarray

user_factors
Array of latent factors for each user in the training set
Type: ndarray

explain(userid, user_items, itemid, user_weights=None, N=10)
Provides explanations for why the item is liked by the user.
Parameters:  userid (int) – The userid to explain recommendations for
 user_items (csr_matrix) – Sparse matrix containing the liked items for the user
 itemid (int) – The itemid to explain recommendations for
 user_weights (ndarray, optional) – Precomputed Cholesky decomposition of the weighted user liked items. Useful for speeding up repeated calls to this function; this value is returned by each call.
 N (int, optional) – The number of liked items to show the contribution for
Returns:  total_score (float) – The total predicted score for this user/item pair
 top_contributions (list) – A list of the top N (itemid, score) contributions for this user/item pair
 user_weights (ndarray) – A factorized representation of the user. Passing this in to future ‘explain’ calls will lead to noticeable speedups
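One way to see where the per-item contributions come from is to split the predicted score into one term per liked item. The sketch below assumes preference 1 for liked items and confidence 1 for everything else; the name `explain_sketch` and its arguments are hypothetical, and it uses a plain linear solve where the library caches a Cholesky factor.

```python
import numpy as np

def explain_sketch(item_factors, liked, confidence, itemid, regularization, N=10):
    """Split the predicted score for (user, itemid) into per-liked-item terms."""
    Y = item_factors
    c = np.ones(Y.shape[0])                # confidence 1 for items the user has not liked
    c[liked] = confidence
    A = (Y.T * c) @ Y + regularization * np.eye(Y.shape[1])
    w = np.linalg.solve(A, Y[itemid])      # w = A^{-1} y_i (A is symmetric)
    contributions = confidence * (Y[liked] @ w)   # c_uj * y_j^T A^{-1} y_i per liked item j
    total_score = float(contributions.sum())
    order = np.argsort(-contributions)[:N]
    top = [(int(liked[j]), float(contributions[j])) for j in order]
    return total_score, top

# Toy data (hypothetical): 5 items, 3 factors, a user who liked items 0, 2 and 4.
rng = np.random.default_rng(0)
Y = rng.normal(size=(5, 3))
liked = np.array([0, 2, 4])
conf = np.array([2.0, 5.0, 3.0])
total, top = explain_sketch(Y, liked, conf, itemid=1, regularization=0.01, N=2)
```

The contributions sum to the same score obtained by solving for the user's factors and taking the dot product with the item's factors.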

fit(item_users, show_progress=True)
Factorizes the item_users matrix.
After calling this method, the members ‘user_factors’ and ‘item_factors’ will be initialized with a latent factor model of the input data.
The item_users matrix does double duty here. It defines which items are liked by which users (P_iu in the original paper), as well as how much confidence we have that the user liked the item (C_iu).
The negative items are implicitly defined: this code assumes that positive values in the item_users matrix mean that the user liked the item. The negatives are left unset in this sparse matrix: the library will assume that means P_iu = 0 and C_iu = 1 for all these items. Negative items can also be passed with a higher confidence value by passing a negative value, indicating that the user disliked the item.
Parameters:  item_users (csr_matrix) – Matrix of confidences for the liked items. This matrix should be a csr_matrix where the rows of the matrix are the item, the columns are the users that liked that item, and the value is the confidence that the user liked the item.
 show_progress (bool, optional) – Whether to show a progress bar during fitting
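The double duty of the item_users matrix can be illustrated with a small dense stand-in. This is one reading of the description above, not the library's code: it assumes the confidence of a stored entry is its absolute value, and it uses a dense array where fit expects a csr_matrix.

```python
import numpy as np

# Dense stand-in for item_users: rows are items, columns are users.
# Zero means "unset"; a negative value marks a dislike (assumed convention).
item_users = np.array([
    [3.0, 0.0, 0.0],
    [0.0, 1.0, -2.0],
])

# P_iu: 1 where the user liked the item, 0 for unset and disliked entries.
preference = (item_users > 0).astype(float)

# C_iu: magnitude of the stored value, or 1 for unset entries.
confidence = np.where(item_users != 0, np.abs(item_users), 1.0)
```

Here user 2 disliked item 1 with confidence 2, so that entry gets P_iu = 0 with more weight than an unset entry would carry.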

rank_items()
Ranks the given items for a user and returns a sorted item list.
Parameters:  userid (int) – The userid to calculate recommendations for
 user_items (csr_matrix) – A sparse matrix of shape (number_users, number_items). This lets us look up the liked items and their weights for the user. This is used to filter out items that have already been liked from the output, and to also potentially calculate the best items for this user.
 selected_items (List of itemids) – The items to rank for the user
 recalculate_user (bool, optional) – When true, don’t rely on stored user state and instead recalculate from the passed in user_items
Returns: List of (itemid, score) tuples. It contains only items that appear in the selected_items input parameter.
Return type: list

recommend()
Recommends items for a user
Calculates the N best recommendations for a user, and returns a list of (itemid, score) tuples.
Parameters:  userid (int) – The userid to calculate recommendations for
 user_items (csr_matrix) – A sparse matrix of shape (number_users, number_items). This lets us look up the liked items and their weights for the user. This is used to filter out items that have already been liked from the output, and to also potentially calculate the best items for this user.
 N (int, optional) – The number of results to return
 filter_already_liked_items (bool, optional) – When true, don’t return items present in the training set that were rated by the specified user.
 filter_items (sequence of ints, optional) – List of extra item ids to filter out from the output
 recalculate_user (bool, optional) – When true, don’t rely on stored user state and instead recalculate from the passed in user_items
Returns: List of (itemid, score) tuples
Return type: list
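Conceptually, a recommendation is a dot product between the user's factors and every item's factors, followed by filtering and a top-N selection. The sketch below is an assumption about the general approach rather than the library's code path; `recommend_sketch` and its `liked_items` argument are hypothetical names.

```python
import numpy as np

def recommend_sketch(user_factors, item_factors, userid, liked_items, N=10):
    """Score every item for one user and return the top N (itemid, score) pairs."""
    scores = item_factors @ user_factors[userid]      # one score per item (a new array)
    liked = np.asarray(liked_items, dtype=int)
    scores[liked] = -np.inf                           # mask items the user already liked
    top = np.argsort(-scores)[:N]                     # highest scores first
    return [(int(i), float(scores[i])) for i in top if np.isfinite(scores[i])]

# Toy factors (hypothetical): 1 user, 4 items, 2 latent factors.
user_factors = np.array([[1.0, 0.0]])
item_factors = np.array([[1.0, 0.0], [0.5, 0.0], [2.0, 0.0], [0.0, 1.0]])
result = recommend_sketch(user_factors, item_factors, userid=0, liked_items=[2], N=2)
```

Item 2 has the highest raw score for this user but is excluded because it is already liked, which mirrors the role of filter_already_liked_items.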

recommend_all()
Recommends items for all users
Calculates the N best recommendations for all users, and returns a numpy ndarray of shape (number_users, N) with item ids in descending probability order
Parameters:  self (implicit.als.AlternatingLeastSquares) – The fitted recommendation model
 user_items (csr_matrix) – A sparse matrix of shape (number_users, number_items). This lets us look up the liked items and their weights for the user. This is used to filter out items that have already been liked from the output, and to also potentially calculate the best items for this user.
 N (int, optional) – The number of results to return
 recalculate_user (bool, optional) – When true, don’t rely on stored user state and instead recalculate from the passed in user_items
 filter_already_liked_items (bool, optional) – When true, filter out items that have already been liked in user_items
 filter_items (list, optional) – List of item ids to exclude from recommendations for all users
 num_threads (int, optional) – The number of threads to use for sorting scores in parallel by users. Default is number of cores on machine
 show_progress (bool, optional) – Whether to show a progress bar
 batch_size (int, optional) – To limit memory usage during matrix multiplication, users are separated into groups and scored iteratively. By default batch_size == num_threads * 100
 users_items_offset (int, optional) – Allows passing a slice of the user_items matrix in order to split the calculation
Returns: Array of shape (number_users, N) with item ids in descending probability order
Return type: numpy ndarray
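The role of batch_size can be sketched as follows: users are scored one group at a time, so only a (batch_size, number_items) score matrix is held in memory at once. This is a simplified illustration under stated assumptions, with a hypothetical function name, no filtering of liked items, and no threading.

```python
import numpy as np

def recommend_all_sketch(user_factors, item_factors, N=10, batch_size=100):
    """Return an array of shape (number_users, N) of item ids, best score first."""
    n_users = user_factors.shape[0]
    N = min(N, item_factors.shape[0])
    out = np.empty((n_users, N), dtype=np.int64)
    for start in range(0, n_users, batch_size):
        stop = min(start + batch_size, n_users)
        # Score only this batch of users against all items.
        scores = user_factors[start:stop] @ item_factors.T
        out[start:stop] = np.argsort(-scores, axis=1)[:, :N]
    return out

# Toy factors (hypothetical): 7 users, 9 items, 4 latent factors.
rng = np.random.default_rng(0)
users = rng.normal(size=(7, 4))
items = rng.normal(size=(9, 4))
ids = recommend_all_sketch(users, items, N=3, batch_size=2)
```

The batch size changes peak memory use, not the result: any batch_size should yield the same ranking.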

similar_items()
Calculates a list of similar items
Parameters:  itemid (int) – The row id of the item to retrieve similar items for
 N (int, optional) – The number of similar items to return
Returns: List of (itemid, score) tuples
Return type: list
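Item similarity in a latent factor model is commonly measured with cosine similarity between factor vectors. The sketch below assumes that measure; the function name is hypothetical and the library's exact normalization may differ.

```python
import numpy as np

def similar_items_sketch(item_factors, itemid, N=10):
    """Return the N items whose factor vectors are closest in cosine similarity."""
    norms = np.linalg.norm(item_factors, axis=1)
    norms[norms == 0] = 1e-10                          # avoid division by zero
    scores = item_factors @ item_factors[itemid] / (norms * norms[itemid])
    top = np.argsort(-scores)[:N]
    return [(int(i), float(scores[i])) for i in top]

# Toy factors (hypothetical): 4 items, 2 latent factors.
item_factors = np.array([[1.0, 0.0], [2.0, 1.0], [0.0, 1.0], [1.0, 1.0]])
similar = similar_items_sketch(item_factors, itemid=0, N=3)
```

In this sketch the queried item itself comes back first with a similarity of 1. The same computation on user_factors gives a corresponding sketch of similar users.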

similar_users()
Calculates a list of similar users
Parameters:  userid (int) – The row id of the user to retrieve similar users for
 N (int, optional) – The number of similar users to return
Returns: List of (userid, score) tuples
Return type: list