Recommender systems recognize patterns of user preference in
products and provide personalized recommendations for users and are used in a
variety of online shopping websites. Content filtering approach and collaborative
filtering are two main methods used in recommendation systems. In former
method, the characteristics of users and products are stored but the latter,
needs the user behavior in the past. Collaborative filtering is more accurate
than content based methods but doesn’t work very accurate for new products and
users. Collaborative filtering approach is based on neighborhood methods and
latent factor models. Neighborhood methods finds the relation between both
users and items. In other words, the interest of a user to and item, could be
based on the rating of the similar items. But in latent factor models, both
items and users are characterized. Matrix factorization methods that are
concerned in this paper are the base of latent factor models which characterize
users and items by vectors. More interaction between user and item factors
recommends items for the user. In these systems usually input data are derived
from explicit feedback and marked on the dimensions of a matrix which are users
and items. The matrix is likely sparse but there are methods that can overcome
this issue.
Matrix factorization models are modeled as inner product of
users and items on a joint latent factor space. Due to sparsity of the matrix,
this method has difficulties. There are methods that can dominate this problem
and add data to the matrix but are expensive and create massive amount of data.
Therefore, methods using observed data are recommended.
The factor vectors of users and items are learned by Minimizing
the regularized square error on the set of known ratings with stochastic
gradient descent and alternating least squares which are learning algorithms. The
extent of the regularization is determined by cross-validation.
Various data aspects and other application-specific requirements
can be considered in matrix factorization approach. As we know, the variation in
ratings is because of effects associated with either users or items. This variation
is called biases or intercepts and are independent of interactions. To consider
the biases, observed rating is broken down into global average, item bias, user
bias and user-item interaction. The square error function should be minimized
in order to make the system learn.
When there isn’t enough rating from the users, we can
incorporate additional sources of information like implicit feedback or user
attributes. The matrix factorization model integrates all signal sources so
items can get a similar treatment when necessary.
In reality product perceptions, popularity and customer’s
inclination evolve over time. This temporal effects should be considered in the
system. The matrix factorization approach assumes these terms vary over time:
item biases, user biases and user preferences. Temporal dynamics also affect
the interaction between users and items.
In another situation, the ratings don’t have same weight or
confidence. The matrix factorization model can easily accept varying confidence
levels by giving less weight to less meaningful observations.
The more complex factor models with different amount of
details, are more accurate. Since there are temporal effects in the data, the
temporal components are very important.
The experiments in this paper are done on Netflix dataset. The
results are superior to classical nearest-neighbor techniques and their model
is memory-efficient. Also many crucial aspects of the data in forms of
feedback, temporal dynamics and confidence levels are integrated to make the
model accurate.
No comments:
Post a Comment