Latent Semantic Analysis

Collaborative Filtering using backpropagation

Collaborative Filtering is a method of extracting features that explain users personal preferences based on previous actions of that user, as well as every other user in the network. For example, when we want to find movies a person likes, based on the ratings of other people in the network. The most common way of feature extraction is by using Matrix Factorisation.

The goal is to find lower dimensional matrices P and Q such that the PxQ (Matrix Multiplication) results in a higher dimensional R. :

[p11p12p21p22p31p32]ā‹…[q11q12q13q21q22q23]=[r11r12r13r21r22r23r31r32r33]\begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \\ p_{31} & p_{32} \end{bmatrix} \cdot \begin{bmatrix} q_{11} & q_{12} & q_{13} \\ q_{21} & q_{22} & q_{23} \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix}

This means that the matrix R, which represents direct ratings from all peer to all movies is reduced to 2 lower dimensional matrices, such that P represents the peoples preferences of movie genres, and Q represents how much a movie belongs to a particular genre (e.g. A movie can be an action movie that contains some elements of comedy)

In the real world the R matrix is sparse, meaning that people are not rating all the movies in the existence, instead that action happens rarely, and the point is to learn these representation from limited data.

The learning of feature matrices is achieved through backpropagation, by using gradient descent to iteratively update values of matrices P and Q until the desired error is achieved.

Calculating the prediction for one movie goes as following:

r11=p11āˆ—q11+p12āˆ—q21r_{11} = p_{11}*q_{11} + p_{12}*q_{21}

After we do this for all the elements, the error function is calculated as following:

E=āˆ‘i(Riāˆ’ri)2E = \sum_{i}{(R_i - r_i)^2}

Where:

  • RiR_i = True result taken from R matrix

  • rir_i = Result returned from a model

  • EE = Total error for the whole network

Based on this error, the parameters (edges) are updated by taking the derivative of update function and changing the values of matrix elements such that it goes in the direction of reducing the error.

When the error is low enough, we consider the model to be fit. Based on the resulting P and Q matrices, we can infer/predict the missing ratings from the matrix R. That way we can recommend a user the movies that they haven't seen before.

Verifiability of Matrix Factorisation method, including wider class of GNNs is the next stage of evolution of OpenRank.

Last updated