# Latent Semantic Analysis

Collaborative Filtering is a method of extracting features that explain users personal preferences based on previous actions of that user, as well as every other user in the network. For example, when we want to find movies a person likes, based on the ratings of other people in the network. The most common way of feature extraction is by using Matrix Factorisation.

The goal is to find lower dimensional matrices P and Q such that the PxQ (Matrix Multiplication) results in a higher dimensional R. :

$$
\begin{bmatrix}
p\_{11} & p\_{12}  \\
p\_{21} & p\_{22} \\
p\_{31} & p\_{32}
\end{bmatrix}
\cdot
\begin{bmatrix}
q\_{11} & q\_{12} & q\_{13} \\
q\_{21} & q\_{22} & q\_{23}
\end{bmatrix}
=============

\begin{bmatrix}
r\_{11} & r\_{12} & r\_{13} \\
r\_{21} & r\_{22} & r\_{23} \\
r\_{31} & r\_{32} & r\_{33}
\end{bmatrix}
$$

This means that the matrix R, which represents direct ratings from all peer to all movies is reduced to 2 lower dimensional matrices, such that P represents the peoples preferences of movie genres, and Q represents how much a movie belongs to a particular genre (e.g. A movie can be an action movie that contains some elements of comedy)

In the real world the R matrix is sparse, meaning that people are not rating all the movies in the existence, instead that action happens rarely, and the point is to learn these representation from limited data.

The learning of feature matrices is achieved through backpropagation, by using gradient descent to iteratively update values of matrices P and Q until the desired error is achieved.

Calculating the prediction for one movie goes as following:

$$
r\_{11} = p\_{11}\*q\_{11} + p\_{12}\*q\_{21}
$$

After we do this for all the elements, the error function is calculated as following:

$$
E = \sum\_{i}{(R\_i - r\_i)^2}
$$

Where:

* $$R\_i$$ = True result taken from R matrix
* $$r\_i$$ = Result returned from a model
* $$E$$ = Total error for the whole network

Based on this error, the parameters (edges) are updated by taking the derivative of update function and changing the values of matrix elements such that it goes in the direction of reducing the error.

When the error is low enough, we consider the model to be fit.\
Based on the resulting P and Q matrices, we can infer/predict the missing ratings from the matrix R. That way we can recommend a user the movies that they haven't seen before.

Verifiability of Matrix Factorisation method, including wider class of GNNs is the next stage of evolution of OpenRank.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.openrank.com/reputation-algorithms/latent-semantic-analysis.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
