Hubs and Authorities

An alternative reputation algorithm for bipartite graphs

Hubs and Authorities is a reputation algorithm that is similar to EigenTrust, but works for bipartite graphs (for example, a graph of people who like books; the edges are between people and the books they like, but not between people or between books). The algorithm defines a good book as a book that is liked by good readers. And it defines a good reader, recursively, as somebody who likes good books. Computing the resulting rankings involves a 2-step iterative process that computes the left and right principal singular vectors. Like EigenTrust, each iteration involves a matrix-vector multiplication, and the computation eventually converges.

It consists of 2 kinds of participants:

Hubs - nodes that don’t contain any crucial information in the graph, and their job is mostly be a central place that contains information about other nodes, e.g. to point to other nodes in the network (keep a list of nodes that do contain useful information)
Authorities - nodes that contain useful information

Nodes can take a role of both of these (can be both a Hub and Authority) or it can take a single role. The calculation of scores is done separately for these participants, but they are calculated based on the scores of opposing role:

Scores for hubs is calculated from outgoing scores to authorities
Scores of authorities are calculated with incoming scores of hubs

Unlike EigenTrust, the original weights (scores) of nodes (both hubs and authorities) are initialised to 1, meaning that they all have equal importance in the beginning. At later iterations, the importance of authorities is derived from the amount of "votes" received by the hubs. Then, the importance of hubs is derived from how many important authorities they voted for.

The scores of both hubs and authorities are normalised after each "voting process", i.e. after each iteration. One of the reasons is to prevent the scores from exploding into infinity.

The edges (connections) between nodes are described with the Adjacency Matrix $A$ , and the update functions are defined as:

h = A*a \\ a = A^T*h

The most common normalisation function used is:

h_i^{(t+1)} = \frac{h_i^{(t+1)}}{\sqrt{\sum_i{(h_i^{(t+1)})^2}}}

The resulting scores, in our example case, should tell us who are good readers, and what are good books, which is mostly derived from the idea that good books are read from many readers, and good readers are the ones who read good books.

However, unlike EigenTrust, H&A is easier to be taken advantage of by sybil actors, who can easily represent themselves as good readers by simply saying that they like books that are considered good by other peers, which is why further optimisations need to be made in order to use H&A at scale.

For further detail on how H&A computation can be verified, refer to the OpenRank paper.

PreviousEigenTrust NextLatent Semantic Analysis

Last updated 1 year ago