| (1) |
Monday, December 13, 2010
Other applications of the ideas behind Page Rank
One of the problem a search engine has to solve is to sort the
potentially very long list of matches for a given query with a
sensible ordering. Sensible here means that ideally the more
interesting results come first.
Many strategies could be applied here (and there is a lot of
ongoing research on the subject). Many complex strategies are
already applied. However, the basic idea [4]
Google used was very simple (as an idea) and very robust.
It is easy to see that the idea was simple. Let r(Pi) be the
ranking of page Pi, Li be the set of pages linking to Pi,
and |Pi| be the number of outbound links of |Pi| then:
Simple, indeed. With the minor problem that we have no starting point
to trigger the computation. Luckily enough, iterative methods come to
help. More on this in a few paragraph. I have not yet clarified the
"robust" part.
In fact the kinds of ideas underlying the PageRank are rather old.
Similar ideas have been spotted in bibliometrics, econometrics and
sociometrics. This is not to be considered a defect (lack of novelty),
rather I consider it a feature as in "already field-tested".
Using words, PageRank states that a page is important if many
important pages link to it. Which is not different from the usual
definition of importance of journals, where a journal is important
if many important journals contain articles referring to articles
in that journal. Well, in the journal world things are easier as there
are far less journals than web pages and there is a clear temporal order
that can be used. Moreover, articles usually cite only papers written
before them (at least until some research has a paper describing his
implementation of time-machines, then he could easily refer to articles
not yet published).
In 1949 Seeley proposed to "measure" the popularity of a child
(he was researching relationships in children social group) considering
the popularity of the children which enlisted him among his friends.
That is, once again, something like Pi = f(F11,...,F1k), where
F1j is a child who listed 1 among his friends. The very same
criterion can be applied to recommendations.
Leontief won Nobel Prize for Economy for his ideas on econometrics on
the input-output model, which is yet an application of the same underlying
ideas. Here I won't give further details. These example come from
[1] where further details are given.
I explore further how to compute the PageRank here.
Etichette:
Algorithms,
Complex Networks
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment