PageRank is at the heart of Google’s calculations. Now, Stanford University researchers have developed a technique to speed up the calculations.
To speed up PageRank, the Stanford team developed a trio of techniques in numerical linear algebra. First, in the WWW2003 paper, they describe so-called “extrapolation” methods, which make some assumptions about the Web’s link structure that aren’t true, but permit a quick and easy computation of PageRank. Because the assumptions aren’t true, the PageRank isn’t exactly correct, but it’s close and can be refined using the original PageRank algorithm. The Stanford researchers have shown that their extrapolation techniques can speed up PageRank by 50 percent in realistic conditions and by up to 300 percent under less realistic conditions.
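To make the "approximate first, then refine" idea concrete, here is a minimal Python sketch. It uses the classical Aitken delta-squared formula as a stand-in extrapolation step; that choice, the toy `links` dictionary format, and the function names `power_step`, `aitken_extrapolate` and `pagerank` are assumptions made for illustration, not the Stanford team's code. The extrapolated vector is only a guess, so the ordinary power iteration keeps running afterwards to refine it.

```python
def power_step(links, rank, damping=0.85):
    """One PageRank power-iteration step over {page: [pages it links to]}."""
    n = len(links)
    # Rank held by pages without outlinks is spread evenly over all pages.
    dangling = sum(rank[p] for p, outs in links.items() if not outs)
    new = {p: (1.0 - damping) / n + damping * dangling / n for p in links}
    for page, outs in links.items():
        for target in outs:
            new[target] += damping * rank[page] / len(outs)
    return new


def aitken_extrapolate(x0, x1, x2):
    """Component-wise Aitken delta-squared estimate from three iterates."""
    guess = {}
    for p in x0:
        denom = x2[p] - 2.0 * x1[p] + x0[p]
        if abs(denom) > 1e-12:
            guess[p] = x0[p] - (x1[p] - x0[p]) ** 2 / denom
        else:
            guess[p] = x2[p]  # no usable curvature, keep the latest value
    # The estimate is only approximate: clip negatives and renormalize.
    guess = {p: max(v, 0.0) for p, v in guess.items()}
    total = sum(guess.values())
    if total == 0.0:
        return dict(x2)
    return {p: v / total for p, v in guess.items()}


def pagerank(links, extrapolate_at=(), tol=1e-10, max_iter=1000):
    """Power iteration; extrapolates once at each iteration in extrapolate_at."""
    n = len(links)
    rank = {p: 1.0 / n for p in links}
    history = [rank]
    for it in range(1, max_iter + 1):
        new = power_step(links, rank)
        delta = sum(abs(new[p] - rank[p]) for p in links)
        rank = new
        if delta < tol:
            return rank, it
        history = (history + [rank])[-3:]
        if it in extrapolate_at and len(history) == 3:
            # Jump ahead with the extrapolated guess, then keep iterating.
            rank = aitken_extrapolate(*history)
            history = [rank]
    return rank, max_iter
```

Calling `pagerank(links)` and `pagerank(links, extrapolate_at=(10,))` on the same graph should give the same ranks to within the tolerance. Whether the second call needs fewer iterations depends on the graph, so no speed-up is claimed for this toy version.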
A second paper describes an enhancement, called “BlockRank,” which relies on a feature of the Web’s link structure, a feature that the Stanford team is among the first to investigate and exploit. Namely, they show that approximately 80 percent of the pages on any given Web site point to other pages on the same site. As a result, they can compute many single-site PageRanks, glue them together in an appropriate manner and use that as a starting point for the original PageRank algorithm. With this technique, they can realistically speed up the PageRank computation by 300 percent.
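The "compute per-site ranks, glue them together, then run the original algorithm" recipe can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: pages are named like `site/page`, the hypothetical helper `site_of` recovers the site from that name, and each site is weighted by a plain PageRank over a site-level graph in which every cross-site link counts equally. The paper's actual weighting may differ.

```python
from collections import defaultdict


def power_iterate(links, start=None, damping=0.85, tol=1e-10, max_iter=1000):
    """Plain PageRank power iteration; returns (ranks, iterations used)."""
    n = len(links)
    rank = dict(start) if start else {p: 1.0 / n for p in links}
    for it in range(1, max_iter + 1):
        # Rank on pages without outlinks is spread evenly over all pages.
        dangling = sum(rank[p] for p, outs in links.items() if not outs)
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in links}
        for page, outs in links.items():
            for target in outs:
                new[target] += damping * rank[page] / len(outs)
        delta = sum(abs(new[p] - rank[p]) for p in links)
        rank = new
        if delta < tol:
            return rank, it
    return rank, max_iter


def site_of(page):
    """Assumed naming scheme: 'site/page' (illustrative only)."""
    return page.split("/")[0]


def blockrank_start(links):
    """Build a starting vector from per-site ranks and a site-level rank."""
    # 1. Local PageRank inside each site, ignoring cross-site links.
    local_links = defaultdict(dict)
    for page, outs in links.items():
        site = site_of(page)
        local_links[site][page] = [q for q in outs if site_of(q) == site]
    local = {site: power_iterate(graph)[0] for site, graph in local_links.items()}
    # 2. Site-level graph: one entry per cross-site link (repeats act as weights).
    site_links = {site: [] for site in local_links}
    for page, outs in links.items():
        for q in outs:
            if site_of(q) != site_of(page):
                site_links[site_of(page)].append(site_of(q))
    site_rank = power_iterate(site_links)[0]
    # 3. Glue: scale each page's local rank by the rank of its site.
    return {p: local[site_of(p)][p] * site_rank[site_of(p)] for p in links}
```

Passing the glued vector as `start` to `power_iterate(links, start=blockrank_start(links))` then runs the ordinary algorithm from a point that is hopefully already near the answer. The final ranks are the same as from a uniform start, since only the starting point changes.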
Finally, the team notes in a third paper that the rankings for some pages are calculated early in the PageRank process, while the rankings of many highly rated pages take much longer to compute. In a method called “Adaptive PageRank,” they eliminate redundant computations associated with those pages whose PageRanks finish early. This speeds up the PageRank computation by up to 50 percent.
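As a rough illustration of skipping work for pages that finish early, the Python sketch below stops updating a page once its rank changes by less than a threshold between iterations. The function name `adaptive_pagerank` and the parameters `freeze_tol` and `warmup` are assumptions made for this example, not names from the paper. The warm-up guard is there because, on a small graph, a page can happen to keep its starting value for an iteration or two without having converged.

```python
def adaptive_pagerank(links, damping=0.85, tol=1e-10, freeze_tol=1e-9,
                      warmup=5, max_iter=1000):
    """PageRank that stops recomputing pages whose rank has settled.

    links: {page: [pages it links to]}. Returns (ranks, page updates done).
    """
    n = len(links)
    # Pull-based update needs, for each page, the pages that link to it.
    incoming = {p: [] for p in links}
    for page, outs in links.items():
        for target in outs:
            incoming[target].append(page)
    out_degree = {p: len(outs) for p, outs in links.items()}

    rank = {p: 1.0 / n for p in links}
    active = set(links)      # pages still being recomputed
    updates = 0
    for it in range(1, max_iter + 1):
        dangling = sum(rank[p] for p in links if out_degree[p] == 0)
        base = (1.0 - damping) / n + damping * dangling / n
        new = dict(rank)     # frozen pages simply keep their old value
        delta = 0.0
        for p in list(active):
            value = base + damping * sum(rank[q] / out_degree[q]
                                         for q in incoming[p])
            change = abs(value - rank[p])
            delta += change
            new[p] = value
            updates += 1
            # Freeze pages that have stopped moving (after the warm-up).
            if it > warmup and change < freeze_tol:
                active.discard(p)
        rank = new
        if delta < tol or not active:
            break
    return rank, updates
```

Setting `freeze_tol=0.0` turns freezing off and gives the ordinary iteration, which is handy for comparison. Because frozen pages are no longer corrected, the adaptive result is an approximation whose error depends on `freeze_tol`.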
“Further speed-ups are possible when we use all these methods,” Kamvar said. “Our preliminary experiments show that combining the methods will make the computation of PageRank up to a factor of five faster. However, there are still several issues to be solved. We’re closer to a topic-based PageRank than to a personalized ranking.”
Thanks also to Rahul Dave for telling me about these ideas a little while ago.
This could be interesting for BlogStreet, as we look at creating searches within a neighbourhood. It is something I talk about in the Memex series.