Clean Room Implementation of Google Page Rank Algorithm
By Angsuman Chakraborty, Gaea News Network
Thursday, August 17, 2006
Finally, a clean-room implementation of Google’s Page Rank algorithm in Java, reverse-engineered from the company’s numerous comments on Page Rank (or is it Pigeon Rank?).
public static int getPageRank(String url) {
// start off with a random low PR
int pageRank = rand.getInt(0, 3);
if ( isHostedOn("google.com", url) ) {
pageRank++;
} else if ( isHostedOn("microsoft.com", url) ) {
pageRank--;
}
// Support valid pages
if (isValidPage(url) ) {
pageRank += 1;
}
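// needs java.util.Map and java.util.HashMap
Map<String, Integer> tagValue = new HashMap<String, Integer>();
tagValue.put("b", 1);
tagValue.put("h2", 2);
tagValue.put("h1", 3);
tagValue.put("strong", -1); // W3C sux!
pageRank = calculateTagsPR(tagValue, pageRank);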
// Sergey said good news sites have
// lots of nested tables
int tablesOnPage = getTagCount("table");
if (tablesOnPage >= 50) {
pageRank += 2;
}
if (pageRank >= 5) {
pageRank = 4; // helps selling AdWords
}
if (linksFrom("mattcutts.com", url) >= 4) {
// I link to "clean" sites only
// -- Matt, Feb 2006
pageRank += 2;
}
pageRank += countBacklinks(url) / 10000;
String[] blacklist1 = getList("c:\\chinese-government-censored.txt");
String[] blacklist2 = getList("c:\\larry-page-hatelist.txt");
if ( inArray(blacklist1, url) || inArray(blacklist2, url) ) {
pageRank = 0;
}
int d = dashesInUrl(url);
pageRank = (d >= 3) ? pageRank - 1 : pageRank + 1;
if (inString(url, "how to build a bomb")) {
// added on request. 2004-12-01.
String recipient = "peter@homelandsecurity.gov";
String subject = "You might wanna check this...";
sendMailTo(recipient, subject, url);
// page might still be relevant
pageRank++;
}
if (month() == "June" || month() == "October") {
// makes people talk about
// PR updates, good publicity
pageRank -= randomNumber(1, 3);
}
if (checkIdenticalPageAndLinkColor(url)) {
// spammer!! Googleaxe it!!
pageRank = 0;
}
if (url == "https://www.nytimes.com") {
// just testing, pls remove tomorrow
// -- Frank, June 2003
pageRank = 10;
}
// Don't show PR above 10
if (pageRank > 10) pageRank = 10;
return pageRank;
}
Adapted to Java, with normalization and other additions, from an idea and original code by Jack Tang.
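The normalization step at the end of the listing only caps values above 10, while several of the earlier adjustments can push the score below zero. A minimal sketch of clamping at both ends, assuming the displayed rank should stay in the 0 to 10 range; the class name `PageRankClamp` and the method `clamp` are made up for illustration and are not part of the original code:

```java
public class PageRankClamp {
    // Hypothetical helper: keeps a score inside the assumed 0..10 display range.
    public static int clamp(int pageRank) {
        return Math.max(0, Math.min(10, pageRank));
    }

    public static void main(String[] args) {
        System.out.println(clamp(13)); // 10
        System.out.println(clamp(-2)); // 0
        System.out.println(clamp(4));  // 4
    }
}
```

Calling `clamp(pageRank)` just before the `return` would replace the single upper-bound check.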
Discussion
LLama
March 19, 2009: 12:42 pm
Any string comparison done in Java with == will make the expression false (if you use Strings defined in your source code).
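The point about == can be illustrated with a small sketch. In Java, == on two String references checks whether they are the same object, not whether they hold the same characters, so the result depends on where the strings came from; `equals` compares the contents. The class name `StringComparisonDemo` and the helper `isSameUrl` below are hypothetical, chosen only for this example:

```java
public class StringComparisonDemo {
    // Hypothetical helper: compares URLs by content rather than by reference.
    // Calling equals on the expected value keeps it safe when url is null.
    public static boolean isSameUrl(String url, String expected) {
        return expected.equals(url);
    }

    public static void main(String[] args) {
        String literal = "https://www.nytimes.com";
        // new String(...) creates a distinct object holding the same characters,
        // much like a URL that arrives at runtime.
        String runtime = new String("https://www.nytimes.com");

        System.out.println(literal == runtime);          // false: different objects
        System.out.println(literal.equals(runtime));     // true: same characters
        System.out.println(isSameUrl(runtime, literal)); // true
    }
}
```

Under that reading, the `url == "https://www.nytimes.com"` and `month() == "June"` checks in the listing would be written with `equals` if the intent is to compare text.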