View Full Version : 37 PR10 Pages Drop to PR9's


bobmutch
10-07-2004, 05:37 PM
37 PR10 Pages Drop to PR9's
When there is a toolbar PR update, lots of things change. The search engine forums discuss these changes and post their findings and theories, and the SEO community looks over all the changes and theorizes about why the different things have happened.
I keep a list of all the PR 10 Pages (http://www.seocompany.ca/pagerank/pr-10-pages.php) and Page Rank 10 Sites (http://www.seocompany.ca/pagerank/page-rank-10-sites.php) on the internet. This Oct 6th PR update saw 37 out of 152 PR10 pages drop to PR9's. This post takes a look at one of the theories why this has happened.

The following are the PR10s that moved to PR9s for each company: Adobe 4/51 pages, Apple 7/27 pages, Google 7/35 pages, Microsoft 5/6 pages, w3.org 2/4 pages and nsf.org 1/2 pages. Besides these pages, there were 11/28 sites that had their only PR10 page drop to a PR9.

One theory why this has happened is that real PageRank has a maximum value of the number of pages in Google's index. As the index gets bigger the real PageRank number increases.


Markus Sobek, in his article A Survey of Google’s PageRank (http://www.miswebdesign.com/resources/articles/pagerank-3.html), theorizes that the real PageRank has a maximum value of dN+(1-d), where N is the total number of web pages in the index (d is usually set to 0.85), and that the real PageRank is scaled to be displayed on the Google toolbar. It is generally accepted that this scaling is logarithmic.

The main reason it is assumed that the toolbar scale is logarithmic is that, as you go up the scale, it takes many more links to get a PR4 or PR5 than it takes to get a PR3. The small number of pages that reach PR10 shows this also.


When the real PageRank numbers increase, the range of each of the toolbar scale units moves up. When this happens the lower range of real PageRank in each toolbar unit (except the toolbar PR1 as it covers the very bottom of the real PageRank numbers) is then covered by the toolbar PR value below. Some hold that this is why 37 PR10's slipped to PR9's in this toolbar update. (The below graphic is not to scale but gives you a visual of the theory of what has happened.)

http://www.seocompany.ca/images/new-old-scale.gif
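
To make the theory concrete, here is a minimal sketch of a logarithmic toolbar scale whose top end grows with the index. Everything in it is an assumption for illustration: Google has not published how real PR maps to toolbar PR, so the function `toolbar_pr`, the eleven equal-width log buckets, the index sizes and the example real PR of 5e8 are all hypothetical. Only the upper bound dN+(1-d) and d = 0.85 come from the discussion above.

```python
import math


def toolbar_pr(real_pr, n_pages, d=0.85):
    """Hypothetical mapping from real PR to a 0-10 toolbar value."""
    min_pr = 1 - d                  # lowest real PR a page can have
    max_pr = d * n_pages + (1 - d)  # upper bound quoted from Sobek's article
    # Position of real_pr on a log scale between min_pr and max_pr (0.0 to 1.0).
    frac = math.log(real_pr / min_pr) / math.log(max_pr / min_pr)
    # Assumption: eleven equal-width log buckets, numbered 0 through 10.
    return min(10, int(frac * 11))


# Same page, same real PR, before and after the index doubles in size.
page_real_pr = 5e8
before = toolbar_pr(page_real_pr, 4_000_000_000)
after = toolbar_pr(page_real_pr, 8_000_000_000)
print(before, after)
```

Under these made-up numbers the page shows as a toolbar 10 with the smaller index and a toolbar 9 with the larger one, even though nothing about the page itself changed. That is the effect the graphic is trying to show, not evidence that the real scale works this way.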

Research is still being done to find all the new PR10 pages and the new sites that have picked up PR10 pages. I will post my findings some time next week.

randfish
10-07-2004, 07:17 PM
Excellent information bob - I was shocked to see that yahoo.com dropped to a 9/10. It would be great to see how this information affects sites in the middle range (4,5,6) as well.

Marcia
10-07-2004, 08:03 PM
>>shocked to see that yahoo.com dropped to a 9/10

Not too shocking. ;)

>> SEO community looks over all the changes and theorizes about why the different things have happened.

This is about the least perplexing of all of them so far, thanks to those who explained it early on. I recall that there's been a "universal" downshifting of PR across the board that has happened periodically.

It's a bit harder for people to spot now than it used to be, with the lag time and TBPR mostly off (and obscured). But in the past, it was characterized by a massive outcry by webmasters that their PR dropped for no reason - and it was for no reason pertaining to their particular site.

Aside from that, right now there's some "highly educated" speculation that Google has actually changed the way the PR is being calculated. But so far there aren't findings and detailed data revealed, and there hasn't yet been anything published on it. At least not by one individual who's currently seriously researching it.

Mel
10-07-2004, 11:34 PM
...

One theory why this has happened is that real PageRank has a maximum value of the number of pages in Google's index. As the index gets bigger the real PageRank number increases.

Markus Sobek in his article A Survey of Google’s PageRank (http://www.miswebdesign.com/resources/articles/pagerank-3.html) theorizes that the real PageRank has a maximum value of dN+(1-d) where N is the total number of web pages in the index (d is usually set to 0.85), and that a real PageRank is a manual scalation that follows a logarithmical scheme.

When the real PageRank numbers increase, the range of each of the toolbar scale units moves up. When this happens the lower range of real PageRank in each toolbar unit (except the toolbar PR1 as it covers the very bottom of the real PageRank numbers) is then covered by the toolbar PR value below. Some hold that this is why 37 PR10's slipped to PR9's in this toolbar update. (The below graphic is not to scale but gives you a visual of the theory of what has happened.)

http://www.seocompany.ca/images/new-old-scale.gif

...
An interesting theory, but not quite spot on. Google has stated early on that the average value of all PR in its index is one.

While a quick shot in the dark will conclude that therefore the maximum PR value must be equal to the number of pages in the index, that scenario is just not possible.

If there were one page with a true PR equal to the number of pages in the index, then all the other pages would have to have a PR0 in order for the average to remain at one. We know that is not possible, so the top of the PR scale actually in use is far, far below that figure. The average PR still remains at one, but the uppermost value may well increase as pages are added to the index.

This may shift the boundaries of the graphic toolbar PR boxes somewhat, but IMO the first area to look at if you want to see why some pages' PR dropped is to compare the backlinks reported before the update with those after. If you see a drop in the number of links reported, IMO you do not have to look further.

The exaggeration of scale on your drawing may lead some to draw wrong conclusions. Do you have anything to support the idea that the width of the PR10 selection box changed relative to the others? Or that the width of each box is not the same?

As to True PR being "a manual scalation that follows a logarithmical scheme." that is simply not factual, unless you are not talking about True PR but about toolbar PR. Graph some calculations using the PR formula and you will soon see that they are not logarithmic.

Mel
10-07-2004, 11:40 PM
...

Aside from that, right now there's some "highly educated" speculation that Google has actually changed the way the PR is being calculated. But so far there aren't findings and detailed data revealed, and there hasn't yet been anything published on it. At least not by one individual who's currently seriously researching it.

Anything is possible, but it might do to remember that PageRank is not Google's to do with as they please; it is the patented property of Stanford University, and is leased to Google on an exclusive basis until a fixed date. If Google have changed the basic way of calculating PageRank, can they still call it PageRank?

bobmutch
10-08-2004, 12:55 PM
Mel:

>If there were one page with a true PR equal to the number of the pages in the index, then all the other pages would have to have a PR0 in order for the average to remain at one.

While I cheerfully agree that it is impossible to have all 5 or 6 billion pages point to one page, I don't see the relationship between the two. I see no reason that Google can't have a maximum value that can't be reached.

>so the top of the PR scale actually in use is far far below that figure

I would agree with that.

>but IMO the first area to look at if you want to see why some pages PR dropped is to compare the backlinks reported before the update with those after

There is no reliable way to do this. Google is not reporting all backlinks.

>The exaggeration of scale on your drawing may lead some to draw wrong conclusions.

The graphic notes that it is not to scale.

>Do you have anything to support the idea that the width of the PR10 selection box changed relative to the others? Or that the width of each box is not the same?

Yes, it is generally accepted that the toolbar scale is logarithmic.

>As to True PR being "a manual scalation that follows a logarithmical scheme." that is simply not factual, unless you are not talking about True PR but about toolbar PR.

That was not worded very clearly; I have changed it.

>Graph some calculations using the PR formula and you will soon see that they are not logarithmic.

The PR formula gives real PR, and that is not logarithmic. As I have stated a number of times in our discussions, it is the toolbar-displayed PR scale that is logarithmic.

Mel
10-08-2004, 11:40 PM
Mel: While I cheerfully agree that it is impossible to have all 5 or 6 billion pages point to one page, I don't see the relationship between the two.

Sorry for your confusion Bob:
If the average value of all PR across the web is one as the PR papers tell us, and we assume there are perhaps 6 billion pages in the Google index, then if one page has a PR of 6 billion, the average of all the other 5,999,999,999 pages has to be zero in order that the average remains one.


I see no reason that Google can't have a maximum value that can't be reached.

It is not an arbitrary value picked by someone, it is a figure calculated according to precise formula which is well known and thus it is impossible that the maximum figure cannot be reached, as by definition one page at least has to have reached that figure.

There is no reliable way to do this. Google is not reporting all backlinks.

Yes, but you can get a fair approximation by using Yahoo's reported backlinks, as they are indexing for the most part the same HTML pages (which excludes things like .pdf, .swf and images) and have roughly the same size index. It may not be precise but it can be indicative to the degree that you can see some trends.

bobmutch
10-09-2004, 09:21 AM
>Sorry for your confusion Bob:
>If the average value of all PR across the web is one as the PR papers tell us, and we assume there are perhaps 6 billion pages in the Google index, then if one page has a PR of 6 billion, the average of all the other 5,999,999,999 pages has to be zero in order that the average remains one.

There is no confusion on my side Mel, so no reason to be sorry for something that is not there. It is impossible that one page have a real PR of 6 billion. You can't link 6 billion pages to one page. So that is a moot point and means nothing. The premise is impossible, and your conclusion not only makes no sense but is based on an impossible premise. You have yet to show why the real PR range can't be from 0.15 to N. (N = total number of pages in the index.)

>It is not an arbitrary value picked by someone, it is a figure calculated according to precise formula which is well known and thus it is impossible that the maximum figure cannot be reached, as by definition one page at least has to have reached that figure.

I was not discussing the PR algo. The PR algo is for calculating the voted PR from one page to another.

>Yes, but you can get a fair approximation by using Yahoo's reported backlinks, as they are indexing for the most part the same HTML pages (which excludes things like .pdf, .swf and images) and have roughly the same size index. It may not be precise but it can be indicative to the degree that you can see some trends.

While Yahoo doesn't always show the right number, I would agree with that to some degree.

Are you telling me that you have the Yahoo [pre-June 22] link command numbers, that you compared them with the post-Oct 6th Yahoo link command numbers for the pages that dropped from PR10 to PR9, and that you found them to have fewer links overall? Where did you get your link command numbers for [pre-June 22]?

Please present the numbers and your source. And make sure you are not comparing linkdomain numbers with link numbers. If you have those numbers for the PR10 pages that didn't change, I would like to see them also. I don't have access to the [pre-June 22] numbers so I can't check that. I did look at the Google link numbers, which really mean nothing as Google is not reporting all backlinks, and found the following.
Movement: 84.35% Same; 10.43% Down; 5.22% Up; That is for my list of 115 PR10's that didn't change.
I would like you to post your source of [pre-June 22] numbers for the Yahoo "link" command so I can check this out.

Mel
10-09-2004, 12:38 PM
I was not discussing the PR algo. The PR algo is for calculating the voted PR from one page to another

LOL Bob, how do pages get PR if not by the PR algo???

There is no confusion on my side Mel so no reason to be sorry for something that is not there. It is impossible that one page have a real PR of 6 billion. You can't link 6 billion pages to one page. So that is a moot point and means nothing. It is impossible. The premise is impossible and your conclusion not only makes no sense, but is based on an impossible premise. You have yet to show why the real PR range can't be from 0.15 to N. (N = total number of pages in the index.)

LOL from your quoted post above:
It is impossible that one page have a real PR of 6 billion... (that was assuming that there were 6 billion pages in the index)
You have yet to show why the real PR range can't be from 0.15 to N. (N = total number of pages in the index.)

Don't you see the problem there? If the highest ranked page cannot have a PR equal to the number of pages in the index, then how can the PR range from 0.15 to the number of pages in the index???

bobmutch
10-09-2004, 12:44 PM
Mel:

>Don't you see the problem there? If the highest ranked page cannot have a PR equal to the number of pages in the index, then how can the PR range from 0.15 to the number of pages in the index???

No, I don't see a problem with that at all. I see no problem with there being in the real PR a top range that is never reached. That would mean that the toolbar PR10 range would be from N-x to N, with x being the range of the real PR that the toolbar PR10 represents, and that no PR10 would reach the top part of that range. I see no problem with that at all.

Mel
10-09-2004, 01:36 PM
The PR numbers aren't picked out of a box, or determined by Larry Page, they are generated by the PR algo which only takes into account the pages which are in the Google index.

bobmutch
10-09-2004, 03:19 PM
Mel: Sorry Mel, we are not going anywhere here. This is turning more into a game than a profitable discussion. I don't think this discussion with you is going anywhere, nor do I think it is profitable. In light of that I am not interested in carrying this conversation any further with you on a public forum. If you feel a need to discuss it further, I believe it will be in the best interest of this forum to go PM.

bobmutch
10-09-2004, 05:17 PM
Everyone: Just some more stats on the PR10 pages that dropped to PR9's and the PR10 pages that didn't drop.
My Yahoo numbers were incomplete, and the ones I did have were obtained using linkdomain rather than link, which gives you the links for the whole domain rather than for a page. So I don't have Yahoo stats, but I will report on the Yahoo stats at the next toolbar PR update in Jan 2005.
Google Movement: 84.35% Same; 10.43% Down; 5.22% Up; This is for my list of 115 PR10's that didn't change.
Google Movement: 9.10% Same; 45.45% Down; 45.45% Up; This is for my list of 11 PR10 Sites that did change.

At the request of one of the mods I have put links to my PR 10 Pages and my PR 10 Sites lists in the first post of this thread. So if you want to take a peek at the PR 10 Pages and the Page Rank 10 Sites lists, go to the top of this thread and the links are there.

Mel
10-09-2004, 10:26 PM
Mel: Sorry Mel we are not going anywhere here. This is turning more into a game than a profitable discussion. I don't think this discussion with you is going anywhere nor do I think it is profitable. In light of that I am not interested in carrying this conversation any further on a public forum. If you feel a need to discuss it further I believe it will be in the best interest of this forum to go PM.

LOL Bob, if you think there is nothing profitable in ensuring that unsubstantiated information is not left to stand without questioning it, then there is no purpose to forums.

I have no need to discuss it further. You are welcome to whatever ideas you have regarding the topic, so long as you don't try to disseminate them as facts, in which case I will comment on it.

NFFC
10-11-2004, 04:51 PM
Sometimes you trip over a post and kinda wish you hadn't!

Personally I think bobmutch makes some good points.

For those of us with a simple mind, not you of course Mel, I find this model of PageRank both easier to understand and from an SEO point of view more rewarding.

Take any user, stick them in front of a computer and let them click links in whatever random way they choose. The probability that they will end up at any of your pages/sites is your real PR score.
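
The random-clicker picture can be sketched as a small simulation. This is only an illustration under stated assumptions: the three-page site in `LINKS` is made up, `simulate_surfer` is a hypothetical helper, and the surfer is assumed to follow a link with probability d = 0.85 and otherwise jump to a page picked at random, which is the usual reading of the random surfer model rather than anything confirmed about Google's implementation.

```python
import random

# Hypothetical three-page site: each page maps to the pages it links to.
LINKS = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
}


def simulate_surfer(links, steps=100_000, d=0.85, seed=0):
    """Return the share of visits each page receives from a random surfer."""
    rng = random.Random(seed)
    pages = sorted(links)
    visits = dict.fromkeys(pages, 0)
    page = pages[0]
    for _ in range(steps):
        visits[page] += 1
        if links[page] and rng.random() < d:
            page = rng.choice(links[page])  # click a random link on the page
        else:
            page = rng.choice(pages)        # get bored and jump anywhere
    return {p: count / steps for p, count in visits.items()}


shares = simulate_surfer(LINKS)
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.3f}")
```

The visit shares add up to 1, so in this picture a page's score is a slice of a fixed pie: add more pages to the graph without adding links to yours, and your slice can only shrink.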

>One theory why this has happened is that real PageRank has a maximum value of the number of pages in Google's index. As the index gets bigger the real PageRank number increases.

I sort of agree, can we say that if the number of pages in the index increases at a faster rate than your inbound links then your PR will fall?

Mel
10-11-2004, 07:43 PM
Yes, it's an interesting theory; too bad it's not based on fact.

bobmutch
10-11-2004, 08:05 PM
Everyone: I am getting my list together for new PR10 sites and pages. Google picked up 12 PR10 pages, NSF 2 PR10 pages and w3.org 1 PR10 page. So 37 PR10 pages dropped down to PR9's and there are 15 new PR10 pages. I am expecting to find a few more, but this is all that my first scan came up with.
Sites:
Google Store Worldwide (http://www.google-store.com)
Pages:
Google Search Appliance - Contact (http://services.google.com/appliance/request_info/site)
Google Search Appliance (http://www.google.com/appliance)
Google Search Appliance - Customers (http://www.google.com/appliance/customers.html)
Google Job Opportunities - Balance (http://www.google.com/jobs/balance.html)
Google Job Opportunities - Benefits (http://www.google.com/jobs/benefits.html)
Google Job Opportunities - Culture (http://www.google.com/jobs/culture.html)
Google Job Opportunities - Inside (http://www.google.com/jobs/inside.html)
Google Job Opportunities - Positions (http://www.google.com/jobs/positions.html)
Google Job Opportunities - Reasons (http://www.google.com/jobs/reasons.html)
Google Wireless Service (http://www.google.com/wireless)
Google Store Worldwide Privacy Policy (http://www.google-store.com/privacy.php)
Google Store Privacy Policy (http://www.googlestore.com/privacy.asp)
NSF - News (http://www.nsf.gov/home/menus/news.htm)
NSF - Search (http://www.nsf.gov/home/search.htm)
W3C Software License (http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231)

Marcia
10-12-2004, 01:16 AM
OBSERVABLE FACT: Of course it may be an inane assumption based on imagination, but it seems that as the size of the index has increased we've seen a periodic across-the-board decrease in PR. It has happened a number of times.

This is, of course, based on historical Google behavior and observation, and the fact that it's been substantiated by considerable empirical evidence that it's happened - including the fact that Yahoo's PR has gone down to PR9 before.

Unfortunately there are a lot of webmasters who get alarmed when they lose some PR or theirs doesn't go up like they expected. Knowing that it's been an across-the-board thing at least explains to them that it's not because of something that's wrong with their site. That's really all that matters from their particular perspective.

NFFC
10-12-2004, 01:07 PM
>too bad its not based on fact

Hmm, you have a link to the "facts"?

bobmutch
10-12-2004, 01:36 PM
Marcia: I think that is a good way to look at it. I have put forth this view as a theory (i.e. what the real PR range is), and I have quoted from Markus Sobek, as he holds the same theory, to give my arguments some weight. Really there is no way to prove or disprove this theory on the range of real PR, as Google doesn't tell us. We have seen 37 pages drop down to PR9's, and this thread explains one of the theories as to why this has happened.

We had the same thing happen March 16th when 24 PR10 pages dropped to PR9s.

hinote
10-13-2004, 08:11 AM
On a general note, does anyone have any stats on what % of websites are PR4, PR5, and so on up to PR10?
Or where would one go to look for this?

bobmutch
10-13-2004, 01:52 PM
hinote: I have done lots of looking for any information on the internet on the number of pages at each PR and found nothing.

Let me know if you ever come up with something.

JanT
10-22-2004, 04:19 PM
Mel: Don't you see the problem there? If the highest ranked page cannot have a PR equal to the number of pages in the index, then how can the PR range from 0.15 to the number of pages in the index???

bobmutch: No, I don't see a problem with that at all. I see no problem with there being in the real PR a top range that is never reached. That would mean that the toolbar PR10 range would be from N-x to N, with x being the range of the real PR that the toolbar PR10 represents, and that no PR10 would reach the top part of that range. I see no problem with that at all.


To clarify this, let's have a look at the algorithm:

PR(A) = (1-d) + d * ( PR(P2)/C(P2) + ... + PR(Pn)/C(Pn) ), with d=0.85
PR(A) is the PageRank of the page we're interested in, PR(P2)..PR(Pn) the PageRank of each other page, C(P) the number of links on page P, and n the number of pages in Google

Consequence:

The minimum PageRank is (1-0.85) = 0.15.

The maximum PageRank is (1-0.85) + 0.85 * ( Sum of all other PageRank values ) = 0.15 + 0.85 * n. This requires the whole internet to be pages with only links pointing to the same single page A! Problem: if there are no links to any other page... how does Google know about them? Ok, they may all have been submitted directly, but that's not realistic ;-).

My assumption:

If the idea of PageRank is based on the idea of the random surfer model, the PageRank should correlate to the probability that a user visits the Page. So PR(A) ~ pageview(A) / pageviews of all pages on the net. And this is a value that is much easier to guess: Alexa tells us, that yahoo.com is responsible for about 8% of all pageviews. I think the mainpage of yahoo.com will be visited about every 100 visits of a user using yahoo (???), so the PageRank of Yahoo's mainpage should be n * (8/100) / 100 ~ 3.43 million. As yahoo.com has about 2.9 million backlinks this sounds quite good as the average pagerank of a page should be 1.

Regards,
Jan

NFFC
10-22-2004, 04:40 PM
Welcome aboard Jan, good to see you here.

>To carify this, let's have a look at the

Patents are good sources too.

"The rank assigned to a document is calculated from the ranks of documents citing it. In addition, the rank of a document is calculated from a constant representing the probability that a browser through the database will randomly jump to the document."

Patent here (http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=%2Fnetahtml%2Fsearch-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN%2F6285999)

bobmutch
10-22-2004, 05:25 PM
JanT:

>To clarify this, let's have a look at the algorithm

I may have missed it, but I don't see where you addressed the issue you were going to clarify, which was what the range of real PR is.

>The maximum PageRank is (1-0.85) + 0.85 * ( Sum of all other PageRank values ) = 0.15 + 0.85 * n.

The maximum PageRank that is possible would be all pages in the Google index using a looped linking structure, then each of those pages having a link to a Main page. To simplify the equation we can use a structure that will produce the same results but is easier to work out: every page linking to a Main page with a return link. Of course you can't have 6 billion links on one page, but this equation will be simple to form.

>The maximum PageRank is (1-0.85) + 0.85 * ( Sum of all other PageRank values ) = 0.15 + 0.85 * n.

I think you are off on the equation for maximum PR.

Working off the following formula:
PR = 0.15 + 0.85(PRa / La + PRb / Lb + ....)
Put the values in:
PR = 0.15 + 0.85(pr1 / 1 + pr2 / 1 + pr3 / 1 + pr4 / 1 + ..... + prN / 1)
simplify the equation (r2d2 produced the following simplified equation):
PR= (0.15N + (0.15/0.85)) / ((1/0.85)-0.85)
So let's say N is 5,880,000,000 (the number of times "the" is in the Google index); the maximum real PR would then be 2,701,621,622
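
As a quick arithmetic check on that figure, here is a small sketch that evaluates the simplified equation and compares it with solving the two underlying equations directly. It assumes the structure described above (N pages that each link only to the Main page, and a Main page that links back to all N), so the Main page satisfies PR = 0.15 + 0.85 * N * pr and each linking page satisfies pr = 0.15 + 0.85 * PR / N. The function names are made up for this example.

```python
import math

D = 0.85  # damping factor used throughout the thread


def hub_pr_closed_form(n):
    """The simplified equation as posted, for n pages linking to the Main page."""
    return (0.15 * n + 0.15 / D) / (1 / D - D)


def hub_pr_by_substitution(n):
    """Solve PR = 0.15 + 0.85*n*pr and pr = 0.15 + 0.85*PR/n by substitution."""
    # Substituting pr into the first equation gives
    #   PR = 0.15 + 0.85*0.15*n + 0.85**2 * PR
    return (0.15 + D * 0.15 * n) / (1 - D * D)


n = 5_880_000_000
print(round(hub_pr_closed_form(n)))
print(math.isclose(hub_pr_closed_form(n), hub_pr_by_substitution(n), rel_tol=1e-9))
```

Both routes give the same value, and for N = 5,880,000,000 it rounds to the 2,701,621,622 quoted above. That only confirms the algebra for this one link structure; it says nothing about whether that structure gives the highest possible PR.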

>Problem: if there are no links to any other page... how does Google know about them? Ok, they may all have been submitted directly, but that's not realistic ;-).

No problem at all. Use a looped structure and then have all the looped pages link to one Main page.

>As yahoo.com has about 2.9 million backlinks this sounds quite good as the average pagerank of a page should be 1.

yahoo.com has 16,900,000 links to its home page and 18,000,000 in total to the domain. Where are you getting your 2.9 million backlinks number? Also, Yahoo's home page is a PR9, and it was known that it was not a strong PR10 pre-Oct 5th as it had no PR10 subpages. Adobe has the strongest PR10 site as they currently have 49 or more PR10 pages.

I am not sure what your conclusion was, and I am not sure you clarified anything.

Oh yes, welcome to the forum!

Mel
10-22-2004, 10:32 PM

I think that perhaps the problem is that Bob seems to assume that there is a limiting mechanism of some sort which says the maximum range of PR is thus and such. It seems to me clear that the maximum PR in use is the maximum that is actually assigned to the highest ranking page, not some theoretical limit based on an unknown formula.

JanT
10-22-2004, 10:45 PM
I may of missed it but I don't see where you addressed the issue you were going to clarify which was what is the range of real PR.
Ok, maybe my quote was not chosen correctly. I think this one is better:

Mel wrote:If the average value of all PR across the web is one as the PR papers tell us, and we assume there are perhaps 6 billion pages in the Google index, then if one page has a PR of 6 billion, the average of all the other 5,999,999,999 pages has to be zero in order that the average remains one.
bobmutch wrote: There is no confusion on my side Mel so no reason to be sorry for something that is not there. It is impossible that one page have a real PR of 6 billion. You can't link 6 billion pages to one page. So that is a moot point and means nothing. It is impossible. The premise is impossible and your conclusion not only makes no sense, but is based on an impossible premise. You have yet to show why the real PR range can't be from 0.15 to N. (N = total number of pages in the index.)


What I wanted to "clarify" is, that Mel is right because it's simple math and logic:
- If the average PR=1, the PageRank-sum of all N Pages will be N.
- If 1 page would have a PR of N, all other pages must have a PR of 0, because PR-of-this-single-page + number-of-other-pages * their-PR = sum-of-all-PRs, or with numbers: N + (N-1)*x = N is only possible with x=0 (given x>=0). That's all Mel tried to say.
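
The two bullet points can be checked with exact arithmetic. This is a small sketch, assuming (as in the thread) that the average PR is exactly 1 so the PRs of all N pages sum to N; the helper name `average_of_the_rest` and the 6 billion page count are only for illustration.

```python
from fractions import Fraction


def average_of_the_rest(n_pages, top_pr):
    """Average PR left for the other pages if the overall average is 1."""
    total = Fraction(n_pages)  # an average of 1 means the PRs sum to n_pages
    return (total - top_pr) / (n_pages - 1)


n = 6_000_000_000

# One page holding a PR of N leaves nothing at all for the other pages.
print(average_of_the_rest(n, n))

# One page at the bound 0.15 + 0.85*N leaves the others at 0.15 on average.
bound = Fraction(85, 100) * n + Fraction(15, 100)
print(average_of_the_rest(n, bound))
```

With a top page at N the rest average exactly 0, which is below the 0.15 minimum and so cannot happen. With a top page at 0.15 + 0.85*N the rest average exactly 0.15, meaning every other page would have to sit at the minimum.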

I tried to explain that without great calculations and only with the first look at the formula it's obvious that a PR cannot get over 0.85*N (for big N) and not from 0.15 up to N as you told... but anyhow.

Now to your formula. I don't know how r2d2 simplified the equation, maybe you should repair him ;-). You said, that the maximum PR will be (0.15N + (0.15/0.85)) / ((1/0.85)-0.85), but that's not right. I'll take your example (Using a Looped structure and then all the looped pages link to one Main page), only in smaller scale:

Consider the following 6 pages:
A links to B + X
B links to C + X
C links to D + X
D links to E + X
E links to A + X
X links to X
Now we have a loop (a->b->c->d->e->a) and all pages linking to the main page X. If you solve the system of equations you get
- A,B,C,D,E have a PR of about 0.261 (0.261 = 0.15 + 0.85 * (0.261/2))
- X has a PR of about 4.69 (4.69 = 0.15 + 0.85 * (5 * 0.261/2 + 4.69/1)).
Your formula for N=6 tells me that the maximum possible PageRank is: (0.15N + (0.15/0.85)) / ((1/0.85)-0.85) = (0.15*6 + 0.176) / 0.326 = 3.3! But page X got a PR of 4.69!? (My formula was 0.15+0.85*N=5.25)
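
For anyone who wants to reproduce the 0.261 and 4.69 figures, here is a minimal sketch that iterates the PR formula on the six pages above until the values settle. The `pagerank` helper, the starting value of 1.0 per page and the 200 iterations are choices made for this example, not anything taken from Google.

```python
D = 0.85  # damping factor

# The six-page example: a loop A->B->C->D->E->A, every page also links to X,
# and X links only to itself.
LINKS = {
    "A": ["B", "X"],
    "B": ["C", "X"],
    "C": ["D", "X"],
    "D": ["E", "X"],
    "E": ["A", "X"],
    "X": ["X"],
}


def pagerank(links, d=D, iterations=200):
    """Repeatedly apply PR(p) = (1-d) + d * sum(PR(q)/C(q)) over pages q linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            incoming = sum(
                pr[src] / len(outs) for src, outs in links.items() if page in outs
            )
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr


pr = pagerank(LINKS)
for page, value in pr.items():
    print(page, round(value, 3))
```

The loop pages settle at about 0.261 and X at about 4.696, and the six values add up to 6, so the average stays at 1.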

To my numbers: you're right, I don't know where I got the 2.9 million from... maybe I looked at the wrong row ;-(. But it doesn't really matter... the magnitude of the highest actual PageRank in Google will be around 1-50 million.

bobmutch
10-23-2004, 12:24 AM
JanT:

>What I wanted to "clarify" is, that Mel is right because it's simple math and logic:
>- If the average PR=1, the PageRank-sum of all N Pages will be N.
>- If 1 page would have a PR of N, all other pages must have a PR of 0, because PR-of-this-single-page + number-of-other-pages * their-PR = sum-of-all-PRs, or with numbers: N + (N-1)*x = N is only possible with x=0 (given x>=0). That's all Mel tried to say.

Well, I would agree with you that Mel is correct in the statement you quoted. But that was not the issue Mel and I were discussing, at least not in my mind. That was only a fact that he used to support his theory that the range of real PR can't be from 0.15 to N.

If you read back you will see that I have no problem with the simple concept that Mel stated that you quoted. That is that the average value of all pages is 1 and that if one page in an index of 6 billion pages had a real PR of 6 billion all the other pages would have to be zero.

What we did disagree on was the range of real PR. I have maintained, right or wrong, that the range of real PR is 0.15 to N. But while maintaining that position I have clearly noted that the upper range doesn't have any pages in it. I am not yet clear on what Mel maintains the range is. It appears to me that he may think it is some where around 0.15 to 30 million, or perhaps that we can't really know.
>I tried to explain that without great calculations and only with the first look at the formula it's obvious that a PR cannot get over 0.85*N (for big N) and not from 0.15 up to N as you told... but anyhow.

You may want to read the thread over again. I know that PR can't get over 0.85N. I have always maintained that.

>Now to your formula. I don't know how r2d2 simplified the equation, maybe you should repair him ;-). You said, that the maximum PR will be (0.15N + (0.15/0.85)) / ((1/0.85)-0.85), but that's not right.

I didn't say the maximum PR will be "(0.15N + (0.15/0.85)) / ((1/0.85)-0.85)". I did say that IF you have 5,880,000,000 pages, they are all linked to one main page, and the main page links back to them, the real PR of the main page would be (0.15N + (0.15/0.85)) / ((1/0.85)-0.85).
Let me show the formula and the process that was used to simplify the formula.
Starting with this formula:
PR = 0.15 + 0.85(PRa / La + PRb / Lb + ....)
where PRa is the PR of the first linking page and La is the number of links on that page. PRb and Lb refer to the second page that links to it, and so on for c etc.
So, if we have a system where there are N+1 pages on the internet, and every page links to one page, and that page links to all the others, the equations for the main page will look like this:
PR = 0.15 + 0.85(pr1 / 1 + pr2 / 1 + pr3 / 1 + pr4 / 1 + ..... + prN / 1)
where PR is the PR of the main page, and pr1, pr2, pr3, prN are the PRs of all the normal pages that link to the main page. They are divided by 1 to show there is only 1 link on each page. It should be clear that pr1, pr2, pr3, etc. will all be identical, so:
PR = 0.15 + 0.85(N x pr)
where pr is the PR of each linking page, and there are N pages linking to the main page (N linking pages, plus 1 receiving page = N + 1).
The PR of the N pages that link to the main page will be:
pr = 0.15 + 0.85(PR / N)
where pr is the PR of each linking page, PR is the PR of the main page, and N is the number of links the main page has on it.
Now, we have two equations involving two variables. We just need to rearrange the equations so we can add/subtract one from the other to cancel out 1 variable and find a solution for the other, so we can plug the solution back in to one of the equations, and get the solution for the other variable.
So, the equations we have are:
PR = 0.15 + 0.85(N x pr)
pr = 0.15 + 0.85(PR / N)
Let's take the first one:
PR = 0.15 + 0.85(N x pr)
and move the pr term from the right to the left:
PR - 0.85(N x pr) = 0.15
Now let's divide by 0.85:
PR / 0.85 - (N x pr) = 0.15 / 0.85
Now that's in a form we like. Let's take the second one:
pr = 0.15 + 0.85(PR / N)
and move the right hand term across, and divide both sides by -1:
0.85(PR / N) - pr = -0.15
Now let's multiply through by N:
0.85PR - N x pr = -0.15N
Excellent, now we have both
PR / 0.85 - N x pr = 0.15 / 0.85
0.85PR - N x pr = -0.15N
Let's do equation 2 minus equation 1:
0.85PR - PR / 0.85 - N x pr - (-N x pr) = -0.15N - 0.15 / 0.85
Cancel the N x pr terms and:
0.85PR - PR / 0.85 = -0.15N - 0.15 / 0.85
take out the factor PR on the left hand side:
PR(0.85 - 1 / 0.85) = -0.15N - 0.15 / 0.85
Divide both sides by (0.85 - 1 / 0.85) :
PR = - ( 0.15N + 0.15 / 0.85 ) / (0.85 - 1 / 0.85)
Swap the terms on the bottom to cancel the negative sign:
PR = (0.15N + 0.15 / 0.85 ) / (1 / 0.85 - 0.85)
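For anyone who would rather check the algebra numerically, here is a minimal Python sketch. It simply iterates the two equations above for a small N and compares the result with the simplified formula. The function names (star_pagerank, closed_form_hub) and the iteration count are my own choices for illustration, not anything from Google or the Brin/Page paper.

```python
def star_pagerank(n, d=0.85, iterations=200):
    """Iterate the two equations from the derivation for n linking pages.

    PR = (1 - d) + d * (n * pr)   # main page: n pages, one link each
    pr = (1 - d) + d * (PR / n)   # linking page: main page splits its vote n ways
    """
    hub, leaf = 1.0, 1.0  # arbitrary starting guess
    for _ in range(iterations):
        # Update both values at once from the previous round.
        hub, leaf = (1 - d) + d * n * leaf, (1 - d) + d * hub / n
    return hub, leaf


def closed_form_hub(n):
    """The simplified formula from the derivation, with 0.15 and 0.85 written out."""
    return (0.15 * n + 0.15 / 0.85) / (1 / 0.85 - 0.85)


for n in (1, 10, 1000):
    hub, leaf = star_pagerank(n)
    # Columns: N, iterated PR, formula PR, PR of each linking page.
    print(n, round(hub, 6), round(closed_form_hub(n), 6), round(leaf, 6))
```

In each case the iterated value and the formula should agree, and PR + N x pr should come out to N + 1, which is the "average PR is 1" point again.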

Do you still see any problems with the math?
If there is no problem with the math, then using 5,880,000,000 for the number of pages in the index (it is more, but I had to select some number), the maximum real PR that could be given to the main page in this model would be 2,701,621,622.
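The arithmetic for that figure can be reproduced with exact fractions, so no floating point rounding gets in the way. This is only a sketch of the formula above; the function name max_hub_pr is made up for the example, and d = 0.85 is the usual assumed damping value.

```python
from fractions import Fraction


def max_hub_pr(n, d=Fraction(85, 100)):
    """Main-page PR when n pages all link to it and it links back to each."""
    return ((1 - d) * n + (1 - d) / d) / (1 / d - d)


n = 5_880_000_000
pr = max_hub_pr(n)
print(round(pr))      # 2701621622
print(float(pr / n))  # share of N, a bit under one half
```

With d = 0.85 the formula reduces to (17N + 20) / 37, so in this model the main page ends up with a little under 46% of N, well below 0.85N.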

Now I am not saying that is going to happen. I am just noting that IF you linked all pages to a main page and the main page linked back, and IF the number of pages in the index was 5,880,000,000, then the main page would have a real PR of 2,701,621,622.

"Now we have a loop (a->b->c->d->e->a) and all pages linking to the main page X. If you solve the system of equations you get"

That is not what I said. If you link A>B>C>D>E>X and don't link X back to A or some other page, then X is an orphan and Google will not give it any PR. If you do a looped link structure on A>B>C>D>E>X>A, all pages will have a PR of 1. If you then add the links B>X, C>X, D>X, E>X, you are going to get a different number than from the equation that I posted. The equation that I posted was for all pages linking to X and X linking back, N>X, X>N.
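The two link structures just described can be tried on a six-page toy graph. The sketch below iterates the Brin/Page formula directly; the pagerank function and the dictionary layout are illustrative assumptions, and since E already links to X in the loop, the extra E>X link is treated as a duplicate and counted once.

```python
def pagerank(links, d=0.85, iterations=200):
    """Iterate PR = (1 - d) + d * sum(PR_in / L_in) over a small link graph.

    links maps each page to the list of pages it links to.
    """
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                # Each page splits its vote evenly across its outgoing links.
                new[target] += d * pr[page] / len(targets)
        pr = new
    return pr


# Plain loop A>B>C>D>E>X>A.
loop = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["E"], "E": ["X"], "X": ["A"]}

# Same loop plus B>X, C>X, D>X (E>X is already in the loop).
extra = {"A": ["B"], "B": ["C", "X"], "C": ["D", "X"],
         "D": ["E", "X"], "E": ["X"], "X": ["A"]}

loop_pr = pagerank(loop)
extra_pr = pagerank(extra)
print({page: round(value, 4) for page, value in loop_pr.items()})
print({page: round(value, 4) for page, value in extra_pr.items()})
```

The plain loop should leave every page at 1, while the extra links shift PR toward X. In both cases the total stays at 6, the number of pages.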

Further, I should note that the maximum PR in use is the maximum real PR that is voted to the highest ranking page, and this can't be affected by my theoretical maximum PR based on the Brin/Page PR formula. I thought the theoretical maximum formula was interesting, and that is why I posted it. I was just working out theoretically the link structure that would give a page the highest possible real PR (without producing orphans, of course) and then working out what the PR would be using the Brin/Page PR formula.

I hope this clarifies my position a bit better for you, JanT.

bobmutch
10-27-2004, 11:14 PM
Well, the final stats are in and down. There were 26 new PR10 pages produced in the Oct 5th toolbar PR update and 37 dropped.

Strangely, I found 12 redirected PR10 pages that were not recorded on my PR 10 Pages page. If you throttle down your connection to about 4KB/sec you will be able to see the PR of the original domain come up, and as the redirect starts to move the URL to the new domain the PR bar goes to PR0, and when the new domain comes up you get a PR10 again. If you do this on DSL or faster it is hard to see this.

The domains are as follows.
http://www.acrobat.com
http://www.allaire.com (http://www.allaire.com/)
http://www.bubel.com
http://www.dejanews.com/
http://www.glamourgals.net (http://www.glamourgals.net/)
http://www.livesoftware.com (http://www.livesoftware.com/)
http://www.prognet.com (http://www.prognet.com/)
http://www.quicktime.com (http://www.quicktime.com/)
http://www.sprinks.com (http://www.sprinks.com/)
http://www.valto.com (http://www.valto.com/)

My new PR 10 Pages (http://www.seocompany.ca/pagerank/pr-10-pages.php) list now contains 148 PR10 domains.