http:// what relevancy are they measuring in SERPs Rankings
Do a Google search for the keyword
http or http://
There are many theories as to what the SERP results are based on
and what they measure.
What are yours? :confused:
__________________________________________
http://www.webmasterbrain.com/prog/search?hl=en&ie=UTF-8&oe=UTF-8&q=http%3A%2F%2F&num=100
What are your opinions of the Google Pagerank SERPs Web site?
SEO Guy
06-09-2004, 07:38 PM
Ask a vague question you get a vague answer
Do a Google search for the keyword
http or http://
There are many theories as to what the SERP results are based on
and what they measure.
What are yours? :confused:
Not all that bad a question, IMO. We know that Google has several different processes which result in rankings, comprising at least the relevancy portion and the PR portion.
The relevancy of a term which all sites must have should therefore be equal, and thus the results should be listed in order of true PageRank. That is not strictly true in this case, which is why I think it is a good question, and one for which I have no answer. It would seem to indicate that there is a portion of the relevancy rankings which attaches different values to URLs.
Equally interesting, perhaps, is why a search for http only returns 535 million results if Google have 4.3 billion pages in their index, while a search for the word "the" returns 5.5 billion results and Google says it is now searching 4,285,199,774 pages???
Dodger
06-10-2004, 04:47 AM
...it would seem to indicate that there is a portion of the relevancy rankings which attach some different values to URLs.
Like authority sites maybe? (and no I won't mention Apple .... hehehe). Do you think that this might be the case here?
Equally interesting, perhaps, is why a search for http only returns 535 million results if Google have 4.3 billion pages in their index, while a search for the word "the" returns 5.5 billion results and Google says it is now searching 4,285,199,774 pages???
Have you ever run across those pages that are marked supplemental? Perhaps 20% of the Google index is supplemental results. The only way you can see them is by doing a site search (there may be other ways too), but they do not show up in normal results.
Some of these supplemental pages, or I should say a large majority of them, are doorway pages or possibly mirrors. Feasibly they do hang onto these pages, but they are just not used in the normal results.
Dodger
06-10-2004, 04:55 AM
Here is another cool thing to ponder. You will need IE5+ to view this at http://ranking.thumbshots.com/ which is a ranking tool that compares the top 100 results of two engines.
Try this query in the tool: search for http:// and www using just Google for both engines (really a comparison of terms in one engine). The results pretty much match up near the top and jumble more at the bottom; 85 of the top 100 are in both result sets.
I don't know what good this could be to know, but it is something interesting to think about.
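The comparison that tool makes can be approximated by hand. A minimal sketch, assuming you have already collected the two top-100 lists as plain lists of URLs (the function names and the sample URLs below are made up for illustration, not taken from the tool):

```python
def overlap(results_a, results_b):
    """Return the URLs that appear in both ranked lists, in results_a order."""
    in_b = set(results_b)
    return [url for url in results_a if url in in_b]


def rank_shifts(results_a, results_b):
    """Map each shared URL to (position in b) - (position in a), 1-based."""
    pos_b = {url: i for i, url in enumerate(results_b, start=1)}
    return {
        url: pos_b[url] - i
        for i, url in enumerate(results_a, start=1)
        if url in pos_b
    }


# Hypothetical sample data standing in for two result sets.
query_http = ["site-a.example", "site-b.example", "site-c.example"]
query_www = ["site-b.example", "site-a.example", "site-d.example"]

shared = overlap(query_http, query_www)
print(len(shared), "of", len(query_http), "results are in both sets")
print(rank_shifts(query_http, query_www))
```

A shift of 0 means the URL holds the same position for both queries; large shifts near the bottom of the list would match the "jumble" described above.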
SEO Guy
06-10-2004, 05:06 AM
Have you ever run across those pages that are marked supplemental? Perhaps 20% of the Google index is supplemental results. The only way you can see them is by doing a site search (there may be other ways too), but they do not show up in normal results.
Some of these supplemental pages, or I should say a large majority of them, are doorway pages or possibly mirrors. Feasibly they do hang onto these pages, but they are just not used in the normal results.
Actually this doesn't make sense when differentiating between "http" and "the": why would they report sites for "the" and decide not to count sites for "http"?
I actually have another theory and it comes back to link text
A search for the kw "www" reports www.google.com on the first page, yet google.com does not mention "www" anywhere on its page, so why does it rank? One of 2 reasons: #1, because the www in its URL is counted and it ranks on just that, or #2, people link to Google using www.google.com as the anchor text. The latter is the more likely answer.
Now that still has an uncontrolled variable because of the www in the URL, so let's go to a different term. (I can't find a site that ranks for "the" without mentioning it, so I will use the kw "is".) Check out http://www.anfyteam.com/ and let me know where you see the word "is" mentioned, because the site ranks top 20 for "is" out of 2 billion sites. Not bad for a site that doesn't even mention the term. Take a look at the allinanchor results, though, and you see them in the exact same position: http://www.google.com/search?sourceid=navclient&ie=UTF-8&oe=UTF-8&q=allinanchor%3Ais
OK SEO Guy, spit it out, what the hell is the point?
The point is sites rank because #1 they have links pointing to them with the kw in the anchor or #2 because the kw is mentioned on-page.
"the" shows up with more results than http because (I believe) more people either link to sites using the kw "the" in the anchor or use the word "the" on-page vs http.
Am I crazy?
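To make the two cases concrete, here is a toy sketch of the theory, not a description of how Google actually works. It assumes a page can match a term either through its own text or through the anchor text of links pointing at it; the function name and the sample page are hypothetical.

```python
def match_sources(term, page_text, inbound_anchors):
    """Return (on_page, in_anchor) flags for a single search term.

    Toy model only: whole-word, case-insensitive matching.
    """
    term = term.lower()
    on_page = term in page_text.lower().split()
    in_anchor = any(term in anchor.lower().split() for anchor in inbound_anchors)
    return on_page, in_anchor


# Hypothetical page that never uses the word "is" in its own text,
# but is linked to with anchor text that contains it.
page_text = "Java applets and special effects"
anchors = ["this site is enhanced", "cool applets"]

print(match_sources("is", page_text, anchors))
```

Under this model a page can show up for a term it never mentions, which is the pattern described above for the kw "is".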
Dodger
06-10-2004, 05:17 AM
Equally interesting, perhaps, is why a search for http only returns 535 million results if Google have 4.3 billion pages in their index, while a search for the word "the" returns 5.5 billion results and Google says it is now searching 4,285,199,774 pages???
I think I figured the http:// out. It is because that term appears in links pointing to the pages in the results, as in straight URL links (no text). For example, http://www.microsoft.com where they used the URL for the anchor text.
There are 3.8 million pages that contain the term www.microsoft.com (http://www.google.com/search?num=50&hl=en&lr=&ie=UTF-8&c2coff=1&q=%22%2Bwww.microsoft.%2Bcom/%22). How much you wanna bet a majority of those are pure links?
SEO Guy
06-10-2004, 05:23 AM
I think you just supported my point, but meh, what do I know lol. I'm CRAZY! muahahaha :eek: OK SEO Guy, go to bed now, you're talking to yourself again.
Night!
Anthony Parsons
06-10-2004, 05:32 AM
Wack Wack, Casavac, You're Out ! !
Dodger
06-10-2004, 05:35 AM
Take a look at the allinanchor results, though, and you see them in the exact same position: http://www.google.com/search?sourceid=navclient&ie=UTF-8&oe=UTF-8&q=allinanchor%3Ais
That is because of a little button that is on the sites that they do. It carries the text "This site is Anfy Enhanced" in the ALT attribute of the img element. Here is one site (http://www.lehigh.edu/~mhm4/) as an example. Look at the bottom of the page. I guess this guy is pretty well known around Europe too.
"the" shows up with more results than http because (I believe) more people either link to sites using the kw "the" in the anchor or use the word "the" on-page vs http.
I think the "is" example kind of supports that theory. But what gets me is why they consider it in anchor text for results, but not in normal queries. What is the difference whether it is in an anchor or not? Standard practice (or so some say) is to not use stop words if you can help it. This goes against that logic in the case of anchor text -- it is considered by Google anyway (so it would seem).
Am I crazy?
Yep. But you are going to have to get in line though dude -- we are stocked up on crazy here. :eek:
Yep so why does Google return 5.5 billion results for the word "the" when they say they are only searching 4.3 billion pages?
Are you saying there are 2.2 billion additional pages in the supplemental index? If so, and if Google searches them routinely (??), then why don't they count them as being in the pages they search?
Or is this just Google messin' with our heads again?
Dodger
06-10-2004, 06:42 AM
Yep so why does Google return 5.5 billion results for the word "the" when they say they are only searching 4.3 billion pages?
Are you saying there are 2.2 billion additional pages in the supplemental index? If so, and if Google searches them routinely (??), then why don't they count them as being in the pages they search?
Or is this just Google messin' with our heads again?
That would be 1.2 billion supplemental pages (roughly 20%). I don't see that as unreasonable, do you? Let's look beyond just doorways (probably not too many "the"s on them anyway): I would imagine there are penalized pages to consider, mirrors, and possibly some dynamic pages (such as forum pages with multiple URLs).
Then again ... they could just be messing with our heads.
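For anyone who wants to check the arithmetic, a quick sketch using only the two figures quoted in this thread (the 5.5 billion is the rounded result count; the variable names are just for illustration):

```python
reported_results = 5_500_000_000  # results reported for the word "the" (rounded)
claimed_index = 4_285_199_774     # pages Google says it is searching

# Pages reported beyond the claimed index size.
extra = reported_results - claimed_index

# The same gap as a share of all reported results.
share_of_results = extra / reported_results

print(f"{extra:,} extra pages")
print(f"{share_of_results:.1%} of the reported results")
```

So the gap is about 1.2 billion, not 2.2 billion, and it works out to a bit over a fifth of the reported total.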
...
The point is sites rank because #1 they have links pointing to them with the kw in the anchor or #2 because the kw is mentioned on-page.
"the" shows up with more results than http because (I believe) more people either link to sites using the kw "the" in the anchor or use the word "the" on-page vs http.
Am I crazy?
As SEO Guy points out, a simple explanation could be that some of these sites use http on the page too, and thus get on-page credit for that; additionally, they will all have a varying number of inbound links using the term http. So that part is plausible.
But my other question was not why there are more results for "the" than for http, but why there are more results than pages they claim to index?
Does the fact that they have a few billion pages in the index but only report 535 million mean that they do not count terms in the URL?
Dodger
What I see in the supplemental results is not duplicate or doorway pages, but ordinary pages from ordinary sites that have some of their pages in the main index and some in the supplemental index.
Dodger
06-10-2004, 07:24 AM
... why are there more results than pages they claim to index?
Does the fact that they have a few billion pages in the index but only report 535 million mean that they do not count terms in the URL?
Oh hell Mel, I am sorry. I misunderstood your question in the first place. I was thinking backwards for some reason.
The question could be how accurate the figure on the front page is. Has anyone actually watched that thing tick up?
Another question: does "results" equate to "pages"? Can supplemental+pages=results ??? The latter is possible.
Google being Google would want an accurate accounting of what they consider a legitimate page count -- supplemental pages would not (or should not) count in that figure. More to the point, they are NOT searched ... but they would have to be included in the results (wouldn't they ???). I am not making sense, I know.
What I see in the supplemental results is not duplicate or doorway pages, but ordinary pages from ordinary sites that have some of their pages in the main index and some in the supplemental index.
The ones I have seen were. But I was checking for other reasons specific to doorway pages that were spamming the Yahoo index (80 of the top 100 were one site when it was traced back). In the Google index, they were all supplemental and did not show up at all in the search results there. ;-)
In that case though -- I guess you could consider them to be duplicate content too, which they were.
I would imagine there are other reasons to categorize pages as supplemental too. Maybe there are some on-page factors that are basically covered elsewhere in the site in one fashion or another -- redundancy maybe? If that is the case, then they might hold some secrets as to what Google considers to be redundant.
A prime case of redundancy is those LinksManager pages when their users abuse the system and crank out hundreds of pages per site that are almost identical. They even use rotation, but they never amount to anything with Google and mostly are PR0 across the board.
SEO Guy
06-10-2004, 03:26 PM
AHAHAHHA I missed the damn question too hehe, ah well
Honestly, the results # is not an absolute #, it's an estimate, just like when you check backlinks and you see 575 backlinks to a site, then go through all the omitted results and only find 400. This is because (just a theory, people) it takes far fewer resources to estimate than to count.
The 4.whatever billion is likely the last count, but as if they are going to count how many pages are in the DB every time someone makes a query :cool:
Amazon has been #5 for a while :confused:
It uses a lengthy URL that few sites would point to with the EXACT URL, but it IS IN Google's SERPs with a PageRank 0
http://www.google.com/search?hl=en&ie=UTF-8&q=http
http://www.amazon.com/exec/obidos/subst/home/home.html
PageRank 9
DEFAULTS to
http://www.amazon.com/exec/obidos/subst/home/home.html/103-2474869-8272660
PageRank 0
as does
http://www.m-w.com/dictionary.htm - at #13
A PageRank 0 the way it appears on Google SERPs
A PageRank 8 the way most people would link to it
Dodger
06-24-2004, 12:13 AM
Well, they are using robot detection to turn off session tracking. That extra string of numbers on the tail end of your URL is your session ID number. I don't think you can call that a revelation.
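If you want to compare the two Amazon URLs above as the same page, you can strip the session ID before comparing. This is only a sketch: the pattern assumes the session ID is always a trailing path segment shaped like 103-2474869-8272660 (3, 7 and 7 digits), which is a guess based on that one example, and strip_session_id is a made-up helper name.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed shape of the session segment, inferred from the single URL above.
SESSION_SEGMENT = re.compile(r"/\d{3}-\d{7}-\d{7}/?$")


def strip_session_id(url):
    """Remove a trailing session-ID path segment, if one is present."""
    parts = urlsplit(url)
    clean_path = SESSION_SEGMENT.sub("", parts.path)
    return urlunsplit(
        (parts.scheme, parts.netloc, clean_path, parts.query, parts.fragment)
    )


with_session = (
    "http://www.amazon.com/exec/obidos/subst/home/home.html/103-2474869-8272660"
)
print(strip_session_id(with_session))
```

URLs without a matching trailing segment pass through unchanged, so the helper is safe to run over a mixed list.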