Search Engine Watch
Old 11-10-2004   #1
Chris Sherman
Executive Editor, SearchEngineWatch.com
 
Join Date: Jun 2004
Location: Boulder, CO
Posts: 111
It's Official: Google Now Searching 8,058,044,651 web pages

Moments ago, Google quietly changed the number of pages it reports on its home page. It now reads "searching 8,058,044,651 web pages." According to spokesperson Nate Tyler, these are "real pages," meaning they've been fully indexed. This suggests that the total number of "items" may exceed 10 billion if you count images, Groups postings, and pages inferred from links.

Don't expect Yahoo or Microsoft to counter with larger numbers any time soon--even if they do increase index size. Instead, expect statements along the lines of "our search results are competitive because they are high quality," which ultimately is both a valid assertion and the only thing that matters in the long run.
Old 11-10-2004   #2
AussieWebmaster
Forums Editor, SearchEngineWatch
 
Join Date: Jun 2004
Location: NYC
Posts: 8,153
Quote:
Originally Posted by Chris Sherman

Don't expect Yahoo or Microsoft to counter with larger numbers any time soon--even if they do increase index size. Instead, expect statements along the lines of "our search results are competitive because they are high quality," which ultimately is both a valid assertion and the only thing that matters in the long run.
And it's the only counter they have. Adding more pages does not automatically mean more relevance, though with tight filtering and sorting it should allow for more variety in the results.
Old 11-10-2004   #3
craig34
Newbie
 
Join Date: Jun 2004
Location: Woburn, MA
Posts: 3
How accurate?

Google is now reporting 57,000 pages for my site. Considering I was at 26,000 earlier in the week, this is awesome news - except for one thing. I know for a fact that my site doesn't contain more than 30,000 pages at the very most...
Old 11-10-2004   #4
mcanerin
 
Join Date: Jun 2004
Location: Calgary, Alberta, Canada
Posts: 1,564
How many images on that site are indexed, craig?

You would not happen to have about 25,000, would you? I'm starting to wonder if G is counting an image in its image search as a "page"....

Random thought, no proof. Just wondering.

Ian
__________________
International SEO
Old 11-10-2004   #5
craig34
Newbie
 
Join Date: Jun 2004
Location: Woburn, MA
Posts: 3
There's no way I have 25,000 separate images. Maybe 5,000 max. Good thought though.
Old 11-10-2004   #6
bobmutch
seocomapny.ca|Project Support Open Source|Top 40 Dirs rated by Inbound Link Quality
 
Join Date: Aug 2004
Location: london.on.ca
Posts: 575
The count just dropped to 8,000,000,000 when you search for the word "the" (Nov 10, 23:00 GMT-5).
Old 11-10-2004   #7
projectphp
What The World, Needs Now, Is Love, Sweet Love
 
Join Date: Jun 2004
Location: Sydney, Australia
Posts: 449
42,000,000 pages that have the word *a* but not the word *the*.

So, that is 8,042,000,000
Old 11-10-2004   #8
Robert_Charlton
Member
 
Join Date: Jun 2004
Location: Oakland, CA
Posts: 743
http://www.google.com/googleblog/200...y-doubles.html

Hmmm... the afternoon before the new MSN Search beta goes live.

I'm not yet seeing any big ranking shifts in areas I monitor, and sandboxed sites are still sandboxed. But soon? Can they be on a 64 bit architecture yet???

Last edited by Robert_Charlton : 11-10-2004 at 11:27 PM.
Old 11-11-2004   #9
bobmutch
seocomapny.ca|Project Support Open Source|Top 40 Dirs rated by Inbound Link Quality
 
Join Date: Aug 2004
Location: london.on.ca
Posts: 575
Projectphp:

8,000,000,000 the
41,900,000 -the a
29,000,000 -the -a to
35,300,000 -the -a -to de
37,100,000 -the -a -to -de 1
22,900,000 -the -a -to -de -1 2
18,700,000 -the -a -to -de -1 -2 3
14,900,000 -the -a -to -de -1 -2 -3 4
19,600,000 -the -a -to -de -1 -2 -3 -4
23,800,000 -the -a -to -de -1 -2 -3 -4 com

~8,243,200,000 pages in the index, counting only Latin-character words and numbers
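
The tally above can be re-added as a quick check. This is a minimal Python sketch that uses only the counts posted in this thread. The variable names are made up for illustration, and it assumes, as the exclusion queries intend, that the result sets do not overlap.

```python
# Result counts as reported in the post above (Nov 11, 2004).
# Each query excludes the terms of the earlier queries, so the result
# sets are assumed to be disjoint and can simply be added together.
counts = [
    8_000_000_000,  # the
    41_900_000,     # -the a
    29_000_000,     # -the -a to
    35_300_000,     # -the -a -to de
    37_100_000,     # -the -a -to -de 1
    22_900_000,     # -the -a -to -de -1 2
    18_700_000,     # -the -a -to -de -1 -2 3
    14_900_000,     # -the -a -to -de -1 -2 -3 4
    19_600_000,     # next query in the list, as posted
    23_800_000,     # ... com
]

total = sum(counts)
print(f"{total:,}")  # 8,243,200,000
```

Since Google rounds the reported counts, the sum is only a rough lower bound on the index size.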

Last edited by bobmutch : 11-11-2004 at 01:53 PM.
Old 11-11-2004   #10
GoogleGuy
Unofficial Representative
 
Join Date: Jul 2004
Location: Mountain View, CA
Posts: 66
So in case anyone was still wondering, we're not limited by four-byte docids. But I suppose that was pretty clear.
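
For reference, the arithmetic behind that remark can be sketched in a few lines of Python. The sketch assumes a docid would be an unsigned four-byte integer, which is an assumption made for illustration; the thread does not say how Google actually stores docids.

```python
DOCID_BYTES = 4  # assumed docid width, taken from the "four-byte" remark

# Number of distinct values an unsigned integer of that width can hold
max_docids = 2 ** (8 * DOCID_BYTES)

# Page count reported on the Google home page
reported_pages = 8_058_044_651

print(f"{max_docids:,}")            # 4,294,967,296
print(reported_pages > max_docids)  # True

# Smallest whole number of bytes that can number every reported page
bytes_needed = (reported_pages.bit_length() + 7) // 8
print(bytes_needed)                 # 5
```

Under that assumption, the reported page count is nearly twice what four bytes could address, which is consistent with the remark.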
Old 11-11-2004   #11
GoogleGuy
Unofficial Representative
 
Join Date: Jul 2004
Location: Mountain View, CA
Posts: 66
By the way..

If you missed it, the Google blog also discusses the new Google Advertising Professionals program. That could be interesting to the folks on this forum.

I'm helping out the Google blog folks a little bit. If there's any particular topic you want me to talk about feel free to post. Or maybe we should start a separate thread? I'm still getting the hang of this new-fangled forum.
Old 11-11-2004   #12
seobook
I'm blogging this
 
Join Date: Jun 2004
Location: we are Penn State!
Posts: 1,943
Quote:
Originally Posted by GoogleGuy
I'm helping out the Google blog folks a little bit. If there's any particular topic you want me to talk about feel free to post.
any and all upcoming algorithm shifts...perhaps a subscribe feature with one month advance notification?

notice that this index size increase coincides with news from MSN Search. you guys don't just time it that way to spoil the news for the other search engines do you?
__________________
The SEO Book
Old 11-11-2004   #13
bobmutch
seocomapny.ca|Project Support Open Source|Top 40 Dirs rated by Inbound Link Quality
 
Join Date: Aug 2004
Location: london.on.ca
Posts: 575
All the things I wanted to know but never had a chance to ask!

GoogleGuy: I have lost many nights of sleep over the 4 questions below. If you would give me any kind of hints, ideas, or even answers, I promise to jump up and down with joy!

1. Why is there a drop in PR when there is a toolbar PR update? (37 of the 152 sites on my PR10 pages list dropped off on the Oct 5th update).

2. What is the real PR range of the toolbar PR scale? Is it from 0.15 to the real PR of the highest PR10 page?

3. From June 22 to Oct 5th, when there was no toolbar PR update for 106 days, there were 4 BL updates. Would there have been a real PR update at the BL update times? I keep a list of the BL/GD/TB/Algo updates.

4. Is the Google Directory PR scale a scale with 7 units or 8 units?
cleardot.gif 5/35, 11/29, 16/24, 22/18, 27/13, 32/8, 38/2 (pos.gif/neg.gif).
And why does google.com have a GD PR of 44/0? Note link below in GD where www.google.com has pos.gif 44.
http://directory.google.com/Top/Comp...ngines/Google/
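
The bar widths listed in question 4 can be checked with a short Python sketch. It uses only the pos.gif/neg.gif pairs quoted above; the variable names are illustrative, and reading the numbers as pixel widths is an assumption.

```python
# (pos.gif, neg.gif) pairs as listed in question 4
bars = [(5, 35), (11, 29), (16, 24), (22, 18), (27, 13), (32, 8), (38, 2)]

# Every listed bar adds up to the same total width
totals = {pos + neg for pos, neg in bars}
print(totals)  # {40}

# Increase in the pos.gif value from one level to the next
steps = [b[0] - a[0] for a, b in zip(bars, bars[1:])]
print(steps)  # [6, 5, 6, 5, 5, 6]

# The google.com entry (44/0) is wider than any of the listed bars
google_bar = (44, 0)
print(sum(google_bar) > max(totals))  # True
```

The seven listed levels all share a width of 40, so the 44/0 entry for google.com falls outside that pattern, which is the oddity the question points at.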

Thanks!

Last edited by Marcia : 11-11-2004 at 03:38 AM. Reason: Removed unnecessary URL drops.
Old 11-11-2004   #14
Robert_Charlton
Member
 
Join Date: Jun 2004
Location: Oakland, CA
Posts: 743
Quote:
Originally Posted by GoogleGuy
So in case anyone was still wondering, we're not limited by four-byte docids. But I suppose that was pretty clear.
Sure... everything's been pretty clear for the last six months.
Old 11-11-2004   #15
rustybrick
 
Join Date: Jun 2004
Location: New York, USA
Posts: 2,810
Quote:
Originally Posted by GoogleGuy
I'm helping out the Google blog folks a little bit. If there's any particular topic you want me to talk about feel free to post. Or maybe we should start a separate thread? I'm still getting the hang of this new-fangled forum.
Why subject yourself to such torture?
Old 11-11-2004   #16
iamrussell
Member
 
Join Date: Sep 2004
Posts: 33
GoogleGuy,

Please go and flip off the sandbox switch. I know there must be one. I picture it as a giant electrical throw switch. Thank you.
Old 11-11-2004   #17
NetinsertGuy
Organize the web - www.Netinsert.com
 
Join Date: Jun 2004
Posts: 11
page count doubled

Quote:
Google is now reporting 57,000 pages for my site. Considering I was at 26,000 earlier in the week, this is awesome news - except for one thing. I know for a fact that my site doesn't contain more than 30,000 pages at the very most...
I too have noticed a similar rise in the page count. I doubt that the page count number is accurate. We don't have that many categories in the directory.
Old 11-11-2004   #18
Nacho
 
Join Date: Jun 2004
Location: La Jolla, CA
Posts: 1,382
Someone over there must be saying, "if we could only let our crawlers breathe for a little bit".
Old 11-11-2004   #19
Everyman
Member
 
Join Date: Jun 2004
Posts: 133
Quote:
So in case anyone was still wondering, we're not limited by four-byte docids. But I suppose that was pretty clear.
It's true. There are so many things broken now that the old docID theory alone cannot possibly explain what's going on at the Googleplex.

My site has 129,000 pages. Google reports 173,000 with the site: command. All images are in a disallowed directory, so they don't count. This site has been very stable for over a year now, with less than three percent variation in total pages, or the content of those pages.

If I use the site: command and exclude some words that are on nearly every page, Google reports 171,000 URL-only links. If I use the site: command and include the same words, I get 84,700 fully-indexed pages.

A couple months ago I decided that all numbers reported by Google over 1,000 are utterly unreliable and meaningless. The same is true at Yahoo. So I have a secret word that I use with the site: command that should bring up 765 pages when the site is fully indexed. I've been tracking the percentage of inclusion for Google, Yahoo, and Microsoft using this secret word.

Google: 71 percent (it was the same in September, lower in October, and now is back to September levels).

Yahoo: Between 89 and 95 percent, and much more stable than Google.

Old MSN (Yahoo's crawler): 80 percent, and very stable.

New MSN beta: 21 percent. (This sucks, but at least they grab the top level site map pages instead of randomly grabbing deeper pages.)

Unfortunately, the 8 billion figure will convince all the pundits, even if it convinces no webmasters. Now that Google is a public company, the uninformed Wall Street pundits are the only people who matter.
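
The inclusion percentages above can be turned back into approximate page counts. This minimal Python sketch uses the 765-page baseline and the percentages from the post. The helper name `pages_found` is hypothetical, and the resulting counts are rounded estimates, not figures reported in the thread.

```python
EXPECTED_PAGES = 765  # pages the secret word should match when fully indexed

# Inclusion percentages as reported in the post
inclusion = {
    "Google": 71,
    "Yahoo (low)": 89,
    "Yahoo (high)": 95,
    "Old MSN": 80,
    "New MSN beta": 21,
}

def pages_found(percent, expected=EXPECTED_PAGES):
    """Approximate page count implied by an inclusion percentage."""
    return round(expected * percent / 100)

for engine, pct in inclusion.items():
    print(f"{engine}: ~{pages_found(pct)} of {EXPECTED_PAGES} pages")
```

Working from a small, known baseline like this avoids relying on the large rounded totals that the engines report.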
Old 11-11-2004   #20
mcanerin
 
Join Date: Jun 2004
Location: Calgary, Alberta, Canada
Posts: 1,564
Just wanted to point out that although you could probably use search words like "the" and "a" to perform apples-to-apples comparisons between SEs, there are many indexed pages written entirely in non-Latin alphabets that would not show up.

For example, this very visible, well indexed PR8 site and most sites linked from it do not contain "a", "the", etc.

http://cn.yahoo.com/

Just a warning about assuming that the pages you understand are the same as the pages Google indexes. The "the" search is a good general measurement, but I would be careful about stating anything that sounded like it was anywhere near accurate.

Cheers,

Ian
__________________
International SEO