Search Engine Watch
Old 12-21-2005   #1
ReSiever
Member
 
Join Date: Jun 2005
Posts: 27
New Google "Bigdaddy" Infrastructure Live, Data Center Open For Feedback

Hello all,

I read something about this datacenter a few days ago, and I'm not sure if it was at SEW, but I think it is important enough to spend my time on it here...

At this very moment, I'm seeing results at http://64.233.179.104 that got spidered only by the Mozilla Googlebot. The Mozilla bot is known not to be responsible for Google's main index, and my gut always told me it was also checking duplicate content, because it got my pages out of Google's normal index.

I know that the content of some of my pages is very similar, so I figured they would get stuck in the duplicate content filter. After the Mozilla bot came around, they did, which is why I think it also covers duplicate content problems.

But now that I have launched some new pages, I know that they only got spidered by the Mozilla bot. And what do you know? They are all indexed at http://64.233.179.104, but are not known in any normal datacenter. Besides that, I don't see as many portals as I see on other datacenters.

And, as Matt Cutts once stated, "the test data center certainly has some different crawling and indexing characteristics," according to SERoundtable's interesting item about this discovery.

I also haven't seen supplemental results or URL-only results... So, does anyone else have something to say about this? Maybe this really is a test datacenter? Anybody got any news on this one?
Old 12-21-2005   #2
bhartzer
Search Engine Optimization, Search Engine Marketing Expert
 
Join Date: Jun 2004
Location: Dallas, Texas
Posts: 533
If you do a search for that IP address on Google you'll notice quite a few posts regarding the subject. Matt Cutts recently said:

Quote:
I do expect that data center to eventually go live, but it will take a few months, in all likelihood.
__________________
Bill Hartzer is an internet marketing consultant in Dallas and has been practicing organic SEO since 1996.
Old 12-21-2005   #3
ReSiever
Member
 
Join Date: Jun 2005
Posts: 27
I did read that. But the interesting part is, sometimes the datacenter shows the same results as other datacenters. Maybe they sometimes throw it in the regular results, just to measure user behaviour or something? Just guessing...
Old 12-21-2005   #4
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
I've renamed this thread to help make the data center name a bit clearer. Barry at Search Engine Roundtable called it the "Big Daddy" data center. That seems to have come off of this WebmasterWorld thread with lots of discussion of it:
http://www.webmasterworld.com/forum30/32409.htm

Dayo_UK over there says:

Quote:
OK - Jagger is over - long live "Big Daddy" - as named by MC for the test DC.
I take MC to be Matt Cutts but for the life of me, I can't find anywhere online where Matt's actually used those words to name this data center. If someone finds the original reference, please post. In the meantime, we roll with it.
Old 12-21-2005   #5
Jenstar
 
Join Date: Jun 2004
Location: Starbucks!
Posts: 345
I was wondering about this bot too. I saw the Mozilla Googlebot on a new site of mine that went live only about 12 hours earlier that had no incoming links whatsoever. It had been registered only a few days earlier, so Google knew about it either through whois data or toolbar data, as far as I can figure. The cache date in http://64.233.179.104 matches the first Mozilla Googlebot visit.

It is a blog, and this site has already shown up in the blogsearch.google.com index... and has been live only 5 days now. I had come to the conclusion this bot was related to the blogsearch somehow, because it visits daily and always grabs the feed URL as well.

But what is most interesting, is that this site has only been visited by the Mozilla Googlebot at the same IP, but not by the regular Googlebot, and it shows up in this blog search from that one bot. Which makes me wonder if this Big Daddy DC is somehow getting weighting from the Google blogsearch?

So could this index be given weighting somehow through blog search?
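
For anyone who wants to check their own logs for the same pattern (visits from the Mozilla Googlebot but none from the regular one), here is a minimal Python sketch. It assumes a combined-format access log where the user agent is the last quoted field, and it assumes the two bots identify themselves roughly as shown in the comments; neither detail comes from this thread, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Assumed user-agent shapes (not stated in this thread):
#   classic: Googlebot/2.1 (+http://www.google.com/bot.html)
#   Mozilla: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
# Assumed log layout: the user agent is the last double-quoted field on the line.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')


def classify_googlebot(user_agent):
    """Return which Googlebot flavour a user-agent string looks like, or None."""
    if "Googlebot" not in user_agent:
        return None
    if user_agent.startswith("Mozilla/"):
        return "mozilla-googlebot"
    return "classic-googlebot"


def tally_googlebot_visits(log_lines):
    """Count requests per Googlebot flavour in an iterable of log lines."""
    counts = Counter()
    for line in log_lines:
        match = UA_FIELD.search(line)
        if not match:
            continue  # line does not end with a quoted field
        kind = classify_googlebot(match.group(1))
        if kind:
            counts[kind] += 1
    return counts
```

Feeding it the lines of an access log gives a quick count per bot. A site that shows only `mozilla-googlebot` hits would match what is described above, though a user-agent string alone does not prove the request really came from Google.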
Old 12-21-2005   #6
PhilC
Member
 
Join Date: Oct 2004
Location: UK
Posts: 1,657
Quote:
Originally Posted by ReSiever
I did read that. But the interesting part is, sometimes the datacenter shows the same results as other datacenters. Maybe they sometimes throw it in the regular results, just to measure user behaviour or something? Just guessing...
Even when searching on a specific DC, we don't always get the results from that DC. It's probably due to various things, including load sharing, and the need to disconnect it while some types of changes are made.

added
I just went to Matt's blog and coincidentally...

Quote:
We’re getting closer to calling for feedback on 64.233.179.104, but I probably won’t ask for reactions for another week or two. Right now that datacenter isn’t serving traffic 100% of the time as people pull it out of the rotation from time to time to tune things up under the hood

Last edited by PhilC : 12-21-2005 at 10:18 PM. Reason: addition
Old 12-27-2005   #7
SEOBrains
SEOBrains aka Bob Rains
 
Join Date: Nov 2005
Location: Boston
Posts: 14
Why is it called Big Daddy???

Matt Cutts named the next major algo Big Daddy at Webmaster World PUBCON10 in Vegas. During an informal Q&A after his coffee talk session, he asked the people in the room for the next algo name, someone called out "Big Daddy", and Matt wrote it down. Thus the name "Big Daddy" for the DC changes.
Old 12-29-2005   #8
ReSiever
Member
 
Join Date: Jun 2005
Posts: 27
Hi PhilC,

Quote:
We’re getting closer to calling for feedback on 64.233.179.104, but I probably won’t ask for reactions for another week or two. Right now that datacenter isn’t serving traffic 100% of the time as people pull it out of the rotation from time to time to tune things up under the hood
Nice catch. Matt Cutts has been talking about it for a while now, but so far he's just keeping the fire burning. Would be nice if he did some real talking haha..

Anyway, I can't test right now, cause the datacenter doesn't seem to be live now. But from what I've been seeing the following things might be true:
  • duplicate content filter does loosen up
  • Portal sites lose strength (or authority?) and their rankings decrease
  • Indexing by Mozilla Bot (more advanced bot)

Next to that, I personally haven't seen any progress with canonical or 301 problems. Maybe somebody else?
Old 01-04-2006   #9
Brian M
Member
 
Join Date: Oct 2005
Posts: 103
Matt Cutts talks about 301s, 302s and more!

Matt Cutts (a.k.a. GoogleGuy) is on a blog roll, and has posted some great information about Google's handling of 301 and 302 redirects, as well as canonical URLs, etc.

I highly recommend that you read his latest blog entries at:

http://www.mattcutts.com/blog/seo-ad...302-redirects/


Last edited by David Wallace : 01-04-2006 at 04:28 PM. Reason: Updated link
Old 01-04-2006   #10
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
Yep, Matt's been a madman today, but there's a method to his madness. It was all part of setting things up for taking feedback on the Bigdaddy data center, which will migrate to Google in the next month or two. So expect a Feb. 2006 or March 2006 Bigdaddy Update.

Key posts, which I'd suggest reading in this order:

Bigdaddy on the move: Alert that one of the Bigdaddy data centers is back to showing regular results so fixes can be put into place. If you want Bigdaddy, go to http://66.249.93.104, where it's still live.

Feedback on Bigdaddy data center where he covers how the data center got its name, how this is an entire new infrastructure for Google web search coming online, how it will go live on "regular" Google in the next month or two, how ranking changes you may see now on regular Google are unrelated, how to send feedback about changes you see and more.

SEO advice: discussing 302 redirects on how and why Google handles 301 (permanent) and 302 (temporary) redirects on regular Google and new Bigdaddy-flavored Google.

SEO advice: interpreting inurl on how to use the inurl operator at Google and why the results probably don't show a hijacking issue, in case you suspect that in regular Google or Bigdaddy.

SEO advice: url canonicalization on my favorite word, how Google determines which domain to use for your listings when there are multiple options. Canonical issues are something Matt hopes Bigdaddy will improve.

By the way, for some additional background on two of the biggest problems that Bigdaddy aims to solve for Google -- hijacking and canonical issues, see these past pieces from Search Engine Watch:

Google Oct. 2005 Jagger Update Continues Into November & Hating The Term Canonical

Matt Cutts Banned On Google? And Oct. 2005 Jagger Update Winds Down

Revisiting Hijacking & Redirects: Moving To A Solution.

So far, I haven't had a chance to play with Bigdaddy, but I already have a big positive feeling from the effort Matt's put into prepping people for it and helping them send feedback.
Old 01-04-2006   #11
projectphp
What The World, Needs Now, Is Love, Sweet Love
 
Join Date: Jun 2004
Location: Sydney, Australia
Posts: 449
Quote:
So far, I haven't had a chance to play with Bigdaddy...
Oh man, that is GOLD!
Old 01-05-2006   #12
ReSiever
Member
 
Join Date: Jun 2005
Posts: 27
Well, as I stated in another thread on here, I saw the results of the Big Daddy datacenter spread across more datacenters since 31 Dec / 1 January. This data was live for two to three days and was set back yesterday to the older data we have been viewing for the last few weeks. And by older data, I mean about a month back, because I'm seeing SERPs for specific keywords that I was also getting at the beginning of December.

Right now, they are tossing in the Big Daddy data(center) in the evening again (well, it's evening over here), as Matt stated on his blog:

Quote:
Q: Why did you wait so long to ask for feedback?
A: There were a couple reasons. First, I knew that Bigdaddy wouldn’t go live before the holidays were over. Second, the team working on this data center wasn’t showing it 100% of the time; at night, they’d take it out of our data center rotation to tinker with it. That would have been a recipe for confusion. Now we’re past the holidays and the Bigdaddy data center is live 100% of the time. In fact, Bigdaddy is now visible at two data centers: 66.249.93.104 and 64.233.179.104
Personally I'm in datacenter 66.249.93.104 when I'm searching via www.google.nl. So, maybe they are still 'confusing' people but started to collect feedback after being live for a few days...

Next to that, I'm thrilled about the fact that Google has done something about the canonical and redirect (301, 302) problems. Matt has given us a beautiful example of that:

Quote:
So it looks like 66.249.93.104 is the best IP address to use when testing Bigdaddy. That data center should be showing Bigdaddy results more reliably. Also, it looks like [sf giants] is a fine query to see if you’re hitting Bigdaddy. If you get giants.mlb.com at #1, you’re searching Bigdaddy. If you get www.sfgiants.com at #1 and an uncrawled url http://sanfrancisco.giants.mlb.com/N...f_homepage.jsp at #3, you’re hitting the older Google infrastructure.
Next to that, the data seems to be more recent, cause the Mozilla bot is going around the web like crazy..
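
To make the [sf giants] check from Matt's quote repeatable, the rule can be written down as a small Python sketch. It takes a list of the top results as you copied them from the results page (it does not query Google itself), and the function names and return labels are hypothetical. The hostnames are the ones from the quote; the truncated #3 URL is only matched by hostname.

```python
from urllib.parse import urlparse


def host_of(result):
    """Extract the lowercase hostname from a URL or a bare hostname."""
    if "://" not in result:
        result = "//" + result  # lets urlparse treat a bare host as a netloc
    return urlparse(result).netloc.lower()


def guess_infrastructure(top_results):
    """Apply the [sf giants] rule of thumb to an ordered list of results.

    Returns "bigdaddy", "older", or "unknown" (labels chosen for this sketch).
    """
    hosts = [host_of(r) for r in top_results]
    if hosts and hosts[0] == "giants.mlb.com":
        return "bigdaddy"
    if (
        len(hosts) >= 3
        and hosts[0] == "www.sfgiants.com"
        and hosts[2] == "sanfrancisco.giants.mlb.com"
    ):
        return "older"
    return "unknown"
```

Anything that matches neither pattern comes back as "unknown", which is probably the honest answer while the datacenters keep being pulled in and out of rotation.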

So far, so good !

Already had a chance to play with BigDaddy, Danny? Waiting to hear your reaction over here

Last edited by ReSiever : 01-05-2006 at 04:35 PM.
Old 01-05-2006   #13
grnidone
Righteous Babe.
 
Join Date: Aug 2004
Location: Carrollton, TX
Posts: 138
Where do you put the feedback?
Old 01-05-2006   #14
Brian M
Member
 
Join Date: Oct 2005
Posts: 103
Where do you put the feedback?

Hi grnidone,

Google would like to get the feedback in one of two ways:

From Matt's Blog:

"Reporting spam in the bigdaddy data center
I definitely want to hear about webspam that you see in Google. The best place to do that is to go to http://www.google.com/contact/spamreport.html . In the “Additional details:” section, I would use the keyword “bigdaddy” in your report.

Reporting other quality issues in Google’s index
Do the search that you’re interested in on 66.249.93.104 or 64.233.179.104, then click the “Dissatisfied? Help us improve” link at the bottom right of the page. Again, fill in details and use the keyword bigdaddy so that folks at Google can separate out feedback specifically about this data center."

Just remember that the DCs are in "flux" as Google makes changes. So, you may want to check this thread to make sure that the DC you choose has the "bigdaddy" changes on it before you start giving them feedback. Or, wait a few days for things to settle down...

Brian M
Old 01-06-2006   #15
claus
It is not necessary to change. Survival is not mandatory.
 
Join Date: Dec 2004
Location: Copenhagen, Denmark
Posts: 62
Oh boy, what is this?

(Just posted this on Matt's blog - thought you all might be interested)

---------------------------------------------------------------------
What? Is bigdaddy .. I don't even have a word for it... But, as you move around the pages in the listings for a search, the SERPs *change* (!)

I'm just investigating a search for a site with some problems, and the first time, the problems started at page 3. After looking a few pages further I returned to page three, and then the problem started at page four. Now it's at page five - no, damn... page 14!

And before, the SERPs only went to page 11 before supplementals kicked in. Matt, what is this? It seems to be alive?
---------------------------------------------------------------------

I've never seen anything like this. Either I happened to investigate a search while it was updating, live (!), or something else was happening there. I tell you, the SERPs moved and changed as I browsed them! No, I'm not drunk!

FYI it was a [site:example.com] search returning around 5,000 results on the DC: 66.249.93.104. A whole lot were duplicates, but as I said it was a site with problems.

Right this moment the "view supplemental results" message comes at page 22 - as I wrote above it started at page 11 (that's also where it is with Google.com, and 64.233.179.104). Double the amount of visible results while I'm surfing the SERPS?

--
Added:
Most likely it was just a coincidence, and I happened to search while new data was being added or something. Afaik, nobody else has reported this?

OTOH, if I was searching really hard for something, as opposed to troubleshooting, I sure would appreciate this behaviour. It only makes it hard to tell a client "look at page three"

Last edited by claus : 01-06-2006 at 07:41 AM.
Old 01-06-2006   #16
Robert_Charlton
Member
 
Join Date: Jun 2004
Location: Oakland, CA
Posts: 743
Quote:
Originally Posted by claus
...FYI it was a [site:example.com] search returning around 5,000 results on the DC: 66.249.93.104. A whole lot were duplicates, but as I said it was a site with problems.

Right this moment the "view supplemental results" message comes at page 22 - as I wrote above it started at page 11 (that's also where it is with Google.com, and 64.233.179.104). Double the amount of visible results while I'm surfing the SERPS?...
claus - This is only going to touch on one tiny aspect of your post, and probably not your main question. But you, and everyone else who didn't catch this passing comment on Matt's blog should know that 64.233.179.104 is no longer showing Bigdaddy.

To see current Bigdaddy, go to http://66.249.93.104/

Here's the thread on Matt's blog....

Bigdaddy on the move

Quote:
From Matt's blog:
Executive summary: if you want to play with a Bigdaddy data center, hit 66.249.93.104 instead of 64.233.179.104.
Longer explanation on the full post.
Old 01-06-2006   #17
Robert_Charlton
Member
 
Join Date: Jun 2004
Location: Oakland, CA
Posts: 743
Here's a quick observation about the instability on Bigdaddy, which wasn't my intention when I started writing this post...

Last night I chanced to run a search on Google, for [open source radio]. The Google search returned two pages for the site I was seeking, radioopensource.org... the default domain as #1, as well as index.php of that domain as #2. "Aha!" I thought... "canonical problem. I'll check this out on Bigdaddy," and so I did.

Last night, the Bigdaddy search also returned two pages from radioopensource.org at #1 and #2, but index.php was not #2... it was another page. I was pleased to see that Google had fixed this.

I thought I'd check that again before posting, and this evening Bigdaddy does return index.php as #2. Maybe this is an indication that the canonical fixes are in flux... or at least that results have changed in a period of about 16 hours. Of course your results may vary if you check this.

The Radio Open Source site, I should mention, is blog style, so the home page is always changing, as are the caches, and I'm not even going to attempt to think about how that might be affecting this particular result. I'm also assuming that this is something that Google sees as a problem and is trying to fix.
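
For readers new to the term, the "canonical problem" here is that the bare domain and its default document are two URLs for one page. The Python sketch below shows the idea of folding such duplicates into one form. It is only an illustration of the concept, not how Google does it; the list of default document names is an assumption, and `canonical_form` is a made-up helper.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed set of "default document" file names; real servers can be configured differently.
DEFAULT_DOCUMENTS = {"index.php", "index.asp", "index.html", "index.htm"}


def canonical_form(url):
    """Collapse a default-document URL onto its directory URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, tail = path.rpartition("/")
    if tail.lower() in DEFAULT_DOCUMENTS:
        path = head + "/"  # /index.php -> /, /blog/index.php -> /blog/
    # Scheme and host are case-insensitive, so lowercase them; drop any fragment.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))
```

With this, the bare domain and its index page map to the same string, which is the outcome a site owner would normally enforce with a 301 redirect rather than leave to the search engine.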

Last edited by Robert_Charlton : 01-06-2006 at 09:31 PM.
Old 01-11-2006   #18
highpr
Member
 
Join Date: Jan 2006
Posts: 5

I'm very interested in this development and have read this entire thread and the supplemental links.

For those of us who are just getting up to speed on Big Daddy - please answer me this:

1. Will it affect my search rankings?

2. Why is this change even needed? (what's the purpose here?)

Thanks in advance - great thread!
Old 01-11-2006   #19
PhilC
Member
 
Join Date: Oct 2004
Location: UK
Posts: 1,657
Yes it will affect rankings. All changes affect rankings - both for better and for worse - some go up, and some go down.

The changes are necessary because search engines continually try to improve. The Big Daddy changes are more fundamental than the fixes and filters that most updates consist of.
Old 01-12-2006   #20
ReSiever
Member
 
Join Date: Jun 2005
Posts: 27
It seems to be very quiet around this subject at the moment. I'm expecting Google to be looking through all the feedback they got after being live for a few days...