#1
Level of Trust for Matt Cutts' Sandbox Explanation @ SES NYC
I'm hoping to get some discussion on what I and several others "read between the lines" in what Matt Cutts (and Craig Manning) had to say about the "sandbox effect". It was my impression that both suggested that Google watches for sites that fall outside the norms in their sector for link building and optimization efforts, and will "hold back" those sites until they can be manually reviewed and the penalty lifted.
Matt mentioned looking specifically at sites like ChristopherReeve.org & Tsunami.Blogspot.com. Do you think there's any validity to his statements? How much or how little do you think this has to do with what we in the industry call the "sandbox"? I appreciate your input, as I don't have much experience with how far to trust these guys.
#2
I've heard it said for years that gaining links too fast will "raise a flag".
However, I find it hard to imagine that a technology-focussed company like Google is going to assign issues of relevancy on this scale entirely to human review. I'm curious: what information on sandboxing are you referring to?
#3
Instead of reading between the lines, can anyone actually quote Matt and Craig on what was said?
#4
I'd like to know that too.
Did you post your "reading between the lines" or what was actually said?
A quick thought on it: for several months, people were reporting that no new sites were added to the index after a certain date last year, which was the start of the sandbox. That doesn't equate with auto-identifying and holding back oddities, but then, I don't know if your posted understanding is "between the lines" or not.
Last edited by PhilC : 03-06-2005 at 01:23 PM.
#5
OK, I'm going to do my best to quote both Matt & Craig from the different sessions: not my reading between the lines, but what was actually said.
Matt: "If a new site is getting 1,000 links a day, it's going to look suspicious. Don't get me wrong, there are sites that deserve them like ChristopherReeve.org and Tsunami.Blogspot.com, but we look and see if those sites deserve their links."
Craig: When someone asks about being able to filter out spam and spot "outliers", and how computationally expensive that process is, Craig says "We have lots and lots of computers".
Matt (in a later session): While talking about building too many links too fast, he says "Who here can afford to lose all their rankings for all their sites?" Only Eric Ward & Greg Boser raise their hands (pretty funny).
Greg (during the same session, talking about sites that build links too fast): I call it the "litter box", not the "sandbox", and it takes forever because these guys (looks at Matt) are gonna check out the site. (Matt nods and smiles in a way that makes it seem to me like Greg & Matt have discussed this before.)
I know this isn't a lot of evidence, but I'm trying to look for some nugget of truth regarding the phenomenon of sites that rank #1 at every engine but Google (where they're #550). This explanation makes a lot of sense, except for the fact that so many were "released" on Feb. 2. If they were getting manually reviewed, you'd think they'd be "released" as they were checked out, not all together in big clumps...
#6
Okay then.
So now we can better read between the lines with a little more information. Let me just ask you this: are all websites that fall into this so-called sandbox getting thousands of links per day? I don't think so. As to the trust thing: yes, I do trust what Matt and Craig say, always. It's what they don't say that's always the mystery, and that opens up the boundaries of possibility.
#7
Quote:
Googlebot can only find links on the pages it parses. What if it takes several weeks for gbot to find and index the 10,000 pages that have the 10,000 links pointing to the page in question? Quote:
#8
When I first set up my personal domain (to replace my personal homepage on Xenite), I only linked to it from my own sites. I did not pursue links from other sites. It only included a few content pages, but they were full of content.
The site remained in the sandbox for a very long time. I eventually updated the content, and noticed a number of other sites were now linking to it. About that time, the site started to appear in Google, but still doesn't rank very well for my name (I doubt it ever will, but we'll see).
So, that is an example of a site which did NOT get thousands of links (much less a thousand per day). But it is still an example of a site which did not generate a lot of unique inbound links.
#9
Moderation Note: This thread is based on what Matt Cutts (and Craig Manning) had to say about the "sandbox effect". It is NOT about providing examples of sandbox sites. Let's stick to this topic please.
#10
Quote:
Quote:
I don't have enough information to decide whether I trust what they have reportedly said.
#11
I really don't think it's as simple as monitoring whether a site is getting too many links or not. My personal suspicion is that sandboxing is largely an automated process, relying on a number of factors.
Age of domain registration somehow seems to be involved, and the search frequency of the keywords involved also looks to me like it could be a factor. There's room to argue as well that a reduced number of link variations may play some role in sandboxing.
EDIT: I'd posted an example of research here showing differences between a couple of sites, but the mods seem very twitchy today so I've removed it.
Last edited by I, Brian : 03-06-2005 at 05:11 PM. Reason: Removed link to own research
#12
I agree. I think Google introduced some sort of aging factor (in fact, I have written extensively about this on another forum). Maybe they have since decided to dispense with the aging factor, perhaps because they found that too many innocent sites were being dumped into the sandbox.
If that is the case, then the reported statements from the conference won't shed much light on what Google has actually been doing, since they don't offer much information regarding the extent of Google's efforts, or the duration.
These comments are really being provided out of context. We would need to see a transcript for all the sessions concerned. And even then, questions might arise for us that were not asked.
#13
Quote:
I don't find that what those guys said matches the experiences that people have reported since the so-called sandbox came into effect. If what they said is correct, then it must surely be only a part of the truth about the sandbox.
About the reason why so many were released recently: if they really are evaluating all the sandboxed sites by hand, which is very hard to believe because of the sheer quantity of sites, then they may have decided to clear the decks and start again due to a massive build-up.
Last edited by PhilC : 03-06-2005 at 06:45 PM.
#14
Phil - That actually sounds fairly logical to me. Brian, maybe you could PM me your research. I'd be interested to see it if you can't share it on the board.
I was thinking about the quantity of sites each day that "stand out" from the norm and seem to be getting many more backlinks than usual. Certainly they are automatically "caught" by the indexing system, but the question is how feasible it would be for a human (or several) to actually look at the outliers each day. In my mind there's no question this would improve search relevancy; the only question is whether it's possible.
Let me make up some figures. Let's say each day there are 3000 sites that get "tagged" by the indexer as looking fishy. If a human being needs to look at each one, I'd estimate no more than 5 minutes per site, which would be 250 man hours each day. That would require a team of at least 30 people doing nothing but reviewing sites all day long... seems possible but unlikely. If the indexer were only catching 1000 sites each day, I could easily see a ten-man team set up to watch for this type of activity. Anyone else's opinion?
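For anyone who wants to play with those made-up figures, here is a minimal sketch of the arithmetic. The 8-hour working day is an assumption that is not in the post, and `review_load` is just an illustrative name. It says nothing about how Google actually staffs anything.

```python
import math

def review_load(sites_per_day, minutes_per_site, hours_per_reviewer_day=8):
    """Return (total review hours per day, reviewers needed).

    hours_per_reviewer_day=8 is an assumed working day, not a figure
    from the thread.
    """
    total_minutes = sites_per_day * minutes_per_site
    total_hours = total_minutes / 60
    # Round up: a fraction of a reviewer still means one more person.
    reviewers = math.ceil(total_minutes / (hours_per_reviewer_day * 60))
    return total_hours, reviewers

print(review_load(3000, 5))  # 250.0 hours, 32 reviewers
print(review_load(1000, 5))  # about 83.3 hours, 11 reviewers
```

With an 8-hour day the headcounts come out at 32 and 11, close to the "at least 30" and "ten-man team" estimates above.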
#15
I had a five minute discussion with Matt on this topic by the speaker room.
I brought up a classic example of a company that specializes in building "insect rearing rooms." There is zero competition for this company, and they wrote long, detailed information on how to build such a room. They are simply the authority on the Web for this topic. So he said, let me take a look at the site. And he will; I should hear back by the middle of this week. (I have never sent a specific example of a site to any of the engines before this one, and I probably never will again.)
But then I popped the question about the domain registrar topic. Why did Google become a registrar? Matt told me it is not to register domain names. We then moved to the topic of how it's important for Google to look at the "freshness" of a page and the age of a domain name. Bingo? I don't know. I did not get a direct answer from Matt about the sandbox. One thing is for sure, Google does not use that term inside the GooglePlex.
I'll let you know what I can about the results from my example site, sent to Matt. I am not expecting much but we will see. Of course, Matt gave examples of sites that were not affected by this. But so many are.
#16
I was never convinced "sandbox" existed. It didn't seem plausible for the task of retrieving relevant results.
#17
Quote:
Rand's quotes pretty much match what I wrote in my notes. (I got a big kick out of the "lots and lots of computers" answer.)
It's tempting to try to read more into some of those statements than may have been there, and a session on linkbuilding seemed like a reasonable place for someone to ask a question about a sandbox effect. But, going back over them in my head, and the contexts in which they were uttered, I think that they would have given the same answers and responses regardless of whether or not there was a sandbox.
I'm skeptical about the existence of a "sandbox" in the manner that many portray it, but I was willing to sit there and listen and see if I would hear an admission of some type that something of the sort existed. I just don't feel that there was enough there to draw any conclusions from.
#18
Quote:
#19
Just to clarify what I personally mean when I say "sandbox": I mean a site that ranks top 5 at Yahoo!, MSN, and Teoma, and for the allin searches at Google, and yet appears at number 100+ in the Google SERPs. Certainly Google has its own unique algorithm, but typically an experienced professional (or an IR researcher like yourself, Xan) can look at the particular site and its rankings and conclude that something is "funny".
Bill, I have to agree that they probably would have said the same thing, sandbox or no, but just because we use a different term (Greg called it the "litter box") doesn't mean they don't know what we're talking about. It's certainly a relatively new phenomenon (about 1 year old now) and had never been seen prior to about March of 2004.
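The working definition at the top of this post can be written down as a simple check, which at least makes clear what is and isn't being claimed. This is only a sketch of that informal definition: the function name, the dictionary layout, and the thresholds as parameters are illustrative assumptions, and the allin searches are left out for brevity. It describes a symptom, not a cause.

```python
def looks_sandboxed(ranks, top_n=5, google_floor=100):
    """Apply the informal 'sandbox' definition to one keyword.

    ranks maps an engine name to the site's position (1 = best).
    The site counts as 'sandboxed' if it is in the top_n on every
    non-Google engine listed, yet at google_floor or worse on Google.
    """
    google_pos = ranks.get("google")
    others = [pos for engine, pos in ranks.items() if engine != "google"]
    if google_pos is None or not others:
        return False  # not enough data to say anything
    return all(pos <= top_n for pos in others) and google_pos >= google_floor

# The case described earlier in the thread: #1 everywhere but #550 at Google.
print(looks_sandboxed({"yahoo": 1, "msn": 1, "teoma": 1, "google": 550}))  # True
```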
#20
Quote:
Google is far more advanced in the way they look at anchor text and linking structure than any of the other engines, and Google also has different filters in place than the other engines. Either one of these alone could well explain the ranking discrepancy.
I don’t believe that there is or ever was such a thing as a “sandbox”. "Sandbox" = Someone trying to explain something they don't understand.
I have looked at a lot of so-called "sandboxed" pages. In every case there was a completely rational and logical reason the page was not ranking, usually something rather obvious that was overlooked.