Search Engine Watch
SEO News

Go Back   Search Engine Watch Forums > Search Engines & Directories > Google > Google Web Search
Old 01-03-2006   #81
andrewgoodman
 
andrewgoodman's Avatar
 
Join Date: Jun 2004
Location: Toronto
Posts: 637
andrewgoodman is a name known to all
"Is", "Isn't", or "Always Was"

Sure, fair enough.

But while we're on this topic, could you explain to me how any brand new site is going to rank well for the phrase "home furnishings" -- "sandbox" or no "sandbox"? You would have to build up relevant linkage and other indicators of a page's meaning & status before you could rank at all on those kinds of phrases. So the page's score would be so low that it would be zero or very near zero, and not worth displaying at all. I'm thinking as long as Google's algo has been sophisticated enough to filter out the worst kinds of link spam and assess behavioral/quality indicators, there would have been a sandbox-like effect on competitive phrases.

Today, this site [we were talking about homestars.ca by the way, but it started life as homedirection.ca] appears #1 & #3 on a designer's name, "hildi weiman," etc. In this case you can see that the #1 listing is of the old site, so both sites are ranking on the phrase... which makes this whole site a bad example to use, because it'll be awhile before Google figures out that homestars and not homedirection is the real site. I agree that's not a popular phrase, but...

We seem now to be defining the sandbox effect as "not getting high listings on very competitive phrases." But isn't the point of assessing the linking structure of the web one that would have inherently involved a sandbox effect for engines like Google and Teoma, so this current situation is more of a continuation/extension of something that always existed?

I wrote a blurb in October 1999 - http://www.traffick.com/story.asp?StoryID=29 - about Google, pointing to an argument that was emerging at the time against Google and PageRank:

"Google's reliance on an automated measure of 'reputation' may magnify the popularity of the biggest, most popular sites, and make it difficult for newer, high quality sites to be discovered." and "A major issue may be ‘lag time’ or inertia. Older, more established sites may fare better, and this can become a vicious circle. Some now-obscure pages buried deep in a major website's archives may rank too high."

On competitive phrases, hasn't it long been the case that SE's won't just rank new sites out of the blue on popular terms?

How can anyone point with certainty to the "day" the "sandbox" was "invented"? Probably because it never was, but what seems like a sandbox effect has ebbed and flowed as the technology has evolved. You could argue that editorial review (dmoz, etc.) is a "sandbox" as well. Editorially, sites and pages need to be "accepted" and gain some kind of confidence score higher than "infinitesimal" before they're going to be featured on a search engine. For a new site, they don't even have basic site-specific info for how often it updates. That data takes time to gather. Who expects to rank on a term like "home furnishings toronto" overnight?

I checked the registration dates for the current owners of the sites in the top ten listings on that particular query. They are:

Sep. 1996
Jan. 2001
Feb. 1997
Jan. 1999
Feb. 1996
[Google Directory category]
Mar. 1999
Apr. 1998
Jan. 1998
Jan. 2003

--
next 10:

May 1996
Aug 2000
Jan. 2003
[already mentioned]
[dealtime]
[already mentioned]
Aug. 1994
Nov. 2000
[already mentioned]
[already mentioned]

--
next 10:

Oct. 1999
Feb. 2000
Oct. 1994
May 1997
Feb. 2005 [page on Romanian adult-industry discussion site / redirect to a furniture company page / thus spam] [so we get to the 25th result before freshness trumps reliability, and smack, a spam page is the result - up to here, the youngest site is three years old]
Oct. 1995
[already mentioned]
May 2003
Nov. 1997
Mar. 2000

--
next 10:

[already mentioned]
[already mentioned]
[craigslist]
Nov. 2002
Mar. 1996
Jul. 2001
Aug. 2003 [spammy, broken, irrelevant, India]
[yahoo directory]
[yellowpages.ca]
Sep. 2000

Some sandbox! If you were telling a client how long they'd have to wait before having a shot at being ranked in the top 40 on a moderately popular term like "home furnishings toronto" (and of course you get virtually no clicks outside of the top 10 anyway), you'd have to tell them THREE YEARS!! (Unless they have special Romanian or Indian spam techniques up their sleeve, in which case they'd make it to #25 or #37 and get no clicks anyway.) Or if you wanted to give them the average or median age of the sites in those positions: more like 4-6 years.
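To put a rough number on "some sandbox," here's a quick script over the registration years listed above (rounded to the year; directories, portals, dealtime/craigslist, and duplicate domains skipped). Purely illustrative arithmetic, nothing more:

```python
from statistics import mean, median

# Registration years of the dated results in the top 40 for
# "home furnishings toronto", as listed above. Query run in early 2006.
reg_years = [1996, 2001, 1997, 1999, 1996, 1999, 1998, 1998, 2003,   # top 10
             1996, 2000, 2003, 1994, 2000,                           # 11-20
             1999, 2000, 1994, 1997, 2005, 1995, 2003, 1997, 2000,   # 21-30
             2002, 1996, 2001, 2003, 2000]                           # 31-40

ages = [2006 - y for y in reg_years]
print(f"sites: {len(ages)}, youngest: {min(ages)} yrs, "
      f"median: {median(ages)} yrs, mean: {mean(ages):.1f} yrs")
```

The one-year-old entry is the Romanian spam page at #25; skip the two spam results and the youngest dated site is three years old, with a median around seven years - if anything, older than the 4-6 year guess.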

Again I say: that's some sandbox!

But maybe we should be going for something more retail-practical, like a certain type of chair [recliners, for example]. Looking at one such query I see sites, including a client's site, in top ten positions with a fairly similar pattern as far as domain age goes: they are all old, having been registered in years like 1997, 1995, 2000, etc. Either that or they are portal sites like bizrate or "knowns" like Google Answers. Again, quite a sandbox!

Although I was too lazy to check beyond a few of them, I also assume that *all* of the above (save for the spam entry that snuck in, possibly because Google was having trouble doing its automated checking on foreign domains??) have a stable, long-standing pattern of inlinks from sites with high confidence. As one of the SE reps said in Chicago, a small retailer just doesn't get 1,500 links all of a sudden, spontaneously. Most would be happy to have 1,500 customers.

Both the age of the sites and the continued importance of some types of links underscore the limitations of the search technology. There is no particular value to this link, for example:

http://www.ctv.ca/servlet/ArticleNew...5034_100369888

Except that it's a pretty sweet link from a national television station to a major furniture retailer. In short, a kind of "crony system." Little wonder then that companies will try to recreate spammy versions of same. To weed these schemes out, further/stronger filters and double-checks of quality are used, and that makes the pre-existing "sandbox effect" or "crony system" stronger (but if it's relaxed, spammy stuff comes right back in, so it can't be relaxed too much).

I guess a lot of it is about who you know, as always, huh?

Just playing devil's advocate. If a site is brand spanking new, I'm not sure how any of its pages could have enough PageRank (or other link recognition, or quality indicators) to outrank established pages/sites on core, popular terms. Based on what? By definition, such new sites come in tabula rasa (if they don't, please explain to me how they don't) and are de facto assumed to be spam until they prove otherwise. Guilty until proven innocent.

If a new site has 4-5 quality new links, maybe it should rank, but I see why it won't. Mainly because link schemes are so prevalent and SEO's have been sitting around aging domains and buying them and so forth, I'm sure there is some kind of waiting period in order to gain high enough confidence that a site isn't seen as "suspect."

Could something (evidence of major unmistakeable user interest in a new site) override that waiting period? I suspect so, but presumably there has always been a waiting period of at least 60-90 days before being decently indexed. Just because it seems longer now doesn't mean that's the sandbox length or anything like that. I don't know if "sandbox length" even makes any sense. Perhaps site history with organic is similar to how account history is measured in the AdWords algo now: undisclosed, but likely on a continuum. There is no set waiting period - you simply build a history (Ian said it -- reliability/confidence increases with more data).

Another thing I notice is that a few rather spammy (link farm driven) listings still do well on the phrase "home furnishings toronto." (I won't say which ones are the cheesiest as you're not supposed to out people on the forums.) No doubt, those listings will eventually be gone. But it looks like the reason they'll be dropping may have everything to do with spam reports by users and competitors. The top 20 listings in popular categories eventually come to the attention of the engines, and eventually the ones getting too much traffic by virtue of deliberate interlinking will get penalized based on judgment, not algorithms. Honestly, all you really have to do in a lot of cases is to look at the inlinks, then glance at the sites involved. So - human filtering is happening. There is so much going on behind the scenes, it's not funny.

If a site that got registered five years ago gets penalized for link farming, that could mean that over time, the average age of top-ranking sites on valuable phrases gets EVEN OLDER... at least until there's a user backlash and users decide they prefer freshness & diversity at the cost of at least some spam.

So are all new sites seen as suspect in a spam-ridden world? Yes, it seems, when it comes to ranking on popular, lucrative, high volume phrases, as long as users and competitors alike scream about spam.

Is this new? No, I don't think so.

Is it good for searchers? Not really. It would be better if SE's could understand what is really relevant to a user instead of relying so much on "fail-safe" methods like giving so much credence to domain age and stability/age/reliability/relevance of linkage. Someday they'll be better at personalization etc.

Can you explain the sandbox rules, or how long it might take, to rank well on a popular term? I doubt it!

Anyway, to sum up, isn't the idea of a sandbox on core popular terms built right into what the current generations of SE's actually measure, which is reputation, etc.? Is that not business as usual?

I suppose the problem with trying to suss out just exactly what the so-called sandbox is, is that its rules & parameters can change without notice, and exceptions can prove & disprove "rules" all over the place.

Last edited by andrewgoodman : 01-03-2006 at 08:08 PM.
andrewgoodman is offline   Reply With Quote
Old 01-03-2006   #82
Jill Whalen
SEO Consulting
 
Join Date: Jul 2004
Posts: 650
Jill Whalen is just really nice
Quote:
how any brand new site is going to rank well for the phrase "home furnishings"
They're not.

But there is some middle ground between a phrase like the one you previously mentioned which basically gets no searches, and a phrase like "home furnishings."

It's those middle ground phrases that the aging delay eats for breakfast.

When you search for phrases that appear in the title tag of your page, where those words merely happen to be somewhere on other people's pages, yet your relevant page shows after every single one of them, you will know what the aging delay feels like. I can't stress enough how completely different it is from the usual "takes a while to rank" phenomenon that we all know and love.

Quote:
Can you explain the sandbox rules, or how long it might take, to rank well on a popular term? I doubt it!
I haven't dealt with it enough to say for sure, but from the sites I've seen, there are no rules. It's simply wait 9 months (approx.) and bang, you're out, regardless of anything else you do.

Last edited by Jill Whalen : 01-03-2006 at 09:24 PM.
Jill Whalen is offline   Reply With Quote
Old 01-03-2006   #83
dazzlindonna
Internet Entrepreneur
 
Join Date: Jan 2005
Location: Franklinton, LA, USA
Posts: 91
dazzlindonna is a glorious beacon of light
Quote:
On competitive phrases, hasn't it long been the case that SE's won't just rank new sites out of the blue on popular terms?
Not necessarily. I know that I and many other SEOs could fairly easily rank for competitive phrases within 30 days (back in 2003 and early 2004) with a brand new site - of course with lots of links being thrown at the site. The average site created by John Doe... well, yeah, it probably took him a while.

Quote:
How can anyone point with certainty to the "day" the "sandbox" was "invented"?
Not the "day", but certainly a narrow range of time. Many SEOs began noticing the change in March/April of 2004, myself included. Between Jan. and March, I launched several new sites...30 days to the top. March/April and onwards...no SERPs love anymore. At that time, we all started taking a long hard look at what was going on, and it was obvious that something major had changed.
dazzlindonna is offline   Reply With Quote
Old 01-03-2006   #84
Marcia
 
Marcia's Avatar
 
Join Date: Jun 2004
Location: Los Angeles, CA
Posts: 5,476
Marcia has a reputation beyond repute
Quote:
I have, however, seen long-established sites start ranking poorly for terms which they formerly held great rankings. This has always been in conjunction with a redesign effort or site architectural changes.
That's because canonicals and site structures are being looked at, which *may* enter into the collection of factors that make up the delay, but a site that drops out of rankings because of a structural change has nothing to do with being sandboxed - it's something else.

Quote:
Can you explain the sandbox rules, or how long it might take, to rank well on a popular term? I doubt it!
There can be no rules, because it isn't a "thing" - it's a set of qualities and values looked for, combined with a set of filters. It isn't just one thing - it's a combination of things all put together.

Some sites will never come out of it because of continuing to run into certain filters or not accruing enough of what it takes to rank for given search terms. At some point, that's no longer the "sandbox delay" for those sites; they just don't qualify to rank, or have something wrong that's preventing ranking.
Marcia is offline   Reply With Quote
Old 01-04-2006   #85
mcanerin
 
mcanerin's Avatar
 
Join Date: Jun 2004
Location: Calgary, Alberta, Canada
Posts: 1,564
mcanerin has a reputation beyond repute
Quote:
There can be no rules, because it isn't a "thing" - it's a set of qualities and values looked for, combined with a set of filters. It isn't just one thing - it's a combination of things all put together.
Exactly!

It's like asking what causes death. The list of possibilities and causes is almost endless, and therefore the question can't really be answered as asked - but that doesn't mean there's no such thing as death.

It just means that it's the WRONG question.

Any question involving the word "sandbox" is probably badly worded and therefore unanswerable, IMO.

That doesn't mean that the effect isn't real - it means that by using the term "sandbox" you have limited yourself already in the type of answer - it's an inherently biased question because it assumes that the definition of "sandbox" is fixed or even can be limited to a single set of circumstances.

The final effect on the other hand, like death, is pretty unmistakable, if you know what to look for.

Ian
__________________
International SEO
mcanerin is offline   Reply With Quote
Old 01-04-2006   #86
I, Brian
Whitehat on...Whitehat off...Whitehat on...Whitehat off...
 
Join Date: Jun 2004
Location: Scotland
Posts: 940
I, Brian is a glorious beacon of light
Quote:
Originally Posted by andrewgoodman
But while we're on this topic, could you explain to me how any brand new site is going to rank well for the phrase "home furnishings" -- "sandbox" or no "sandbox"?
It's not just "brand new" but "newer" - which can mean a domain even a few years old. If you link build for a pre-2000 and a post-2000 domain, the difference in ranking ability is significant, even when the link records for both sites are pretty much the same.

Quote:
Originally Posted by andrewgoodman
I suppose the problem with trying to suss out just exactly what the so-called sandbox is, its rules & parameters can change without notice, and exceptions can prove & disprove "rules" all over the place.
Certainly it's a concept that has become more developed at Google. For a brief history of how the term entered the SEO language, see this:
http://www.platinax.co.uk/blogs/bria...early-history/

Nacho posted a good list of links discussing the issue at SEW a while back:
http://forums.searchenginewatch.com/...ead.php?t=1917
I, Brian is offline   Reply With Quote
Old 01-04-2006   #87
claus
It is not necessary to change. Survival is not mandatory.
 
Join Date: Dec 2004
Location: Copenhagen, Denmark
Posts: 62
claus will become famous soon enough
Quote:
Originally Posted by Marcia
>>First you do the reasoning, then you build the algo.

claus, that may be the initial sequence, but then the algo itself has no reasoning capability, it's just a set of programs.
Agree. Well, I don't even need to agree, it's as obvious as daylight

Quote:
Originally Posted by Marcia
Stanford's criteria for credibility is fine and good, but it basically takes most all human judgment and reasoning and there's precious little in there that an algo can determine. Last updated, sure - but can they detect if it's any indication of value? nope.
That's not really an unusual situation. It's a fairly "classic" kind of problem in Marketing Research (or in any other type of research, for that matter, I believe). You want to measure something, but the thing you really want to measure just doesn't lend itself to measuring.

So, you have to measure something else instead. Things that have some kind of connection to the items that you really want to track. Approximations.

I did specifically say that that article was *not* the easy fix. I did hope that it would be an inspiration, though.

Quote:
Originally Posted by Marcia
There is virtually nothing mentioned in any search patents or white papers that has anything to do with companies and their credibility or quality or their marketing expertise, as such, not in the sense that human judgment can determine by observation and reasoning.
Again, I don't even need to agree, as it's obvious. The keyword from above is "approximation". You can't make a computer perform human judgment, so you make it do something it can instead - only, you try to make sure that what the computer does is somehow related to what you would really want it to do (if only it could).

Quote:
Originally Posted by Marcia
That's where the link-based algos came in, PageRank and HITS - trying to use links as measures of importance and reputation.
Exactly. That's hitting the nail right on the head. It's "measures of" and not the real thing itself.

Quote:
Originally Posted by Marcia
And those can be and *are* tampered with, or in some cases, the deck could be stacked in the first place, as in the case of long-established Fortune 500's which get around filters that result in the "sandbox effect".
You know, lists... *sigh* I often build and maintain various types of lists. Quite elaborate lists, usually.

From an engineer's perspective (I know a few) they're a PITA. You forget things that should have been on them, you include too much, you put something on them and then things change and they shouldn't be there anymore. E.g. an F500 list would only be the real list once a year, when it's published. And then you've got all the Enrons of this world, too - as well as companies that move from serving one market to serving another, splitting up, reorganizing, merging, changing names, and buying/selling. And then there are errors.

Name any kind of list - as long as it gets big enough it's not a list anymore, it's a jungle.

But of course, in the brick-and-mortar world there are companies specializing in making and maintaining lists of real businesses, as well as those that exist only on paper. So, I guess you could outsource that. I'm not saying that I agree with you, nor that I disagree either.

I just find "stacked decks" as such to be a plainly stupid thing to do, as the flexibility that Google needs would require a lot of manpower, and their usual power preference is electrical. Then again, perhaps they're being a bit stupid - it would not be the first time. Even with a high number of PhDs on the payroll, they sometimes try out things that they haven't really got enough prior experience or knowledge about, and sometimes, to an outsider, some of those things look quite stupid.

Anyway, to cut them some slack:

What I find more likely is the thought that some of these "F500" sites have some properties that the other ones just don't have. IOW, everyone is out in the exact same rain; the F500s have just got a bigger umbrella. Or, everyone is on the exact same road; the F500s have just got more horsepower.


Quote:
Originally Posted by Marcia
But that isn't by reasoning or judgment, it's by metrics that are mathematically measurable.

There's a *group* of filters operating
Okay, let me be a bit provocative...

Q: What's new here?

A: Nothing, really. It's the same as it ever was. Google just got smarter that's all.

Quote:
Originally Posted by Marcia
and people who don't DO SEO don't comprehend it, and those who work with Fortune 500's and the like won't experience it because they're working with a deck that's stacked in their favor to begin with, that's got the ability to by-pass and overcome the effects of some of the key filters to begin with.
I could repeat the Q and A here. However that's not very productive. There are always differences in perspective. However, I don't deny the existence of those issues that you all refer to as "sandbox" issues. Not at all. But I *still* don't speak of a sandbox if I can help it. I maintain that the term is wrong and misleading.

Even "rain" does not equal "wet" - "no umbrella" plus "rain" is a bit closer.

I think it's more appropriate and fruitful to think in terms of "Survival of the fittest" than in term of sandboxes. <tongue-in-cheek> It is "organic" SERPs after all (sorry about the pun, couldn't help it ) </tongue-in-cheek>

And those that are fittest in some contexts will be the established sites, while in other contexts it will be new-ish sites. (And of course, by "X sites" I mean "pages on X sites").

So, to turn the attention to something productive again, think about your typical plant. How would a new plant of any particular type get a slice of the precious sunlight? Those "signals of quality" I mention are what make the sun shine on a smaller or larger part of your plant instead of the other plants. And of course, the old and big plants tend to overshadow the new ones. No wonder.

So, let's say that you have a sun that favours the plants that have the highest likelihood of becoming nutritional ingredients in a salad. That might turn out to be the plants that already get some sunlight. Yes, of course it's skewed. No, of course all animals on the farm are not equal.

Hope you get my point now

</rant>

Last edited by claus : 01-04-2006 at 11:56 AM.
claus is offline   Reply With Quote
Old 01-04-2006   #88
Marcia
 
Marcia's Avatar
 
Join Date: Jun 2004
Location: Los Angeles, CA
Posts: 5,476
Marcia has a reputation beyond repute
claus, did you read this whole thread? Scroll back up and read msg #73.
Marcia is offline   Reply With Quote
Old 01-04-2006   #89
claus
It is not necessary to change. Survival is not mandatory.
 
Join Date: Dec 2004
Location: Copenhagen, Denmark
Posts: 62
claus will become famous soon enough
Yes I read it... I don't understand, perhaps I missed something? I just read that post a second time, still don't get it, I'm sorry - what did I miss? Was your post partially a response to #73 and not mine, is that it? If so, I'm sorry I didn't get it.
claus is offline   Reply With Quote
Old 01-09-2006   #90
andrewgoodman
 
andrewgoodman's Avatar
 
Join Date: Jun 2004
Location: Toronto
Posts: 637
andrewgoodman is a name known to allandrewgoodman is a name known to allandrewgoodman is a name known to allandrewgoodman is a name known to allandrewgoodman is a name known to allandrewgoodman is a name known to all
Pesky agglomeration of granules revisited

I want to thank Jill in particular for explaining the "sandbox-like effect" to me so patiently. I don't think it hurts to ask "stupid questions," though, because these forums tend to get rather self-referential, and before you know it some post someone made in October is required reading even though it was only hints and guesses.

So, although I can certainly see the existence of a sandbox-like effect, I do also hope I offered a bit of food for thought.

claus's statement:

Q: What's new here?

A: Nothing, really. It's the same as it ever was. Google just got smarter that's all.

...was probably closest to what I was trying to get at.

Considering all the junk that gets thrown so aggressively at the engines, it's a good thing for users that these new pages do get "sandboxed".

What has to happen in the future, though, is that Google has to get *even* smarter. The sandboxy treatment of newer pages is a pretty blunt instrument. It raises real questions about the moat-like divide between older and newer sites/pages. Can you keep extending your lead on newer sites, all else being equal, if you have "tenure" and "history"? If you have to wait up to a year to gain decent traction on SE's, then you might be in pretty shaky shape by the time you "come out." And that in turn will make it hard to crack the top rankings, etc.

But if Google tries to get *even* smarter to *validate* sites in some way, then what form does that take? Clearly, they are thinking about that on several fronts. They have a verification system for the Local listings product; they have editors for AdWords and News; they have SiteMaps; etc.

So in the future Google seems poised to consider forms of paid inclusion or at least "trusted inclusion"; or to introduce further editorial intervention (or more weight on editorial gatekeepers) that they don't admit is editorial at all.

I think we do need to be asking more specific questions here, trying to isolate *what* needs to be older to help you expedite the exit. Domain? Pages known to Google? Links? Business registration date? Other? A combination of things? It may well be that the sandbox-like treatment of new pages & sites is in itself, in a kind of infancy. And will soon become more sophisticated, so the "effect" is felt very differently by different new sites & businesses.

Last edited by andrewgoodman : 01-09-2006 at 07:46 PM. Reason: spelling error
andrewgoodman is offline   Reply With Quote
Old 01-09-2006   #91
PhilC
Member
 
Join Date: Oct 2004
Location: UK
Posts: 1,657
PhilC has much to be proud of
This is a fantastic thread!

It seems to me that there isn't any major difference of opinions as to whether or not the sandbox effect exists. Andrew Goodman suggests that it's just a development of the age-old delay in getting rankings for decent searchterms, but he does seem to accept that there is a change to the age-old delay. The other side says the same thing, except that it's not merely a development of the age-old delay, but an intentional thing by Google.

Certainly there was a specific time period when the sandbox effect was realised, as dazzlindonna pointed out, so either a new sandbox effect started then, or a development of the age-old delay came into play then. Either way both Andrew's view, and the other view, amount to the same thing - there is a sandbox-like effect, which can be simply called "the sandbox". The only real difference is how it came about, but that doesn't matter.

I rarely create new sites, and I've no personal experience of the sandbox, but I'd like to suggest something that occurred to me whilst reading this thread...

It's almost unanimous that long tail terms aren't affected by the sandbox, and it's the more popular terms that are affected. The thinking seems to be that it's the searchterms that make the difference. But how about this for an alternative possibility:-

The reason for the difference between searchterms is not that the more popular ones are on some list; it's the site's/page's confidence score that determines it all. So when Google can get a large enough results set from pages that have a good confidence score, they show them. But when they can't get a large enough results set, they include pages that don't have a high confidence score - just like they do with pages in the Supplemental index. Since popular searchterms are targeted by many sites, there is no problem in getting a large enough results set without needing to include low confidence pages.
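That mechanism is easy to sketch. Here's a purely hypothetical toy model of the idea - the function name, confidence threshold, and page tuples are all invented for illustration, and nothing here claims to be Google's actual algorithm:

```python
def build_results(matches, target_size, min_confidence=0.5):
    """matches: list of (page, relevance, confidence) tuples for one query."""
    trusted = [m for m in matches if m[2] >= min_confidence]
    # Popular term: enough high-confidence pages to fill the results set,
    # so low-confidence (e.g. new, unproven) pages never make the cut.
    # Thin term: the trusted pool is too small, so everything is admitted.
    pool = trusted if len(trusted) >= target_size else matches
    # Rank the chosen pool by relevance, best first.
    return [m[0] for m in sorted(pool, key=lambda m: m[1], reverse=True)[:target_size]]

pages = [("old-site",   0.80, 0.9),
         ("older-site", 0.70, 0.8),
         ("new-site",   0.95, 0.1)]   # most relevant page, but low confidence

print(build_results(pages, 2))  # "popular" query: new-site filtered out
print(build_results(pages, 3))  # "long-tail" query: new-site ranks first
```

In this toy model the very same low-confidence page is invisible on the competitive query and tops the thin one - the asymmetry between popular and long-tail phrases the thread describes - without any list of special searchterms.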

I've never liked the idea of a search engine having a list of searchterms for special treatment. It's come up a number of times in the past, and it just seems unGoogle-like to me. I seriously like the "confidence" idea that's been put forward as what the sandbox is about, and, for me, the size of the results set is a much more palatable idea than an arbitrary list of popular searchterms.

Last edited by PhilC : 01-09-2006 at 08:32 PM.
PhilC is offline   Reply With Quote
Old 01-09-2006   #92
Marcia
 
Marcia's Avatar
 
Join Date: Jun 2004
Location: Los Angeles, CA
Posts: 5,476
Marcia has a reputation beyond repute
We can also ask why some sites go up and are never sandboxed at all - which some aren't. If it were strictly an age thing that wouldn't happen; it isn't that simple.

It's a collection of algo requirements and filters that result in the "sandbox effect" for most new sites, but obviously, some sites don't get sandboxed, so those must pass muster in spite of their age. So there have to be factors or indicators that over-ride the filters and the age factor and allow some sites to rank.
Marcia is offline   Reply With Quote
Old 01-09-2006   #93
PhilC
Member
 
Join Date: Oct 2004
Location: UK
Posts: 1,657
PhilC has much to be proud of
Quote:
Originally Posted by Marcia
We can also ask why some sites go up and are never sandboxed at all - which some aren't. If it were strictly an age thing that wouldn't happen; it isn't that simple.
Is that a response to my post, Marcia? If it is, I didn't suggest anything about how the confidence score is arrived at. I only suggested that it might not be the searchterms themselves that decide whether or not a low confidence page is listed, but the size of the results set that Google can compile for the query.

added:
In Google's original engine, they compiled a results set of about 40,000 pages, which they then ranked according to certain criteria. I'm suggesting that, when they can get a suitably sized results set for a query without needing to include low confidence pages, as they can for popular searchterms, then they don't include low confidence pages. But when they can't get a suitably sized results set, they do include low confidence pages. In that way, it isn't the searchterms themselves that decide whether or not a low confidence page is ranked, but the size of the results set.
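As a thought experiment, that selection rule can be sketched in a few lines of Python. Everything here is illustrative - the function names, the per-page scores, and the 0.5 cutoff are my own assumptions, not anything Google has published; only the fixed results-set size echoes the ~40,000 figure mentioned in the post.

```python
# Toy sketch: compile a fixed-size results set, falling back to
# low-confidence pages only when high-confidence matches can't fill it.

RESULTS_SET_SIZE = 40_000
CONFIDENCE_THRESHOLD = 0.5  # assumed cutoff between "high" and "low" confidence

def compile_results_set(matching_pages, set_size=RESULTS_SET_SIZE):
    """matching_pages: list of (page, confidence) tuples for the query."""
    high = [p for p, c in matching_pages if c >= CONFIDENCE_THRESHOLD]
    if len(high) >= set_size:
        # Popular searchterm: plenty of high-confidence pages, so
        # low-confidence (sandboxed) pages never make the set.
        return high[:set_size]
    # Unpopular searchterm: pad the set with low-confidence pages.
    low = [p for p, c in matching_pages if c < CONFIDENCE_THRESHOLD]
    return high + low[:set_size - len(high)]
```

On this model the searchterm itself is never consulted - a "sandboxed" page simply never surfaces on queries where the high-confidence pool alone fills the set.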

Last edited by PhilC : 01-09-2006 at 08:51 PM.
PhilC is offline   Reply With Quote
Old 01-09-2006   #94
Marcia
 
 
Join Date: Jun 2004
Location: Los Angeles, CA
Posts: 5,476
Marcia has a reputation beyond repute
Quote:
Is that a response to my post, Marcia?
No Phil, it wasn't (I posted before you did, I just hadn't hit enter yet), but the confidence score is a good point - as long as it isn't arbitrarily 100% age-dependent.

So how come some sites never experience the "effect" and don't ever get hit with it, while others have found ways to get out from under it?

I'm not convinced it's entirely the number of results available to return for a search. I've got a site that never got sandboxed: it started out ranking for search terms with anywhere from 200K to close to 500K pages returned, and it's been steady like that all along. After 5-6 months, it's now ranking for a search term with close to 2 million pages returned. BTW, it's not a commercial site - there may be plenty of pages returned, and those initial search terms get looked for a lot, but there's no commercial value.

Last edited by Marcia : 01-09-2006 at 09:30 PM.
Marcia is offline   Reply With Quote
Old 01-09-2006   #95
PhilC
Member
 
Join Date: Oct 2004
Location: UK
Posts: 1,657
PhilC has much to be proud of
If there's a confidence score, we don't know how they come by it - we don't know what a site needs to have (apart from time) to get a good enough score to be ranked properly, so we don't know if your site had what it takes.

200k results isn't a lot, and it's possible that the searchterms weren't popular enough for many of those 200k pages to rank well - some of them may only be there because they satisfied the criteria for just one of the searchterm's words. Perhaps the compiling algo first gathers what pages it can that contain all of the words, including low confidence pages if necessary, and then pads the set with pages that contain fewer of the words.

I'm suggesting that the searchterms weren't popular enough to fill a 40k results set without including low confidence pages. Don't forget that the results set isn't the 200k or n million results that get reported - they compile a small results set regardless of how many actual results there are.
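To make that fill-the-set idea concrete, here's a toy sketch of the two-pass compilation: take every page matching all of the query words (low confidence or not), then top the set up with partial matches, best matches first. The names and the ordering are purely my assumptions for illustration, not Google's algorithm.

```python
# Toy sketch: fill a fixed-size results set with all-words matches first,
# then pad with pages that match only some of the query words.

def compile_with_partial_matches(query_words, index, set_size):
    """index: dict mapping page -> set of words it contains."""
    query = set(query_words)
    full = [p for p, words in index.items() if query <= words]
    if len(full) >= set_size:
        return full[:set_size]
    # Pad with pages matching at least one word, most matched words first.
    partial = sorted(
        (p for p, words in index.items() if p not in full and query & words),
        key=lambda p: len(query & index[p]),
        reverse=True,
    )
    return full + partial[:set_size - len(full)]
```

On a thin searchterm the padding pages dominate the set, which would explain why low-confidence pages can still rank there.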
PhilC is offline   Reply With Quote
Old 01-10-2006   #96
andrewgoodman
 
 
Join Date: Jun 2004
Location: Toronto
Posts: 637
andrewgoodman is a name known to all
There is also personalization to factor into the mix -- and, I naively hope, coming soon... better personalization.

Experts at the engines tell us that users wouldn't want to set a dial to make "page freshness" a preference, but one way or another, SEs are going to privilege fresh pages on your behalf in different ways. (Or punish them.)

True, but I still love playing with that feature on MSN Search. (Or at least I do in theory. The feature is too one-dimensional to be effective.)

Of course freshness is something you measure on established sites. Fresh pages on fresh sites... another matter worthy of sandbox-like discourse.

In any case, if you were to ask me, I'd say having 40 results that don't include more than a handful of new pages is a potential negative, but then again, I suppose that's query-dependent. On a stable term you get "stable pages"; on a "hot" term, perhaps freshness matters. Which is too bad, because it means my blog post on Scarlett Johansson, which has ranked in the top ten for over a month now, is soon going to cool off.

You can only assume there is so much for the SEs to consider in matching pages with users that it would be wrong to get too down in the dumps about a permanent "sandbox" affecting all new ventures.

P.S. I don't like the idea of buying an old domain and bolting your new site onto it, because domain age will probably get downplayed as a criterion if too many sites start doing that. Besides, if you've chosen your company name carefully, why would you go out and buy up some other name? On the other hand, acquiring established sites that others are undervaluing could be a smart move.

Last edited by andrewgoodman : 01-10-2006 at 02:33 AM.
andrewgoodman is offline   Reply With Quote
Old 01-10-2006   #97
Jill Whalen
SEO Consulting
 
Join Date: Jul 2004
Posts: 650
Jill Whalen is just really nice
Glad to see you finally drank the koolaid, Andrew. Now can you pour a glass for Mike?
Jill Whalen is offline   Reply With Quote
Old 01-10-2006   #98
Robert_Charlton
Member
 
Join Date: Jun 2004
Location: Oakland, CA
Posts: 743
Robert_Charlton has much to be proud of
Quote:
Originally Posted by Jill Whalen
Glad to see you finally drank the koolaid, Andrew. Now can you pour a glass for Mike?
I think Mike's drinking champagne now and doesn't drink Kool Aid anymore.
Robert_Charlton is offline   Reply With Quote
Old 01-10-2006   #99
Jill Whalen
SEO Consulting
 
Join Date: Jul 2004
Posts: 650
Jill Whalen is just really nice
Maybe if we tell him it's Merlot?
Jill Whalen is offline   Reply With Quote
Old 01-10-2006   #100
claus
It is not necessary to change. Survival is not mandatory.
 
Join Date: Dec 2004
Location: Copenhagen, Denmark
Posts: 62
claus will become famous soon enough
nevermind...

You're all entitled to your opinions, and I'm not really getting any kicks out of being a rebel these days, but I stick to my own opinion nevertheless...

All I'm saying is that if you think you can rank for [Texas Holdem] in a month just by filling a few thousand pages with related words and throwing up a few hundred (or thousand) links, then your strategy might not be as long term as you would want it to be. It just might turn out to be a thing of the past, if it isn't already.

Thinking a bit ahead: if you're simply doing what anybody else with sufficient resources could be (or is) doing, and you're not among the first doing it, then why should you rank at all? Do you know how many people climb Mount Everest each year? Would you like to carefully examine a list of the names?

Whatever... I don't think I've got more to add at this moment.
claus is offline   Reply With Quote