Search Engine Watch
SEO News

Old 08-01-2005   #1
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
Indexing Summit 2: Give Your Feedback On Handling Redirects

Indexing Summit is returning for our next SES show. One of the two big topics will be on how search engines handle redirection. I've posted a Revisiting Hijacking & Redirects: Moving To A Solution article on the SEW Blog (URL to come shortly) that explains the situation with W3C rules, what Yahoo does to break some of those for good reason and what Google does that causes some problems. The goal is to get to an overall standard for all the major search engines to use. Your feedback for the summit is really helpful. How would you like to see things work? What unusual situations might come up that require special handling?

Last edited by dannysullivan : 08-01-2005 at 01:08 PM.
Old 08-01-2005   #2
claus
It is not necessary to change. Survival is not mandatory.
 
Join Date: Dec 2004
Location: Copenhagen, Denmark
Posts: 62
Yahoo complies with the rules all right

Just a minor, albeit important, correction:

Quote:
Originally Posted by dannysullivan
the situation with W3C rules, what Yahoo does to break some of those for good reason
The rules are stated in RFC 2616. What these rules say is:

Quote:
Since the redirection might be altered on occasion, the client SHOULD continue to use the Request-URI for future requests
The important part here is the all-caps word "SHOULD". It is not in all-caps because it is extremely important to do exactly that (as one might think), but instead because it is a special keyword with a distinct meaning. That distinct meaning is described in another document, which is very important to understanding RFCs in general: RFC 2119 - Key words for use in RFCs to Indicate Requirement Levels

About the special keyword "SHOULD", this document says:

Quote:
SHOULD
This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.
So, it's nowhere near as strict as some people believe. It does not mean "must", "have to", "required" or anything like that. Instead, it's a soft recommendation: "If you've got no good reason to do anything else, please do it like this."

...which brings me back on topic:

By doing something other than what is stated in RFC 2616 in certain situations, Yahoo is actually acting in full compliance with both RFCs. They don't break the rules; they don't even bend them. In fact, they act 100% as they're supposed to act according to these rules.
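To make the 301/302 distinction concrete, here is a minimal Python sketch (not any search engine's actual code, and the example URLs are invented) of the client behaviour the RFC describes: a 301 target may be remembered for future requests, while after a 302 the client keeps using the original Request-URI:

```python
PERMANENT = 301
TEMPORARY = 302

class RedirectAwareClient:
    """Tracks which URI a client should request next time, per RFC 2616."""

    def __init__(self):
        # maps an original URI to the URI a future request should use
        self.request_uri = {}

    def handle_redirect(self, uri, status, location):
        if status == PERMANENT:
            # 301: the resource has moved for good; future requests may
            # go straight to the new location
            self.request_uri[uri] = location
        elif status == TEMPORARY:
            # 302: "the redirection might be altered on occasion", so the
            # client SHOULD continue to use the original Request-URI
            self.request_uri[uri] = uri
        # either way, *this* request is served from the Location target
        return location

    def next_request_uri(self, uri):
        return self.request_uri.get(uri, uri)

client = RedirectAwareClient()
client.handle_redirect("http://example.com/old", 301, "http://example.com/new")
client.handle_redirect("http://example.com/busy", 302, "http://cache.example.net/busy")
```

The point of the sketch is only where future requests go: the 302'd URI stays the address of record, which is exactly the property the hijacking problem abuses.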

------------------------------------------

BTW:
Sorry I can't join you at the SES San Jose, Danny. Would like to, especially with this subject on the agenda, but it's just not possible.
Old 08-02-2005   #3
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
Good point, Claus -- I'll go back and clarify; I recall you made this point well in your article.

Recommendations is the better word -- but it's funny, because in talking with Yahoo, they were clearly somewhat uncomfortable about going against the recommendations, even though the recommendations themselves recommend that on the odd occasion. Just wanted to use recommendations three times in that! They do feel they are doing the right thing, but there's that sense of somehow going against a standard. That's why I hope they'll all come up with a standard, but one that makes sense for search engines.

Sorry you won't be there but would love to make sure any suggestions you have get passed along. I'll go back through your article for them. Chiefly, it seemed to be adopting the Yahoo approach.
Old 08-02-2005   #4
vladog
Newbie
 
Join Date: Jul 2005
Posts: 2
Thumbs up Possible solution: acknowledge from the target domain

Hi everybody,

Wouldn't it be a solution for the target domain to acknowledge the 302 redirect in some way, and for redirects to be treated as permanent by the SEs unless it does? If the initiator of the redirect has control over the target domain, it can acknowledge that the redirect is accepted.

This could be done in one of the following ways:
- A list of accepted redirecting domains could be included in robots.txt, and the treatment of 302 redirects could be defined in some future version of the Robots Exclusion Standard, which might become a "Robots Instruction Standard"
- A meta tag could be used to acknowledge the 302 redirects.

The latter looks like the best shot; what do you think?
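As an illustration only, here is a rough Python sketch of the meta-tag idea. The tag name "accept-redirects-from" is invented for this example; no such tag exists in any standard:

```python
from html.parser import HTMLParser

class AcceptRedirectsParser(HTMLParser):
    """Collects domains listed in a hypothetical acknowledgement meta tag."""

    def __init__(self):
        super().__init__()
        self.accepted = set()

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name") == "accept-redirects-from":
            # the content attribute holds a space-separated domain list
            self.accepted.update((a.get("content") or "").split())

def accepted_redirect_sources(html):
    parser = AcceptRedirectsParser()
    parser.feed(html)
    return parser.accepted

page = ('<html><head><meta name="accept-redirects-from" '
        'content="www.oldsite.com www.other.com"></head></html>')
# a 302 from www.oldsite.com would then be honoured as temporary;
# a 302 from any unlisted domain would be treated as permanent
```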

Hope this may help somehow.

Regards,
Vladimir Granitsky
Old 08-03-2005   #5
Alan Perkins
Member
 
Join Date: Jun 2004
Location: UK
Posts: 155
I first saw this problem years ago - in 2001 or 2002 I think. I looked into a problem someone was having on iSearch and this Google 302 "pagejacking" turned out to be it. I reported it to Google, with a proposed solution (which was virtually identical to Yahoo's current solution) and, for a while, they fixed it. Then they broke it again. Over the intervening years they have fixed it and broken it a few times. I don't know why... they had it working once!

I think the problem is that some engineer has become too tied up in the meaning of "Temporary Redirect", and is following the W3C guidelines a little too closely for their own good.

The simplest solution IMO is as follows:
  1. SEs should treat 301s, 302s and 307s identically. Lots of indexed URLs change from one index to the next. So just rely on a regular refresh cycle to clear up problems with so-called "temporary redirects". Most webmasters (those who don't know about SEO) use temporary and permanent redirects interchangeably anyway.
  2. SEs should never index a URL that is a redirect. They should only ever index a URL that contains real content. Then the duplicate content problem (lots of URLs 302 redirecting to a single URL, content indexed multiple times) cannot apply.
  3. The exception to point 2 is when the redirect is from a root URL to a URL on the same domain. Then the content should be indexed under the root URL, and not under the redirect URL.
  4. To evaluate link popularity, a link from A->redirect*->B should be treated identically to a link from A->B, where redirect* is any number of redirects.
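Point 4 can be sketched roughly in Python as follows; the redirect map is made-up illustrative data, not anything a real engine publishes:

```python
# Collapse any chain of redirects so that a link A -> redirect* -> B
# counts as a direct link A -> B for link-popularity purposes.

redirects = {
    "http://tracker.example/out?id=7": "http://short.example/x",
    "http://short.example/x": "http://www.siteB.com/",
}

def resolve(url, redirect_map, max_hops=10):
    """Follow redirects until reaching a URL that serves real content."""
    seen = set()
    while url in redirect_map and url not in seen and len(seen) < max_hops:
        seen.add(url)  # guards against redirect loops
        url = redirect_map[url]
    return url

def count_link(source, target, redirect_map):
    # credit the link to the final content URL, per point 4
    return (source, resolve(target, redirect_map))
```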
Old 08-03-2005   #6
Rob
Canuck SEM
 
Join Date: Jun 2004
Location: Kelowna, BC
Posts: 234
I disagree with the above points about only handling redirects if they are part of the same domain.

There are many legitimate reasons to redirect from domain to domain - such as branding issues. For example, I work with clients who spend years trying to buy their branded domain from a squatter, so they register another one in the meantime. So when it comes time to move to the new domain they've always wanted, why should they be penalized for performing legitimate 301s from the old domain to the branded one?

Personally I like the idea (however cumbersome) of acknowledging the redirect via a meta tag. That is something that could be explored I think.
Old 08-03-2005   #7
Alan Perkins
Member
 
Join Date: Jun 2004
Location: UK
Posts: 155
Quote:
Originally Posted by Rob
I disagree with the above points - about only handling redirects, if they are part of the same domain.
If you were referring to what I posted, then I think you have misunderstood.

All redirects should be handled. It's just a matter of which URL a piece of content is indexed under.

I'm suggesting that a piece of content should always be indexed under the URL that directly references that content, without a redirect.

I'm suggesting that the only exception to this should be when a "home page" on the same domain redirects to that piece of content; then the content should be indexed under the home page URL rather than the URL which actually addresses it.
Quote:
Originally Posted by Rob
So when it comes time to move to the new domain that they've always wanted, why should they be penalized for performing legitimate 301's from the old domain to the branded one?
They shouldn't, which is exactly what I suggested.
Old 08-03-2005   #8
Rob
Canuck SEM
 
Join Date: Jun 2004
Location: Kelowna, BC
Posts: 234
I guess my point is: if you 301 a root URL to another URL for legitimate reasons, the way I read your post, Alan, was that the engine should not give credit for the redirect and should instead treat it as a link?

Or am I still misunderstanding something?

Don't get me wrong - I like the idea of tackling this whole hijacking issue but my concern is for those legitimate sites which seem to always get swept out with the trash when the engines perform a major update.
Old 08-03-2005   #9
Alan Perkins
Member
 
Join Date: Jun 2004
Location: UK
Posts: 155
Quote:
Originally Posted by Rob
I guess my point is if you 301 a root url to another url for legitimate reasons, the way I read your post Alan was that the engine should not give credit for the redirect and instead treat it as a link?
I'm not sure I understand the problem. What's the difference, as you see it?

A redirect from "http://www.siteA.com/" to "http://www.siteB.com/" should mean that the content of "http://www.siteB.com/" is indexed under the URL "http://www.siteB.com/". All links to "http://www.siteA.com/" should be treated as links to "http://www.siteB.com/". I think this is what you want.

A redirect from "http://www.siteA.com/" to "http://www.siteA.com/subpage.htm" should mean that the content of "http://www.siteA.com/subpage.htm" is indexed under the URL "http://www.siteA.com/".
Old 08-03-2005   #10
Rob
Canuck SEM
 
Join Date: Jun 2004
Location: Kelowna, BC
Posts: 234
Yes that is what I was thinking...

Sometimes, you know, it's just easier talking to get a message across
Old 08-03-2005   #11
claus
It is not necessary to change. Survival is not mandatory.
 
Join Date: Dec 2004
Location: Copenhagen, Denmark
Posts: 62
No "accept redirects from" header, please

At first thought it may seem like a great thing, because then somebody is in control. But that is a false impression, as "somebody" is always in control: the sender of the traffic. And that's the right party to be in control.

I fully understand that the people at Yahoo (or Google, or anyone) have respect for this system - some really bright people thought about all these things (and more) in the early days of the internet, and they knew so much more than we ever will because they had the full picture back then, when the web was relatively small and simple. It's really a complete system of interconnected rules and recommendations, so that if you change the way you do one thing you risk affecting another that you didn't even know about.

An "Accept redirects from" header would be working totally against some of the useful things that you would really use a 302 for (and for which it was intended in the first place). Specifically, with a 301 or a 302 control is always on the sending end, never on the receiving end. The sender is in full control, and it was always intended that way: Only the sender should be able to control where his/her traffic goes.

One such example of a useful thing is a "Coral cache". This is something you would use if one of your pages suddenly became hugely popular, so that your server had trouble keeping up with the demand. What happens then is that a network of some hundred servers takes over and mirrors a copy of that page, so that everybody can access it.

The way this works is that you simply issue a 302 to a special URL that instructs these cache servers to deliver a copy of your page. When the peak load is over you just remove the 302. And here's the point: the receiving servers never know which pages will redirect to them, or when - it's all automatic. If they had to authorize pages linking in, such systems would never work. It's situations like this that 302s were intended for, way back when. Link counting scripts came much later, but these should really use the 302 code as well (just like most of them do), as links do change.
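That pattern can be sketched in a few lines of Python. The load check and the cache host are invented for illustration; this is not Coral's actual URL scheme:

```python
def respond(host, path, current_load, overload_threshold=0.9):
    """Serve the page normally, or 302 to a cache mirror under peak load.

    Returns an (http_status, location) pair; location is None when the
    origin server answers the request itself.
    """
    if current_load > overload_threshold:
        # temporary by design: the 302 disappears when the peak is over,
        # so the original URL must remain the address of record
        return 302, f"http://cache.example.net/{host}{path}"
    return 200, None
```

The key property is that only the sender decides, request by request, whether traffic goes to the mirror - the cache servers never have to approve anything.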

(With all respect, the "Cool URIs don't change" piece never was more than a pipe dream. Cool URIs have change management: 301s and 302s.)

So, the error is not with the sender (the issuer of the 302), and it's not with the receiver either. Neither the redirecting site nor the receiving one is broken -- or doing something wrong -- in any way. The error is only with the client, which is the search engine (except Yahoo, of course). Which also means that this can only be fixed on the client side (the SE) because, frankly speaking, the client is broken.

-- -- --

Yes, I totally agree with the Yahoo approach. The first time I heard about it I was actually against it, as they do treat a 302 like it is a 301 (and there is an important difference), but it works just right in this context. So, it "sounds wrong" but "works right" - I'd prefer that any day to something that sounds right but doesn't work. Also, that's the general idea behind those recommendations: if something else works better, all things considered, you should do that instead.

It might be that they found out only by accident and because they focused on doing something quick, but then they instinctively did exactly the right thing.

Alan Perkins's suggestion #4 also makes much sense to me - treat those status codes exactly like any other link. Don't make any distinction at all. Take only the URL that the content is found on, no more and no less, and rely on the next spidering round to change the link if the page is now somewhere else.

Of course, then they lose some information, but that information could be stored for internal use anyway. It's not everything the SEs have got that we get to see in the SERPs anyway
Old 08-04-2005   #12
Alan Perkins
Member
 
Join Date: Jun 2004
Location: UK
Posts: 155
Quote:
Originally Posted by Rob
Sometimes, you know, its just easier talking to get a message across
Yep.
Quote:
Originally Posted by claus
I totally agree with the Yahoo approach
It's good, but it can cause that duplicate content problem with multiple redirects within one site, and/or the wrong URL being displayed in the search results.

I'd prefer that URLs that returned a redirect were not indexed (unless they were the root URL of a domain or subdomain). Instead, the URLs they redirect to should be indexed.

I do think that, in practical terms, a redirect is a redirect whether it's a 301, a 302, a 307, an HTTP Refresh, a Meta Refresh or a client-side redirect (e.g. JavaScript or Flash). Even a one-frame frameset may be considered a redirect.

Given this wide definition of "redirect", applying the logic I posted earlier gives the following sample results:

Code:
+-----------------------------+----------------------------+----------------------------+
| Redirect Page               | Target Page                | Indexed URL                |
+-----------------------------+----------------------------+----------------------------+
| www.siteA.com               | www.siteB.com              | www.siteB.com              |
| www.siteA.com               | www.siteA.com/subpage.htm  | www.siteA.com              |
| www.siteA.com               | www.siteB.com/subpage.htm  | www.siteB.com/subpage.htm  |
| www.siteA.com/subpage1.htm  | www.siteA.com/subpage2.htm | www.siteA.com/subpage2.htm |
| www.siteA.com/subpage1.htm  | www.siteB.com/subpage2.htm | www.siteB.com/subpage2.htm |
+-----------------------------+----------------------------+----------------------------+
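The rule behind that table can be expressed as a small Python function; this is just a sketch of the stated logic, not any engine's implementation:

```python
from urllib.parse import urlsplit

def indexed_url(redirect_page, target_page):
    """Pick the URL a piece of content is indexed under.

    The target URL wins, except when a root URL redirects within its
    own domain, in which case the content is indexed under the root URL.
    """
    # the examples omit the scheme, so add one before parsing
    r = urlsplit("http://" + redirect_page)
    t = urlsplit("http://" + target_page)
    is_root = r.path in ("", "/")
    same_domain = r.netloc == t.netloc
    return redirect_page if is_root and same_domain else target_page
```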