everett sizemor
08-15-2006, 12:57 PM
I am going to paste in an email conversation I am having with our IT guy in charge of all server-side website issues. The gist of this conversation is: what happens if we rewrite our dynamic URLs AND redirect the old versions to the new versions? Will that create some kind of inescapable loop? I don't think it will, but please read the conversation below and reply if you have experience with massive URL rewrites with redirects. Our site is too big for us to be messing around without exploring the repercussions first, which is why I have come here seeking advice. I am looking for replies from people who have successfully pulled this off without destroying their rankings, rather than from people who know about the ‘concept’ of Mod URL Rewrites without having done it.
Start from the bottom up. The first email below is my last reply to him.
Thank you sooooo much for your input!
- - - -- - -
Thanks Itguy.
What I’m concerned about is that the old URLs will not be able to “work their way out of the system” because the search engines will continue visiting them until we tell them to stop. The question is, how do we tell them to stop indexing the old URLs other than applying a 301 redirect from the old version to the new one? Can we apply a noindex meta tag to the old version without having it show up on the new one? Could we disallow the old versions in the robots.txt file without disallowing the new versions as well?
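On the robots.txt question, here is a minimal sketch using Python's standard urllib.robotparser to check that a Disallow rule blocks only the old-style URLs. The old URL shape (/retail/product.jsp?id=...) and the underscore in the new URL are assumptions made for illustration; the thread never shows the real dynamic format. Keep in mind that a crawler blocked by robots.txt will not fetch the old URL at all, so it would never see a 301 placed on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks only the assumed old dynamic script path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /retail/product.jsp
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Both URLs are illustrative assumptions, not the site's real URLs.
old_url = "http://www.ourdomain.com/retail/product.jsp?id=06-9515"
new_url = "http://www.ourdomain.com/retail/product/06-9515_MSTR/FloorCleaner/"

old_allowed = parser.can_fetch("*", old_url)  # matches the Disallow prefix
new_allowed = parser.can_fetch("*", new_url)  # does not match the prefix

print("old URL crawlable:", old_allowed)
print("new URL crawlable:", new_allowed)
```

This only works cleanly when the old and new URLs do not share the disallowed prefix, which is one reason to choose the new path layout with care.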
You are right to be concerned about looping issues. We can test this by spidering the site, however. I think because we are redirecting the URL on the back end, rather than using a page refresh that would show up on the new page as well, redirecting URL-A to URL-B shouldn’t be a problem.
Here is an example of the expressions I use on one of my sites to both rewrite and redirect the canonical versions of a URL. This has not created any looping issues for me:
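RewriteEngine On
RewriteBase /
# Send any host other than www.example.com to the www. host with a 301
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
# Redirect only when the client itself asked for .../index.html
# (THE_REQUEST holds the original request line, not the rewritten path)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(([^/]+/)*)index\.html\ HTTP/
RewriteRule index\.html$ http://www.example.com/%1 [R=301,L]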
It ensures that there is only one way to reach the home page, rather than four (www. and non-www., each with and without index.html). Both rewrite rules end with a 301 flag, and everything seems to work fine. The home pages of my sites get indexed and crawled easily.
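The general idea behind avoiding a loop can be sketched as a small Python simulation: the 301 fires only for the URL the client actually sent, while the mapping from the new URL back to the dynamic script happens internally and is never sent back to the client. This is a toy model, not the server's real behavior, and both URL shapes (/product.php?sku=... and /retail/product/<SKU>/) are hypothetical stand-ins.

```python
import re

# Hypothetical URL shapes; the real dynamic format is not shown in the thread.
OLD = re.compile(r"^/product\.php\?sku=([\w-]+)$")
NEW = re.compile(r"^/retail/product/([\w-]+)/$")


def handle(requested):
    """Return (status, value) for the URL the client actually sent."""
    match = OLD.match(requested)
    if match:
        # External redirect: the client is told to ask for the new URL.
        return 301, f"/retail/product/{match.group(1)}/"
    match = NEW.match(requested)
    if match:
        # Internal rewrite: the server maps the new URL to the script
        # itself. The client never sees the old form, so no redirect fires.
        internal = f"/product.php?sku={match.group(1)}"
        return 200, f"rendered {internal}"
    return 404, ""


def fetch(url, max_hops=5):
    """Follow redirects like a crawler and record every URL visited."""
    hops = [url]
    for _ in range(max_hops):
        status, value = handle(hops[-1])
        if status != 301:
            return status, hops
        hops.append(value)
    raise RuntimeError("redirect loop detected")


status_old, hops_old = fetch("/product.php?sku=06-9515_MSTR")
status_new, hops_new = fetch("/retail/product/06-9515_MSTR/")
print(status_old, hops_old)
print(status_new, hops_new)
```

In this model an old URL takes exactly one redirect hop and a new URL takes none. A loop would only appear if the internally rewritten path were fed back through the redirect check.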
By the way, the canonical URL fix is also on the list somewhere, but we need to figure out what’s going on with the home page before going down that path. Another story…
It sounds like we’re making some headway. Thank you for all of the hard work Itguy.
Regards,
Everett
________________________________________
From: Ak, Itguy
Sent: Tuesday, August 15, 2006 9:26 AM
To: Sizemore, Everett
Cc: C, Itprojectmanager
Subject: RE: pathing for dynamic URL rewrites
Ok, based on that I was able to write a rule that will allow us to use underbars in products that end with MSTR. So, that piece should now be optimized.
The last part, the redirects from the current URLs, I'm still working on. That one is going to take some serious thought, because what we potentially run into is an endless loop of rewriting the URL from one format to the other. We would be better off contacting sites which link to us and getting them to replace the links with the new format. We may also have to accept that there will be a transition period from one convention to the other where we take some knocks for having the page respond to 2 URLs until the old ones work their way out of the system.
I'll keep thinking about it and see if I can come up with anything, but so far no dice.
________________________________________
From: Sizemore, Everett
Sent: Monday, August 14, 2006 11:56 AM
To: Ak, Itguy
Cc: C, Itprojectmanager
Subject: RE: pathing for dynamic URL rewrites
For the spaces, what about something like:
http://forums.digitalpoint.com/showthread.php?t=107370
Regards,
Everett
________________________________________
From: Ak, Itguy
Sent: Monday, August 14, 2006 11:18 AM
To: Sizemore, Everett
Cc: C, Itprojectmanager
Subject: RE: pathing for dynamic URL rewrites
I don't really see a way to get rid of the spaces. The pathing still needs to match up to the required parameters and, in the case below, the product id has a space in it. If I substitute in an underbar or something I'm going to break the link because I've changed the product id. How bad is it for us to have spaces? I mean, are we talking not ideal, or catastrophic?
With the category name below there is no space involved. Categories have both names and display names. The names are internally used markers which generally have no spaces, to make them more code friendly. The display names can have spaces and special characters (e.g. registered, trademark, etc.). What you'll want to work on with Jason is the display names that show up in the title bar, as opposed to the simple name, which is what is used for look-up, etc.
Finally, no. The old links will continue to work, but they are not going to redirect as we would then be taking a link, redirecting to a rewrite which then points back to the original link. Too much overhead for a single page load.
________________________________________
From: Sizemore, Everett
Sent: Monday, August 14, 2006 11:05 AM
To: Ak, Itguy
Cc: C, Itprojectmanager
Subject: RE: pathing for dynamic URL rewrites
Itguy,
Everything looks perfect except for the space in the product URL that forces in a %20 between the SKU and MSTR. Is there a way for us to rewrite the %20 and %5 signs to a dash or some other SE-friendly character?
Is there supposed to be a space between “floor” and “cleaner” in that URL? If so, the same issue applies. There should be no spaces in a URL. Underscores and dashes are fine, however.
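To illustrate the space problem, here is a small Python sketch showing how a space in a product id gets percent-encoded, plus one possible reversible mapping to underscores. The helper names sku_to_slug and slug_to_sku are made up for this example, and the mapping assumes that no real SKU contains an underscore, which would need to be checked against the catalog.

```python
from urllib.parse import quote


def sku_to_slug(sku):
    """Replace spaces with underscores for use in a URL path segment."""
    if "_" in sku:
        # Assumption: SKUs never contain underscores. If one did, the
        # mapping back to the original product id would be ambiguous.
        raise ValueError(f"SKU already contains an underscore: {sku!r}")
    return sku.replace(" ", "_")


def slug_to_sku(slug):
    """Recover the original product id from the URL path segment."""
    return slug.replace("_", " ")


sku = "06-9515 MSTR"
encoded = quote(sku)         # what a space becomes in a raw URL
slug = sku_to_slug(sku)      # URL-friendly form
restored = slug_to_sku(slug)

print(encoded)
print(slug)
print(restored == sku)
```

Because the mapping can be reversed, the product id used for the look-up stays unchanged even though the URL no longer contains a space.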
This might be remedied once we get a new category naming procedure in place. You might want to discuss your plans with Jason to make sure that this is going to work well with what he has planned for splitting up category names into multiple words (eg: HomeOutdoor becomes Home Outdoor). We really don’t need HomeOutdoor to be two words in the URL, but we will need it to be two words in the title.
As you roll these changes out, is the old URL going to get redirected to the new URL when a user types in the old one or comes from an old link?
Thanks!
________________________________________
From: Ak, Itguy
Sent: Monday, August 14, 2006 10:44 AM
To: Sizemore, Everett
Cc: C, Itprojectmanager
Subject: pathing for dynamic URL rewrites
I've come up with some expressions that I think will work for us with dynamic URL rewrites, but I wanted you to look over them and make sure the search engines would be happy with them before I go any further.
Here's what I've got
Category Level 1:
http://www.ourdomain.com/retail/<category name>/
eg:
http://www.ourdomain.com/retail/HomeOutdoor
Category Level 2:
http://www.ourdomain.com/retail/<category level>/<category name>/
eg:
http://www.ourdomain.com/2/AirQuality/
Shelf:
http://www.ourdomain.com/retail/<category level>/<category name>/
eg:
http://www.ourdomain.com/3/FilterPurifiersAir/
Product:
http://www.ourdomain.com/retail/product/<SKU>/<name>/
eg:
http://www.ourdomain.com/retail/06-9515 MSTR/Floor Cleaner/
or
http://www.ourdomain.com/retail/06-9515%20MSTR/Floor Cleaner/
Let me know if these are going to work for you.
Itguy