Using URL Rewriting and special characters


EdmundIJones
03-27-2006, 05:23 PM
Hi there,

I'm in the process of updating my web site and converting a very large 'index' from a single page which takes in query-string parameters,
e.g.
http://www.warc.com/Search/IndexSearch/Browse.asp?txtLevel1=Advertising&txtLevel2=Economic%20and%20social%20effects%20of%20advertising&txtLevel3=Effects%20on%20consumption/markets

to display its content, to a URL-rewritten version,
e.g.
http://www.warc.com/Search/Browse/Advertising_%26_Marketing_Communications/Below-the-line_and_other_communications/Direct_marketing/Direct_mail/

Firstly, do you think this approach will yield better search results, since the pages will look as if they have their own directories? More importantly, some 'index' categories have characters like & in them, and when placed in the URL these display as %26. Will a search engine like Google mark the site down for having character codes in the URL? Is there a better way?

Thanks
Ed

Wail
03-28-2006, 02:15 PM
I would watch that you don't create URLs that are too long this way, and that you don't end up with too many hyphens or underscores.

However, even if your URLs are long this way and even if you use quite a few hyphens (which are better than underscores for separation) you will see SERP advantages over using complex query strings.

I would go with the 'false directory' approach, but make sure your URLs are 'hackable', i.e. if you rewrite to www.example.com/directory/file-1/ then make sure *something* happens at www.example.com/directory/. The Google toolbar (some versions at least) does have an 'up a directory' button, so we can speculate that that's a concept Googlebot is familiar with. Also, some of your users will try it too.
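As a rough sketch of what 'hackable' means in practice, the snippet below lists every parent 'directory' of a rewritten path, so you can check that each one returns something. The function name parent_paths is made up for illustration; it is not part of any library.

```python
def parent_paths(path):
    """List the parent 'directories' of a rewritten URL path, nearest first."""
    # Drop empty pieces produced by leading, trailing, or doubled slashes.
    segments = [s for s in path.split("/") if s]
    parents = []
    # Walk upwards: drop one trailing segment at a time until only "/" is left.
    for depth in range(len(segments) - 1, -1, -1):
        parents.append("/" + "".join(s + "/" for s in segments[:depth]))
    return parents


if __name__ == "__main__":
    for parent in parent_paths("/directory/file-1/"):
        print(parent)
```

Each path it prints is one a user (or an 'up a directory' button) might land on, so each should serve a page or redirect somewhere sensible.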

I would URL encode the ampersand if you can.
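For what it's worth, here is a minimal sketch of how the encoded paths could be generated, assuming the category names are stored with spaces and the rewritten URLs use underscores as in Ed's example. The helper build_browse_path and the /Search/Browse/ prefix handling are illustrative assumptions, not how the site actually does it.

```python
from urllib.parse import quote


def build_browse_path(categories, prefix="/Search/Browse/"):
    """Build a rewritten browse path from a list of category names."""
    encoded = []
    for name in categories:
        # Assumption: spaces become underscores, as in the example URL.
        slug = name.replace(" ", "_")
        # safe="" percent-encodes reserved characters, so "&" becomes "%26"
        # and a "/" inside a category name becomes "%2F" rather than
        # creating an extra path level.
        encoded.append(quote(slug, safe=""))
    return prefix + "/".join(encoded) + "/"


if __name__ == "__main__":
    print(build_browse_path(["Advertising & Marketing Communications", "Direct mail"]))
```

Letters, digits, hyphens and underscores pass through quote() unchanged, so only the genuinely special characters end up as character codes.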

Your internal linking structure will be important. If you've a large index of pages then ensure that there are plenty of cross-links between them.

A pitfall: when you upgrade to URL rewriting, make sure that your 404 error page continues to send the 404 status code in its response header. Sometimes URL rewriting can change this.
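To illustrate the 404 point, here is a minimal sketch written as a plain Python WSGI app rather than ASP. The set KNOWN_PATHS is a hypothetical stand-in for whatever lookup the rewriter does. The thing to notice is that the 'not found' branch sends a real 404 status along with the friendly error page, not a 200.

```python
# Hypothetical set of paths the rewriter knows how to serve.
KNOWN_PATHS = {"/directory/", "/directory/file-1/"}


def app(environ, start_response):
    """Tiny WSGI app: serve known paths, send a genuine 404 for the rest."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    if path in KNOWN_PATHS:
        start_response("200 OK", headers)
        return [("Content for " + path).encode("utf-8")]
    # The error page can be as friendly as you like, but the status line
    # must still say 404 so search engines do not index it as a real page.
    start_response("404 Not Found", headers)
    return [b"Sorry, that page does not exist."]
```

Whatever platform you are on, it is worth requesting a made-up URL after the upgrade and checking the status code that actually comes back.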

Another pitfall: if you're using paths like www.example.com/directory/file-1/ then make sure that www.example.com/directory/file-1 also does something; ideally a 301 redirect to www.example.com/directory/file-1/, as this will stop duplicate content and keep users sane.
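The trailing-slash rule can be sketched like this. The function canonical_redirect is a made-up name, and the sketch assumes every rewritten page lives at a path ending in a slash (so it would need an exception for real files such as images or stylesheets).

```python
def canonical_redirect(path):
    """Return (301, target) if the path needs a trailing slash, else None."""
    if path.endswith("/"):
        # Already the canonical form, so serve the page normally.
        return None
    # Permanent redirect, so search engines fold both forms into one URL.
    return (301, path + "/")


if __name__ == "__main__":
    print(canonical_redirect("/directory/file-1"))
    print(canonical_redirect("/directory/file-1/"))
```

Using a permanent (301) rather than a temporary redirect is what tells a search engine that the slash version is the one to keep.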