Updating A Big Site & Retaining Rankings


Molly
03-15-2006, 06:39 AM
Hi

Presently the company I am working for is looking at a project to rebuild a very large, very well ranked website. One of the most important things about the project is that the site does not lose the rankings it presently has in the search engines.

Presently they have 58,000 pages indexed by Google. Could anyone give me some advice on how to keep a website of this size well indexed after a rebuild?

Thanks

Wail
03-15-2006, 07:11 AM
Hi Molly,

I would look at Google's sitemap XML (https://www.google.com/webmasters/sitemaps/) program first. With this you'll be able to quickly introduce any new URLs to Google and get reporting back on any old URLs Google may be aware of from the old site but which it can't currently get at.
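To make the sitemap idea a little more concrete, here is a minimal sketch in Python of building such a file. It assumes the sitemaps.org 0.9 schema, which is not named anywhere in this thread, and the URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemaps.org 0.9 namespace. The post does not name a schema version.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a sitemap XML document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Hypothetical URLs for illustration only.
sitemap_xml = build_sitemap([
    "http://www.example.com/new/widgets.asp",
    "http://www.example.com/new/gadgets.asp",
])
print(sitemap_xml)
```

Under that schema a single sitemap file is capped at 50,000 URLs, so a site of 58,000 pages would likely need to split its list across more than one file.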

MSN and Yahoo are beginning to take RSS 2.0 feeds as something close to a sitemap XML too.

Of course, the best bet is to re-use all your old URLs in your new site. This is often quite a challenge but if it is possible then it is certainly worth doing.

Let's say you can't re-use your old URLs. You're looking at the use of server-side redirects to go in the place of each old URL and redirect both users and spiders to the new URL for that page. By default, servers and server-side scripts will issue a '302' (temporary) header, but to get the very best transfer from one page to another you'll want to work with '301' (permanent) redirects instead. I've just written a longer post about redirects (http://forums.searchenginewatch.com/showthread.php?p=76092) which may be helpful here.
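As a rough illustration of the one-redirect-per-old-URL approach, the sketch below looks up an old path in a table and answers with a 301 rather than a 302. The table contents and the function name `redirect_for` are hypothetical; a real site would do this in its server configuration or server-side script.

```python
from http import HTTPStatus

# Hypothetical old-to-new map; a real site would load this from a file or database.
REDIRECTS = {
    "/old/site/widgets.html": "/new/widgets.asp",
    "/old/site/gadgets.html": "/new/gadgets.asp",
}

def redirect_for(path, redirects=REDIRECTS):
    """Return (status_code, headers) for a requested path.

    Known old paths get a permanent redirect (301), not the temporary 302.
    Unknown paths fall through to a 404.
    """
    target = redirects.get(path)
    if target is None:
        return HTTPStatus.NOT_FOUND, {}
    return HTTPStatus.MOVED_PERMANENTLY, {"Location": target}

status, headers = redirect_for("/old/site/widgets.html")
print(int(status), headers)
```

With 58,000 pages a hand-written table like this gets unwieldy quickly, which is one reason pattern-based rules are attractive.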

If you're dealing with .html or .htm pages then this can get tricky as these suffixes don't tend to support server-side script. I've found that in Apache it's best to re-configure the server to let it treat .html pages as, for example, PHP. You can also use .htaccess and Mod_rewrite (http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html).

In IIS, if you're dealing with old .htm and .html suffixes, then you could create a virtual directory for each old URL and redirect via that. That'll be a tough task for your server and I suspect your infrastructure people will hate you. Suggest it first and then back down to the use of ISAPI filters (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/HTML/_core_isapi_extensions.3a_.filters.asp). Heh. ISAPI filters are also work for the server people to do but it'll look like a nice compromise. With ISAPI filters you're using regular expression rules (as you'll use with Mod_rewrite) to tell the server that when it sees a pattern like http://www.example.com/old/site/*anything*.html it now translates to http://www.example.com/new/*anything*.asp.
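The pattern rule itself is easy to try out before handing it to the server people. This is only a sketch of the matching logic in Python, not Mod_rewrite or ISAPI filter syntax, and the function name `rewrite` is made up.

```python
import re

# Mirrors the example rule: /old/site/<anything>.html -> /new/<anything>.asp
OLD_PATTERN = re.compile(r"^/old/site/(.+)\.html$")

def rewrite(path):
    """Return the new path for an old URL path, or None if the rule does not match."""
    match = OLD_PATTERN.match(path)
    if match is None:
        return None
    return "/new/{}.asp".format(match.group(1))

print(rewrite("/old/site/products/widgets.html"))  # /new/products/widgets.asp
print(rewrite("/about.html"))                      # None
```

Running a list of the old URLs through a function like this is a cheap way to spot the ones no rule covers.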

As a backup, have a nicely formatted custom 404 error page. Your users may well find old dead pages (as it's awfully hard to get all those redirects in correctly) and you don't want to lose them by showing them unhelpful error pages. A custom 404 error page with a search box or a sitemap is a good way to keep the user on your domain. Just make certain that your custom 404 error page continues to return the 404 header.
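Here is a small sketch of what "friendly page, but still a 404 header" means, written as a bare WSGI app in Python. The page content, the `/search` and `/sitemap.html` paths and the `call` helper are all invented for the example.

```python
KNOWN_PAGES = {"/": b"<h1>Home</h1>"}  # hypothetical site content

NOT_FOUND_PAGE = b"""<html><body>
<h1>Sorry, that page has moved or no longer exists.</h1>
<form action="/search"><input name="q"> <input type="submit" value="Search"></form>
<p><a href="/sitemap.html">Browse the sitemap</a></p>
</body></html>"""

def app(environ, start_response):
    """Serve known pages with 200; everything else gets the custom page with 404."""
    body = KNOWN_PAGES.get(environ.get("PATH_INFO", "/"))
    if body is not None:
        start_response("200 OK", [("Content-Type", "text/html")])
        return [body]
    # Helpful page for people, but the status line still says 404 so
    # spiders do not mistake the error page for real content.
    start_response("404 Not Found", [("Content-Type", "text/html")])
    return [NOT_FOUND_PAGE]

def call(wsgi_app, path):
    """Helper for trying the app out: return (status, body) for a path."""
    seen = {}
    def start_response(status, headers):
        seen["status"] = status
    body = b"".join(wsgi_app({"PATH_INFO": path}, start_response))
    return seen["status"], body

print(call(app, "/old/dead-page.html")[0])  # 404 Not Found
```

The mistake to avoid is the same page going out with a "200 OK" status, which tells a spider that the dead URL is a real page.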

As for the more general "how will 50,000 new pages be indexed?" question, I'm afraid that depends very much on how search engine friendly the new site is. If you're using a nicely structured site with clean href anchors from page to page then search engines will have no problem indexing the site. If you're using JavaScript and forms then the search engines may never see many of these 50,000 new pages. If you're using Flash or Ajax then your 50,000 pages may be 'conceptual pages' only and your site may in fact only be one URL and therefore only one page for the search engines to find.
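To show why clean href anchors matter, here is a deliberately naive spider sketch in Python that only follows plain `<a href>` links. Real search engine crawlers are more capable than this, so treat it as a model of the worst case; the sample markup and names are invented.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, roughly what a simple spider follows."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Skip javascript: pseudo-links; there is no URL to fetch.
            if name == "href" and value and not value.lower().startswith("javascript:"):
                self.links.append(value)

def crawlable_links(html):
    """Return the list of URLs a naive spider could discover in the markup."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links

# Hypothetical navigation: one clean anchor, three script or form driven routes.
PAGE = """
<a href="/new/widgets.asp">Widgets</a>
<a href="javascript:openPage('gadgets')">Gadgets</a>
<span onclick="location='/new/gizmos.asp'">Gizmos</span>
<form action="/new/search.asp"><select name="page"><option>sprockets</option></select></form>
"""

print(crawlable_links(PAGE))  # ['/new/widgets.asp']
```

Four ways into the site for a person, but only one that this spider can find.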

If your site currently has a good PageRank (http://www.google.com/technology/) then you can expect Google to crawl and index your new site more quickly than if you have a low PageRank. One of the main goals of going through all that redirect drama is to ensure that you're not throwing away any old PageRank and that you're transferring as much as possible to the new site.

There are SEOrs on these forums who will tell you that PageRank is worthless now. There is a debate. I agree that PageRank has no impact on page position. However, I do believe that PageRank is still an indication of how often Google will visit and how thorough each visit will be.

Once again, Google's sitemap XML is the best way to ensure that any target URLs are introduced to it.