
How do you remove dynamically generated pages from indexes?


JJZ
02-03-2005, 06:03 PM
On a site with dynamic content there are often pages that expire, get pulled down, etc. But it's not the page (template/cgi/dll etc.) that is invalid, just some data. So the page is retrieved successfully by the HTTP server, usually an HTTP 200 code is returned, and an "error" message page is generated. The error doesn't look like one to a SE spider, though - it's just a page saying "This product is no longer available" or something similar.

Our problem is that we have sites with thousands of pages, and lots of content that comes and goes. We're pretty successful in having most or all of it indexed by the search engines, but not so successful in having bad links into our site removed from the indexes.

How should it be done?

Gerardism
02-03-2005, 10:37 PM
There are two options you might want to consider. One is getting IP recognition software. This checks the user's IP address before serving a page and serves pages to the spiders without session IDs (assuming the pages carry a session ID and this is the main problem).
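
To give a rough idea of what the IP check amounts to, here is a minimal sketch in Python. The name `is_spider` and the network list are made up for illustration, and the ranges are placeholder documentation addresses, not real crawler IPs. A real setup would load a maintained list of spider addresses.

```python
import ipaddress

# Placeholder ranges taken from the reserved documentation blocks.
# These are NOT real crawler addresses; substitute your own list.
SPIDER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_spider(remote_addr):
    """Return True if remote_addr falls inside a known spider network."""
    try:
        ip = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed address: treat the visitor as a normal user.
        return False
    return any(ip in net for net in SPIDER_NETWORKS)
```

The page handler would call `is_spider()` with the remote address of the request and leave the session ID out of generated links when it returns True.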

The other is to implement a URL rewriter for the whole site, so each URL is rewritten on the fly before it is given to the search engine spider. That way the spiders index one URL per page rather than the same page over and over again.
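
As a sketch of the rewriting idea, assuming the duplicates come from a session parameter in the query string, something like the following would collapse them to one URL. The function name `canonical_url` and the parameter names in `SESSION_PARAMS` are guesses for illustration; use whatever your application actually appends.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical session parameter names, compared case-insensitively.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}


def canonical_url(url):
    """Return url with session-tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
```

For example, `canonical_url("http://example.com/product.cgi?id=42&sessionid=abc123")` gives `http://example.com/product.cgi?id=42`, so every visit by a spider sees the same link for the same product.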

strategicrankings
02-04-2005, 12:42 AM
Another option is to send a 404 header once you find out that the product no longer exists. You can still display a friendly "not found" page, but with the 404 status sent instead of the 200.
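
Here is a minimal sketch of that as a Python WSGI application. The `PRODUCTS` dictionary and the `id` query parameter are stand-ins for your real database lookup and URL scheme, so treat them as assumptions. The point is only that the status line changes while the friendly page stays.

```python
from urllib.parse import parse_qs

# Hypothetical catalog standing in for the real product database.
PRODUCTS = {"42": "Blue widget"}


def app(environ, start_response):
    """Serve a product page, or a friendly page with a 404 status."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    product_id = query.get("id", [""])[0]
    name = PRODUCTS.get(product_id)

    if name is None:
        # Same friendly message as before, but the spider now sees a 404.
        status = "404 Not Found"
        body = b"<h1>This product is no longer available</h1>"
    else:
        status = "200 OK"
        body = ("<h1>%s</h1>" % name).encode("utf-8")

    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

The visitor sees the same message either way. Only the status code changes, and that is what tells the spider the page is gone.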

Hope this helps.