Leave Pages Out of Sitemap - IXNE Dupe Content Penalty?


slurpyseo
03-16-2007, 03:33 AM
First post here. I have a client who has a rather large school-related site. The selection process they created to drill down to a school is state/letter of city/city/letter of school/school. The site has tens of thousands of pages that are dupe content due to the "select letter" issue. I think they may have a duplicate content penalty. Will implementing a sitemap, and only placing in the School Landing pages help me with the duplicate content issue? If I don't put the "duplicate" pages in a robots file, will that help?
Thanks!

JohnW
03-16-2007, 07:39 AM
>Will implementing a sitemap, and only placing in the School Landing pages help me with the duplicate content issue?

No, this will not help at all. You will need to either rework the code so that, regardless of the navigation path, each page can only show up under a single URI, or else block robots from indexing the duplicates.
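
To illustrate the "single URI" idea, here is a minimal Python sketch. It assumes a purely hypothetical URL layout of /state/city-letter/city/school-letter/school (the thread does not show the real URLs) and collapses every such path to one canonical form by dropping the two letter-selector segments. The function name `canonical_path` is made up for this example.

```python
def canonical_path(path: str) -> str:
    """Map a drill-down path to one canonical school URI.

    Assumed (hypothetical) layout:
        /state/city-letter/city/school-letter/school
    The letter segments only exist for navigation, so they are dropped.
    """
    segments = [s for s in path.strip("/").split("/") if s]
    if len(segments) != 5:
        raise ValueError(f"unexpected drill-down path: {path!r}")
    state, _city_letter, city, _school_letter, school = segments
    # Lowercase so that case variants do not become separate URIs.
    return f"/{state.lower()}/{city.lower()}/{school.lower()}"


print(canonical_path("/texas/a/austin/l/lincoln-high"))
```

Every navigation route would then link to (or redirect to) the value this returns, so robots only ever see one address per school.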

A hybrid solution may be best, whereby you would rework the code to cause all duplicates to show up in individual folders, and then use robots.txt to disallow the folders containing the duplicates. There are a few other ways too, like dynamically generating robots meta tags for the duplicates.
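
As a rough sketch of the two blocking approaches mentioned above: the snippet below checks paths against a robots.txt rule using Python's standard `urllib.robotparser`, and builds a robots meta tag for pages flagged as duplicates. The `/browse/` folder, the sample paths, and the helper names are assumptions for illustration, not the client's actual setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: all duplicate "select letter" pages are
# assumed to have been moved under /browse/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /browse/
"""


def is_crawlable(url_path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a generic robot may fetch url_path under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url_path)


def robots_meta_tag(is_duplicate: bool) -> str:
    """Build the robots meta tag to emit in a page's <head>."""
    content = "noindex, follow" if is_duplicate else "index, follow"
    return f'<meta name="robots" content="{content}">'


print(is_crawlable("/browse/texas/a/"))
print(is_crawlable("/schools/texas/austin/lincoln-high"))
print(robots_meta_tag(is_duplicate=True))
```

Note that these are alternatives for any given page: a robot that is disallowed from fetching a page never reads that page's meta tag.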

slurpyseo
03-16-2007, 12:23 PM
John, thanks for the info. I've been wondering about that for a bit.

Would it penalize me to leave out pages in a Sitemap?

Also, we have a solution to condense these pages. But it involves either orphaning thousands of those pages or making them 404's. The development team doesn't have the technical prowess to set up 301 redirects across that many pages. Question: would 404's be worse than thousands of orphaned pages? How long would I see a penalty for thousands of 404's? How about a dynamic meta "do not follow" on the orphaned pages?

Thanks for the info, I'm now a convert to this forum.

JohnW
03-16-2007, 07:30 PM
>Would it penalize me to leave out pages in a Sitemap?

There is no penalty associated with this.

>would 404's be worse than thousands of orphaned pages?

An orphan is a page that is not linked to from any other page. So I assume you mean that you are no longer linking to these pages but they will somehow continue to exist. If the pages are supposed to be gone and are never coming back, you should either 301 them to the correct page or 410 them. A 404 is not the best choice for this situation.
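
The 301-or-410 rule can be sketched as a small lookup, assuming a hypothetical mapping from retired paths to their replacement pages. The function and the sample paths are illustrative only; how the status and headers get sent depends on the site's own server or framework.

```python
from http import HTTPStatus


def retired_page_response(old_path: str, redirects: dict) -> tuple:
    """Pick the status and headers for a page that has been retired.

    redirects maps retired paths to their replacement (hypothetical data).
    A path with a replacement gets 301 Moved Permanently; a path with no
    replacement gets 410 Gone rather than 404 Not Found.
    """
    target = redirects.get(old_path)
    if target is not None:
        return HTTPStatus.MOVED_PERMANENTLY, {"Location": target}
    return HTTPStatus.GONE, {}


redirects = {"/texas/a/austin/l/lincoln-high": "/texas/austin/lincoln-high"}
print(retired_page_response("/texas/a/austin/l/lincoln-high", redirects))
print(retired_page_response("/texas/a/", redirects))
```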