duplicate page requirements


newreality
01-04-2005, 08:38 PM
I have heard that there are some known basic requirements -- for two or more pages within the same site -- to be labelled as "duplicate content".

I'm speaking in terms of number of character differences, order of appearance, etc.
Is anybody familiar with roughly what this involves?

- 50 characters, more than 150, more still? Or a % relative to the whole page?
- does this pertain to overall text/structure AND links/structure?

note: my link text is the same throughout -- but the connecting URLs differ from page to page.

As you might have guessed, I'm coming up with many similar pages by the nature of the site's content. While I'm not out to "outsmart" the engines, I also don't want to be running duplicate content in their eyes.

fathom
01-08-2005, 02:50 AM
Realistically, this depends entirely on the level of use [total # of dup pages], the link generations between the dup pages, and lastly (your question) how much of each page can be duplicated.

Some duplication is inherent in every website - we naturally have identical nav bars on all pages, so if you change the way you do nav bars you can make the body 'more' duped [in theory].

But in general, if there are only 2 pages and both are linked from a common page, both will likely survive with 20% of the page content being unique. Will they survive a spam report and hand inspection? Possibly. A single dup page isn't a major manipulation practice.
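
For a rough self-check of how much of one page is unique relative to another, something like the sketch below could be used. It is only an illustration: how the engines actually measure duplication is not public, and the word-shingle comparison, the function names, and the sample strings are all assumptions made up for this example.

```python
import re

def tokens(text):
    """Lowercase word tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())

def shingles(text, size=3):
    """Set of overlapping word n-grams ('shingles') found in the text."""
    words = tokens(text)
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def unique_fraction(page, other, size=3):
    """Fraction of page's shingles that do not appear in other."""
    a, b = shingles(page, size), shingles(other, size)
    if not a:
        return 0.0
    return len(a - b) / len(a)

# Hypothetical pages: a shared nav bar followed by near-identical bodies.
nav = "home about calendars contact"
page_a = nav + " january calendar for the year 2005"
page_b = nav + " january calendar for the year 2006"
print(f"{unique_fraction(page_a, page_b):.1%} of page_a's shingles are unique")
```

The result is only useful for comparing your own pages against each other; a low figure here says nothing definite about what any engine will do.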

When you start getting to the 3rd, 4th - it no longer is an issue of straight usability - and no one can truly give you guidance here.

I did have a client that did 15,000 pages and used random content at about 20% of the page - they flamed out by month four with 14,800 penalized.

However, if this is for visitor usability and not manipulation, then place the content that would have been dup'ed in an iFrame - now you have '1' page linked to from multiple website locations while it visibly appears multiple times.

The cool thing you can do with the iFrame is position the viewing window so that only the content shows. If the framed page appears in results [orphaned from the website itself], its nav bars are still available, so a searcher who clicks through to it [via search engines] isn't limited to a single orphaned page.

newreality
01-08-2005, 11:08 AM
The situation is that I have calendars occurring over a period of years, about 12 years, with the only difference a shift in leap years.

On these pages, same-name links [for months] on (4) of the years go out to separate pages, but not for the rest of the years.

Sounds like I need to create unique content for these pages?

Worst case scenario, the "duped" pages won't be indexed, right? Four of the pages have been indexed. However, these pages form the most important part of the site, the part the others build toward, and this indexing was prior to adding (8) more.

Some pages, such as tables with very simple date are only slightly different and are almost the same by their nature. No ill-will intended. Calendars are just an example.

I've run one popular program on this, and it seems to be counting each day number as a separate word, so according to it there are over 600 words per page, nearly all from calendars and their headers.
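
To see why a word counter reports so many "words" on a calendar page, and how little differs between years, here is a small sketch. It assumes Python's standard calendar module as a stand-in for the real pages (the actual site markup and the program mentioned above are not known), and the helper names are made up for illustration.

```python
import calendar

def calendar_tokens(year):
    """Whitespace-separated tokens of a plain-text calendar for one year."""
    return calendar.TextCalendar().formatyear(year).split()

def same_apart_from_year(year_a, year_b):
    """True when two years' calendars differ only in the year heading."""
    # The first token is the year heading; everything after it is
    # month names, weekday headers, and day numbers.
    return calendar_tokens(year_a)[1:] == calendar_tokens(year_b)[1:]

# Every day number counts as its own token, so totals climb quickly.
for year in (2005, 2008, 2011):
    print(year, len(calendar_tokens(year)), "tokens")

print("2005 vs 2011 same apart from heading:", same_apart_from_year(2005, 2011))
print("2005 vs 2008 same apart from heading:", same_apart_from_year(2005, 2008))
```

Two years that start on the same weekday and have the same leap-year status produce the same calendar body, so over a 12-year span several pages can end up identical apart from the year itself.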

fathom
01-09-2005, 01:21 AM
Worst case scenario, the "duped" pages won't be indexed, right?

Well yes for starters.

But the problem can also extend to all pages linking to the dup'ed page, as these are votes for a page that is [or could be] penalized, and thus they risk being penalized themselves by association.

This further induces a 'hole' in your website link architecture where a block of pages is literally removed and no longer a factor for ranked results... avoiding 'dup'ed' content as much as possible should be considered.

However, an iFrame is an elegant way to keep the usability side of 'dup'ed' content without worrying about what search engines 'might do'.

Step by step guide on iFrames.

Note: the guide's recommendations against using specific attributes are valid for old browsers - but I wouldn't concern yourself with them - you will likely get very few "challenged ones"!

http://www.idocs.com/tags/frames/_IFRAME.html

An example of an iFrame: <iframe src="your-duped-content-this-page.htm" width="665" height="1100" align="left"></iframe>

Add this to the source code in place of the dup'ed content - now you have one original page that services multiple locations.