HTTPS and Duplicate Content


ptmagnolia82
05-04-2005, 03:15 PM
Hello,

I've recently noticed that Google is indexing some of our pages under both http:// and https://. I'm concerned about a duplicate content penalty and am looking for possible solutions. I believe the spider picks up https:// on one of our log-in forms (which has to be secure), follows a link from that page, and the problem begins there.

I could block the log-in page from the spider via the robots.txt file, but I'm not sure this is the only place where the loophole exists. Is it possible to disallow access to any page starting with "https://" using robots.txt, so that the fix applies site-wide instead of being a patch on specific pages?

Thanks for the help!

Mikkel deMib Svendsen
05-04-2005, 03:33 PM
I have had this problem with several sites I am working on. The best solution is to change the links on your secure pages to absolute links instead of relative links. On top of this, you can add a small check to every page as it is requested and redirect users (and spiders) if the page is called on the wrong server: if a non-secure page is requested over https, redirect them (with a 301) to the normal http version.
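
To make the two steps above concrete, here is a rough sketch in Python. It is only an illustration, not code from any particular site: the `SECURE_PATHS` set, the function names and the example.com URLs are all made up, and it assumes the default ports are in use. `redirect_target` is the per-request check (your server would send a 301 to whatever it returns), and `absolute_link` shows how a relative link on a secure page can be written out as an absolute link on the right scheme.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

# Hypothetical: the only paths that really need HTTPS (e.g. the log-in form).
SECURE_PATHS = {"/login"}


def canonical_scheme(path):
    """Return the scheme a given path should be served on."""
    return "https" if path in SECURE_PATHS else "http"


def redirect_target(url):
    """Return the URL to 301-redirect to, or None if the scheme is already right."""
    parts = urlsplit(url)
    wanted = canonical_scheme(parts.path)
    if parts.scheme == wanted:
        return None
    # Keep host, path, query and fragment; swap only the scheme.
    return urlunsplit((wanted, parts.netloc, parts.path, parts.query, parts.fragment))


def absolute_link(page_url, href):
    """Resolve an href found on page_url into an absolute link on the right scheme."""
    resolved = urljoin(page_url, href)
    return redirect_target(resolved) or resolved


if __name__ == "__main__":
    # A non-secure page requested over https gets sent back to http.
    print(redirect_target("https://www.example.com/products?id=3"))
    # A relative link on the secure log-in page becomes an absolute http link.
    print(absolute_link("https://www.example.com/login", "products.html"))
```

How you hook `redirect_target` into each request depends on your server setup, so that part is left out here.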

In any case, YOU should take action and not wait for the engines to do so. Sooner or later you risk that they take some kind of action, and you never know which one. Usually it's not what you would like them to do. Don't leave it up to the engines :)