Search Engine Watch
09-04-2009   #1
Onestop
Member
 
Join Date: Feb 2009
Posts: 72
does https:// = duplicate content?

Somehow Google found its way onto a secure page of my site and started crawling. It indexed a bunch of non-secure pages as https pages. I was afraid this would lead to duplicate content, so I had tech support at my vendor help me block out the https in webmaster. However, once I did this it started a chain reaction that caused me to lose most of my indexed pages. Google is no longer accessing my site. Obviously there was a problem with how the file was set up. I want to delete it and open the door for Google again. If I do, they might find the https pages again. Is this a problem?
09-04-2009   #2
AussieWebmaster
Forums Editor, SearchEngineWatch
 
 
Join Date: Jun 2004
Location: NYC
Posts: 8,154
Re: does https:// = duplicate content?

I have https pages in the index - it is not a big deal
__________________
Bruce Gillmer - MTV EMA
09-04-2009   #3
Onestop
Member
 
Join Date: Feb 2009
Posts: 72
Re: does https:// = duplicate content?

Quote:
Originally Posted by AussieWebmaster
I have https pages in the index - it is not a big deal
I guess I should qualify. I have pages like this . . .
https:// sitepage101.com
http:// sitepage101.com

That isn't duplicate content?

Last edited by jag : 09-05-2009 at 02:28 AM.
09-05-2009   #4
jag
Forums Valuator, SEW
 
 
Join Date: May 2006
Location: CBE
Posts: 1,000
Re: does https:// = duplicate content?

Again, make sure the https files are set up properly on the server and that the secure files are not crawled. The https URLs will drop out of the index once you block them.

Best,
09-05-2009   #5
Onestop
Member
 
Join Date: Feb 2009
Posts: 72
Re: does https:// = duplicate content?

Quote:
Originally Posted by jag
Again, make sure the https files are set up properly on the server and that the secure files are not crawled. The https URLs will drop out of the index once you block them.

Best,
Jag, I guess I didn't explain very well. Secure files are not crawled. However, every time my third-party vendor blocks spiders from crawling https URLs, they screw up the entire robots file, which causes Google to drop most of my pages. I am removing their block in order to allow Google back on the site. Eventually Google will crawl one of the https portions of the site and then apply https to every page after that, creating duplicate URLs. Is this a problem? In my mind I don't care whether the same URL shows up as both https and http, as long as it shows up (which it doesn't with the block on).
09-07-2009   #6
deanpowel71
Member
 
Join Date: Aug 2009
Posts: 109
Re: does https:// = duplicate content?

I have never seen an https domain show a PageRank value, not even zero; the bar is always grey. I guess Google doesn't show any details for https sites? Both versions are indexed separately, though.
09-07-2009   #7
AussieWebmaster
Forums Editor, SearchEngineWatch
 
 
Join Date: Jun 2004
Location: NYC
Posts: 8,154
Re: does https:// = duplicate content?

No, they do - look at this search: http://www.google.com/search?q=https...ient=firefox-a
09-07-2009   #8
jag
Forums Valuator, SEW
 
 
Join Date: May 2006
Location: CBE
Posts: 1,000
Re: does https:// = duplicate content?

Onestop, if you are not that concerned, you can leave it as it is; it is not much of a problem. In general it is good practice not to have the same file indexed under both http and https. In any case, the https version may drop out on repeated crawls if it is not found useful.

Also, to allow Googlebot to index all http pages but no https pages, you'd use the robots.txt files below.

For the http protocol http:// yourserver.com/robots.txt:

User-agent: *
Allow: /

For the https protocol https:// yourserver.com/robots.txt:

User-agent: *
Disallow: /
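
If you want to sanity-check the two rule sets above before putting them live, here is a minimal sketch in Python using the standard library's urllib.robotparser. It assumes each protocol really does serve its own robots.txt as shown; the may_crawl helper and the page.html path are hypothetical names made up for this illustration, and Google's own parser may behave differently in edge cases.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# The two robots.txt bodies from the post above, one per protocol.
HTTP_RULES = """\
User-agent: *
Allow: /
"""

HTTPS_RULES = """\
User-agent: *
Disallow: /
"""


def build_parser(robots_txt):
    """Parse a robots.txt body held in a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# A crawler fetches robots.txt separately for each protocol,
# so keep one parsed rule set per URL scheme.
PARSERS = {
    "http": build_parser(HTTP_RULES),
    "https": build_parser(HTTPS_RULES),
}


def may_crawl(url, user_agent="Googlebot"):
    """Hypothetical helper: apply the rule set that matches the URL's scheme."""
    scheme = urlsplit(url).scheme
    return PARSERS[scheme].can_fetch(user_agent, url)


print(may_crawl("http://yourserver.com/page.html"))   # True
print(may_crawl("https://yourserver.com/page.html"))  # False
```

This only checks the rules themselves. Getting the server to hand out a different robots.txt for http and https is a separate setup step that depends on your host or vendor.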

Best,