Search Engine Watch
Old 09-13-2008   #1
bikeman
Member
 
Join Date: Sep 2008
Posts: 35
removing subdomains with robots.txt

My folder structure comprises a subfolder off root for each domain, as follows:

root\maindomain
root\domain2
root\domain3

with each domain accessed via
www.domain1.com
www.domain2.com etc

However I also have a subdomain - www.root.maindomain.com - which can be used to access any other domain's subfolder:
www.root.maindomain.com/domain1/
www.root.maindomain.com/domain2/

I don't want www.root.maindomain.com/domain2/ etc. listed, so I placed a robots.txt in the root folder:

User-agent: *
Disallow: /

I am now concerned that, because each domain is in a subfolder off \root\, the robots.txt will also affect all the domains' listings.

Does the folder structure have any effect on robots.txt?
Old 09-13-2008   #2
JohnW
 
Join Date: Jun 2004
Location: Virginia Beach, VA.
Posts: 976
Re: removing subdomains with robots.txt

Welcome to SEW bikeman. Not sure why there has to be such an odd structure but I assume it has to do with how your host handles addon domains?

>I don't want www.root.maindomain/domain2/ etc listed so I placed a robots.txt in the root folder:
User-agent: *
Disallow: /

Not 100% sure I understand your setup, but I think what you want here is this:

User-agent: *
Disallow: /domain2/

In a normal situation, what you have proposed will block everything, including, for example,
www.root.maindomain.com/
or
www.root.maindomain.com/index.html
etc.
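
The difference between the two rule sets can be checked with Python's standard-library `urllib.robotparser`. This is only a rough sketch: the host name is the one used in this thread, the helper `allowed` and the file name `page.html` are made up for illustration, and real crawlers may treat edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, url, agent="*"):
    """Return True if `agent` may fetch `url` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


# The rule set from the first post, and the narrower one suggested in the reply.
block_all = ["User-agent: *", "Disallow: /"]
block_one = ["User-agent: *", "Disallow: /domain2/"]

host = "http://www.root.maindomain.com"

# "Disallow: /" shuts out every path on the host, including the home page.
print(allowed(block_all, host + "/"))
print(allowed(block_all, host + "/index.html"))

# "Disallow: /domain2/" only shuts out that one folder.
print(allowed(block_one, host + "/index.html"))
print(allowed(block_one, host + "/domain2/page.html"))
```

Note that the rules are matched against URL paths on one host, not against folders on disk.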

Anyhow wouldn’t using 301s to force everything to a single correct URI be better than blocking stuff?
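
As a rough illustration of the 301 idea, the function below works out where a request on the alias host could be redirected. It is a sketch under assumptions, not a server config: the folder-to-domain mapping is guessed from the layout described in this thread, and the names `FOLDER_TO_HOST`, `ALIAS_HOST` and `redirect_target` are hypothetical. On a real host the same logic would normally live in the web server's rewrite rules.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from subfolder name to the domain that serves it,
# based on the folder layout described in this thread.
FOLDER_TO_HOST = {
    "maindomain": "www.maindomain.com",
    "domain2": "www.domain2.com",
}

# The catch-all subdomain that exposes every subfolder.
ALIAS_HOST = "www.root.maindomain.com"


def redirect_target(url):
    """Return the URL a 301 should point to, or None if no redirect applies."""
    parts = urlsplit(url)
    if parts.hostname != ALIAS_HOST:
        return None
    # First path segment is the subfolder; the rest is the path on the real domain.
    folder, _, rest = parts.path.lstrip("/").partition("/")
    host = FOLDER_TO_HOST.get(folder)
    if host is None:
        return None  # e.g. /stuff/ has no domain of its own
    target = f"{parts.scheme}://{host}/{rest}"
    if parts.query:
        target += "?" + parts.query
    return target


print(redirect_target("http://www.root.maindomain.com/domain2/page.html"))
print(redirect_target("http://www.root.maindomain.com/stuff/notes.html"))
```

With redirects like this in place, each page has a single URL for engines to index, and only folders with no domain of their own would still need a robots.txt rule.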
Old 09-13-2008   #3
bikeman
Member
 
Join Date: Sep 2008
Posts: 35
Re: removing subdomains with robots.txt

The reason I have set up the subdomain www.root.maindomain.com is simply that I have folders with domain-unrelated stuff, such as www.root.maindomain.com/stuff/, which I don't want indexed but still want to be able to access via the URL.

I simply want to prevent all indexing of www.root.maindomain.com/stuff/
but allow indexing of
www.maindomain.com

the folder structure is:

\root\ <-- here is where I am putting the robots.txt

\root\stuff\ <-- this is not to be indexed

\root\maindomain\ <-- this is www.maindomain.com and needs to be indexed

My thinking is that when engines visit www.root.maindomain.com they WILL find my robots.txt in the root dir and then stop, so won't index www.root.maindomain.com or www.root.maindomain.com/stuff/

But when they come across www.maindomain.com they will not find a robots.txt since there is not one present in folder \root\maindomain\ and they will then happily crawl/index www.maindomain.com

?

Last edited by bikeman : 09-13-2008 at 10:56 AM.
Old 09-14-2008   #4
JohnW
 
Join Date: Jun 2004
Location: Virginia Beach, VA.
Posts: 976
Re: removing subdomains with robots.txt

>I simply want to prevent all indexing of www.root.maindomain.com/stuff/
but allow indexing of
www.maindomain.com

For this, the robots file will need to be available here: www.root.maindomain.com/robots.txt
and contain
User-agent: *
Disallow: /stuff/

But each domain, sub-domain and sub-sub-domain (like www in this case) may need its own robots file. Hard to say without understanding your config. For example, make sure that /stuff/ folder content isn't still available for indexing via some other path like
root.maindomain.com/stuff/
www.maindomain.com/stuff/
maindomain.com/stuff/
etc.
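
To make the "one robots file per host" point concrete: crawlers look for robots.txt at the root of whatever host name appears in the URL, regardless of how folders are laid out on disk. The short sketch below derives that location for each of the paths listed above. The helper name `robots_url` is made up for illustration, and plain `http` is assumed.

```python
from urllib.parse import urlsplit


def robots_url(page_url):
    """Return the robots.txt URL a crawler would consult for `page_url`."""
    parts = urlsplit(page_url)
    # Only scheme and host (plus port, if any) matter; the path is ignored.
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


# The same /stuff/ folder reached through four different host names.
pages = [
    "http://www.root.maindomain.com/stuff/",
    "http://root.maindomain.com/stuff/",
    "http://www.maindomain.com/stuff/",
    "http://maindomain.com/stuff/",
]

for page in pages:
    print(page, "->", robots_url(page))
```

Each of the four host names gets its own robots.txt URL, so if /stuff/ really is reachable through more than one of them, each one needs to serve a file that disallows it.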