#1
removing subdomains with robots.txt
My folder structure comprises a subfolder off root for each domain, as follows:

root\maindomain
root\domain2
root\domain3

with each domain accessed via www.domain1.com, www.domain2.com, etc.

However, I also have a subdomain, www.root.maindomain.com, which can be used to access any other domain's subfolder:

www.root.maindomain.com/domain1/
www.root.maindomain.com/domain2/

I don't want www.root.maindomain.com/domain2/ etc. listed, so I placed a robots.txt in the root folder:

User-agent: *
Disallow: /

I am now concerned that, because each domain is in a subfolder off \root\, the robots.txt will also affect the listings of all the domains. Does the folder structure have any effect on robots.txt?
#2
Re: removing subdomains with robots.txt
Welcome to SEW, bikeman. Not sure why there has to be such an odd structure, but I assume it has to do with how your host handles addon domains?

>I don't want www.root.maindomain.com/domain2/ etc listed so I placed a robots.txt in the root folder: User-agent: * Disallow: /

Not 100% sure I understand your setup, but I think what you want here is this:

User-agent: *
Disallow: /domain2/

In a normal situation, what you have proposed will block everything, including, for example, www.root.maindomain.com/ or www.root.maindomain.com/index.html etc.

Anyhow, wouldn't using 301s to force everything to a single correct URI be better than blocking stuff?
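To see the difference between the two rule sets, here is a rough sketch using Python's standard urllib.robotparser. The sample paths (index.html, page.html) and the http:// scheme are assumptions made up for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

def build_parser(robots_text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser

# The rule set from the original post: blocks the whole host.
block_all = build_parser("User-agent: *\nDisallow: /")

# The narrower rule set suggested above: blocks only /domain2/.
block_domain2 = build_parser("User-agent: *\nDisallow: /domain2/")

HOST = "http://www.root.maindomain.com"
# Hypothetical sample paths, chosen only to illustrate the matching.
paths = ["/", "/index.html", "/domain2/", "/domain2/page.html"]

def allowed_paths(parser):
    """Return the sample paths a generic crawler ('*') may fetch."""
    return [p for p in paths if parser.can_fetch("*", HOST + p)]

print("Disallow: /          ->", allowed_paths(block_all))
print("Disallow: /domain2/  ->", allowed_paths(block_domain2))
```

With "Disallow: /" none of the sample paths are fetchable, while "Disallow: /domain2/" leaves the host's root and index page open and blocks only what sits under /domain2/.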
#3
Re: removing subdomains with robots.txt
The reason I have set up the subdomain www.root.maindomain.com is simply that I have folders with domain-unrelated stuff, such as www.root.maindomain.com/stuff/, which I don't want indexed but which I still want to be able to access via the URL.

I simply want to prevent all indexing of www.root.maindomain.com/stuff/ but allow indexing of www.maindomain.com.

The folder structure is:

\root\ <-- here is where I am putting the robots.txt
\root\stuff\ <-- this is not to be indexed
\root\maindomain\ <-- this is www.maindomain.com and needs to be indexed

My thinking is that when engines visit www.root.maindomain.com they WILL find my robots.txt in the root dir and then stop, so they won't index www.root.maindomain.com or www.root.maindomain.com/stuff/. But when they come across www.maindomain.com they will not find a robots.txt, since there is not one present in folder \root\maindomain\, and they will then happily crawl/index www.maindomain.com?

Last edited by bikeman : 09-13-2008 at 10:56 AM.
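For reference, crawlers look for robots.txt per hostname, at the path /robots.txt on whatever host they are crawling, so what counts is which file each hostname serves at that URL rather than where the file sits on disk. A minimal sketch of that lookup follows; the page URLs and the http:// scheme are assumptions for illustration. Whether \root\robots.txt is also reachable as www.maindomain.com/robots.txt depends on how the host maps document roots, which is not known here.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the robots.txt URL a crawler would consult for page_url.

    Only the scheme and host are kept; the path is always /robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

# Hypothetical page URLs on the two hostnames discussed in this thread.
for url in (
    "http://www.root.maindomain.com/stuff/page.html",
    "http://www.maindomain.com/index.html",
):
    print(url, "->", robots_url(url))
```

Under the assumption that www.maindomain.com is served from \root\maindomain\, a request for www.maindomain.com/robots.txt would look in that folder and never see the file in \root\.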
#4
Re: removing subdomains with robots.txt
>I simply want to prevent all indexing of www.root.maindomain.com/stuff/ but allow indexing of www.maindomain.com

For this, the robots file will need to be available here:

www.root.maindomain.com/robots.txt

and contain

User-agent: *
Disallow: /stuff/

But each domain, sub-domain and sub-sub-domain (like www in this case) may need its own robots file. Hard to say without understanding your config. For example, make sure that the /stuff/ folder content isn't still available for indexing via some other path, like:

root.maindomain.com/stuff/
www.maindomain.com/stuff/
maindomain.com/stuff/
etc.
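The alias check above can be sketched in Python with the standard urllib.robotparser. The robots_by_host table is entirely hypothetical: it stands in for whatever each hostname really serves at /robots.txt (an empty string meaning no rules), and it assumes /stuff/ is reachable on those hosts at all, which depends on the server config.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents per hostname; replace with what each
# host actually serves. An empty string stands for "no rules found".
robots_by_host = {
    "www.root.maindomain.com": "User-agent: *\nDisallow: /stuff/",
    "root.maindomain.com": "",
    "www.maindomain.com": "",
}

def is_blocked(host, path, robots_text):
    """True if a generic crawler ('*') may not fetch path on host."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return not parser.can_fetch("*", f"http://{host}{path}")

# Hostnames where /stuff/ would still be open to crawlers.
unprotected = [
    host
    for host, text in robots_by_host.items()
    if not is_blocked(host, "/stuff/", text)
]
print("Hosts leaving /stuff/ crawlable:", unprotected)
```

In this made-up table only www.root.maindomain.com carries the Disallow rule, so the other two hostnames show up as unprotected. That is the situation to look out for if they serve the same folder.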