ucool
06-01-2005, 11:00 PM
OK guys, and particularly GoogleGuy, I warn you this is long and complex. I also beg GoogleGuy to get in touch with me, since this is probably the best and most difficult case of the 302 redirect problem you will ever see. I also beg the webmaster of WebmasterWorld not to move this to another place where it goes with the other 302 stuff that GoogleGuy might not see. I need GoogleGuy to see this; moreover, everyone does if this problem is ever to be fixed, and this is possibly the best case of it. Also, as it says here:
http://googleguy.zorgloob.com/2005/04/googles-302-redirect-problem.asp
"Thank you for the feedback that people have given us about 302s. I'd be interested to hear if anyone sees a result where site:yoursite.com returns urls from domains other than yoursite.com. You might want to wait another few days before checking though, to give things time to get fully out. I have to duck out right now, but I'll try to stop by and give more details as things are more fully deployed."
I had exactly such proof until very recently, and would still have it if I had not actively tried to fix the problem myself. If anyone knows another way of getting in touch with Google or GoogleGuy about this, please let me know.
OK, here we go. The domain in question is www.warez.com. Before anyone barks at me about the nature of such a domain: we have recently acquired it. It operates a very well-known P2P network, Warez P2P, now in the top 10 of download.com, and its intention is NOT software piracy. We are soon to release a brand new site with the same designers as Ford + Nintendo, and turn warez.com into a hip amateur music portal. But this problem we have with Google, I fear, is going to undo my business, and I don't know what to do. www.warez.com is top of Yahoo and MSN for the keyword "warez", and this is how it should be. It also used to be top of Google. The current warez.com site was accepted into AdWords for a long while, but this is where the problem starts.
Another site we acquired, www.warezcrawler.net, was set up to use the same zone on our server as warez.com. Both sites were serving AdSense ads. One morning I got an email from Google saying "both warez.com + warezcrawler have been banned from serving adsense ads because the sites distribute client software.". To be fair to Google, this does break their policy (although they break this rule themselves with LimeWire's forums:
http://www.gnutellaforums.com/forumdisplay.php?s=&forumid=7).
Anyway, my Google AdSense account was not terminated; they just banned those URLs. Fair enough, I thought. Now, AROUND the same time, but not exactly the same time, www.warez.com suddenly dropped out of the Google index, and www.warezcrawler.net replaced its URL. It was still on the front page, but now 5th. So you might immediately think the problem lies in the fact that both sites have the same content and an incorrect redirect happened... but this is not what happened. After reading around on the problem, I conducted a search for the direct URL "www.warez.com". The result was shocking: a site called www.goink.com had basically hijacked it. The listing for www.warez.com had the title "search the web", the description "visit here to search the web", and the URL in the status bar was "goink.com". The result was different depending on whether you typed www.warez.com or warez.com. If you typed www.warez.com, the URL for goink would be displayed, and this actually went to the goink.com site, which was a sponsored listings landing page. But if you typed just "warez.com", the URL for the hijack would be "goink.com", and for some bizarre reason this went to warez.com! Observe the following telnet request data:
"GET / HTTP/1.1
Host: www.goink.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
HTTP/1.1 302 Object moved
Server: Microsoft-IIS/5.0
Date: Tue, 25 Jan 2005 01:30:32 GMT
Location: http://www.warez.com" (n.b. some bits removed)
So it was clear that in fact www.warezcrawler.net was not the problem; this goink site was. A look at archive.org reveals that, in fact, the site had been redirecting to www.warez.com since 2002!
http://web.archive.org/web/*/goink.com
You can check practically any snapshot, except those from the early 2000s, and it's going to www.warez.com.
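For anyone who wants to repeat this check on their own domain, here is a minimal sketch in Python of how the response above can be read. It only parses the raw text of a response head and reports whether it is a temporary or permanent redirect and whether it points at a different host. The function names are hypothetical, and the sample text is the trimmed response quoted above, so treat it as an illustration rather than a complete HTTP parser.

```python
from urllib.parse import urlsplit


def parse_response_head(raw: str):
    """Return (status_code, headers) from the head of a raw HTTP response."""
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    # Status line looks like "HTTP/1.1 302 Object moved"; the code is field 2.
    status_code = int(lines[0].split()[1])
    headers = {}
    for ln in lines[1:]:
        name, _, value = ln.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


def describe_redirect(requested_host: str, raw: str) -> str:
    """Summarise a redirect: temporary or permanent, same host or cross-domain."""
    status, headers = parse_response_head(raw)
    location = headers.get("location")
    if status not in (301, 302, 303, 307) or location is None:
        return "not a redirect"
    target_host = urlsplit(location).hostname or requested_host
    kind = "permanent" if status == 301 else "temporary"
    same = target_host.lower() == requested_host.lower()
    scope = "same-host" if same else "cross-domain"
    return f"{kind} {scope} redirect to {location}"


# The response head quoted in the post (already trimmed there).
RESPONSE = """HTTP/1.1 302 Object moved
Server: Microsoft-IIS/5.0
Date: Tue, 25 Jan 2005 01:30:32 GMT
Location: http://www.warez.com"""

result = describe_redirect("www.goink.com", RESPONSE)
print(result)  # temporary cross-domain redirect to http://www.warez.com
```

A temporary cross-domain redirect is exactly the pattern discussed in the 302 threads: the redirecting URL can end up listed with the target's content.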
At this point I had not read the advice on this forum that I should have put the NOINDEX code on my site and requested the removal of goink.com, which would probably have fixed the hijack. Instead, I went down the long road of contacting the owners of goink.com and getting them to stop the redirect (they seemed clueless as to why it was happening, but they fixed it). So I waited patiently for the domain to be reincluded. At this point I had not touched www.warezcrawler.net.
Two weeks later the problem got worse. A search for www.warez.com revealed a new hijack URL: versiontracker.com!! VersionTracker had listed our product Warez P2P on their site, and linked to the www.warez.com homepage using a meta refresh. The result was another hijack, exactly the same. A search for the warez.com domain revealed a URL like "tc.versiontrack.com/product/..." or something like that. I was stunned. So I went down the route of contacting VersionTracker. They agreed to remove the software, and changed their method of linking to publisher sites to a JavaScript one to avoid the problem in the future. At last, I thought, the problem would be fixed. How wrong I was...
After several weeks, a search for www.warez.com and warez.com now revealed a "dead" URL. It was there, it didn't say "it couldn't be found", but it had no title and no description. It just read "warez.com/", just a blue link, no other text. So what had happened to it? Why was it lifeless? I considered the possibility that the identical content of warezcrawler.net and warez.com was now causing a conflict, and that this could be the reason warez.com was missing. So, finally, I thought, a 301 PERMANENT redirect to warez.com from warezcrawler.net would surely fix the problem... After a while of waiting... the result?
www.warezcrawler.net disappeared from the index... it was no longer 5th, no longer showing under the term "warez". A search for www.warezcrawler.net revealed "no data on the site could be found". I had no idea what to do. After waiting a bit, I decided that this was not the way to go. I removed the 301 redirect, and within a matter of days www.warezcrawler.net had appeared back in the index again, just as it was before!
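To be clear about what I mean by the 301, here is a rough sketch in Python of the redirect logic. This is not the actual server configuration (that is not shown in this post); the names CANONICAL_HOST, ALIAS_HOSTS and redirect_for are made up for illustration. The idea is simply that any request arriving for the warezcrawler.net host names is answered with a 301 and a Location header pointing at the same path on www.warez.com.

```python
# Hypothetical names, for illustration only.
CANONICAL_HOST = "www.warez.com"
ALIAS_HOSTS = {"warezcrawler.net", "www.warezcrawler.net"}


def redirect_for(host_header: str, path: str = "/"):
    """Return (301, headers) if the request hit an alias host, else None."""
    # Drop any ":port" suffix and normalise case before comparing.
    host = host_header.split(":")[0].strip().lower()
    if host in ALIAS_HOSTS:
        return 301, {"Location": f"http://{CANONICAL_HOST}{path}"}
    return None  # request is already on the canonical host: serve normally


print(redirect_for("www.warezcrawler.net", "/"))
print(redirect_for("www.warez.com", "/"))
```

The point of using 301 rather than 302 is that it tells the spider the move is permanent, so the target URL is the one that should be kept.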
A search for site:warez.com reveals other subdomain results for the domain, so I thought the site couldn't be banned! So, after reading that a removal request for a hijacker's results worked, I thought, let's give the domain a fresh start. I added the NOINDEX code to www.warez.com, and requested removal of the URL. It was removed successfully within a few days. A search for "warez.com" now reveals "no data can be found about the site", as opposed to just a lifeless blue link. I thought perhaps this was progress. So now, I thought, put the 301 permanent redirect from warezcrawler.net to warez.com and you're sure to fix the problem. This was a week or so ago. The result? Once again, www.warezcrawler.net has disappeared from the first page for "warez", and is now just as dead as www.warez.com. This time I'm going to leave the redirect in place, as it's totally normal. I want to establish what is going on with www.warez.com, and why it's just failing to be spidered. It's a class A domain with a lot of sites linking to it, and it's top of Yahoo and MSN for "warez". None of this "optimisation" talk is the reason for it not being spidered at all. Is it the Google AdSense ban? Is it the hijacking? What could it be? Current status: a search for warez.com or warezcrawler.net just returns "no data can be found". site:warez.com returns search.warez.com results.
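In case it helps anyone checking their own pages, here is a small sketch in Python that scans a page's HTML for the two tags that came up in this story: a robots meta tag containing noindex (I am assuming that is what the "NOINDEX code" amounts to), and a meta refresh like the one VersionTracker used. The class and function names are hypothetical, and the sample HTML strings are made up for the example.

```python
from html.parser import HTMLParser


class MetaScanner(HTMLParser):
    """Collects a robots noindex flag and any meta refresh target."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.refresh_target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = {k.lower(): (v or "") for k, v in attrs}
        content = a.get("content", "")
        if a.get("name", "").lower() == "robots" and "noindex" in content.lower():
            self.noindex = True
        if a.get("http-equiv", "").lower() == "refresh":
            # content looks like "0; url=http://example.com/"
            _, _, rest = content.partition(";")
            key, _, url = rest.strip().partition("=")
            if key.strip().lower() == "url" and url:
                self.refresh_target = url.strip().strip("'\"")


def scan(html: str):
    """Return (has_noindex, refresh_target_or_None) for an HTML document."""
    scanner = MetaScanner()
    scanner.feed(html)
    return scanner.noindex, scanner.refresh_target


# Made-up sample pages for illustration.
print(scan('<head><meta name="robots" content="noindex"></head>'))
print(scan('<meta http-equiv="refresh" content="0; url=http://www.warez.com/">'))
```

Remember to take the noindex tag back out once the removal has gone through, otherwise the page will stay out of the index.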
I can't find it now, but a note on the Google site says that a 301 redirect can take up to 6-8 weeks to be followed. But I can't think the initial site would just drop out of the index as a result? Also, perhaps removing my index URL completely from Google causes some kind of temporary ban? I only removed the index page, but I've read that a robots.txt removal puts a 180-day temporary forced removal on the URL. Perhaps this would be the ultimate way to fix the problem? A full removal and half a year without a listing... But perhaps the thing would be fixed afterwards?
Anyway, I'm entirely at a loss, and I don't believe anyone has had quite as big a problem as this regarding 302 redirects.
I have emailed Google with the "canonicalpage" title as recommended in other threads and by GoogleGuy, and I have submitted on the help page with "reinclusion request" as suggested by another page referencing GoogleGuy again. I've re-applied to AdSense to try and get my page spidered... As you can see, I've tried practically every possible combination of things to get the site back in the index... and still nothing!
I guess my final words are...PLEASE HELP GOOGLEGUY, and anyone that may think they know where to go from here...
Kind Regards,
Jay
contact: see profile