Search Engine Watch
SEO News

Old 10-11-2004   #1
Marcia
 
Join Date: Jun 2004
Location: Los Angeles, CA
Posts: 5,476
"Industrial Strength White Hat" Cloaking Questions

Unfortunately, Yahoo's index is filling up with listings that show the URL of one site but, because that site's pages link out through 302 redirects, display the title, description and cache of the site being linked to (the one we're redirected to). Those listings turn up for the search terms in place of the URL for the proper site.

What this is doing in some cases is causing some site owners to become very alarmed because of their page "disappearing" from the index. At times it may be accidental, with a site owner not knowing their script is doing that, but in other cases it's outright intentional and with full knowledge. Most of what I've seen appears to be on purpose. In fact, in some cases the 302 has caused some sites to disappear altogether from Google's index - by hijacking.
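To see whether a particular link goes through a 302, the response has to be inspected without following the redirect. The following is only a minimal sketch, assuming Python's standard `http.client` (which does not follow redirects on its own); the function names `classify_redirect` and `head_without_following` are hypothetical helpers made up for illustration.

```python
import http.client
from urllib.parse import urlsplit

# Human-readable meaning of the common redirect status codes.
REDIRECT_CODES = {
    301: "permanent",
    302: "temporary",
    303: "see other",
    307: "temporary",
    308: "permanent",
}


def classify_redirect(status, location):
    """Describe a response as a redirect (or not) from its status and Location header."""
    if status in REDIRECT_CODES and location:
        return f"{status} {REDIRECT_CODES[status]} redirect to {location}"
    return f"{status} no redirect"


def head_without_following(url, timeout=10):
    """Send a single HEAD request and return (status, Location) without following it."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

Calling `classify_redirect(*head_without_following(url))` on the link in question shows whether it answers with a 302 and where it points. Some servers treat HEAD differently from GET, so a result from this sketch is a hint rather than proof.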

But this is about Yahoo, because I'm seeing more and more almost daily when surfing or being asked to look at people's sites and checking further. Not being a cloaker myself, I'd like to understand more, and have a couple of questions.

Question 1:
Assuming it's the actual target page content the rankings are based on, whose link popularity (and anchor text) is being counted - the site linking out or the site linked to that's replaced with the other URL in the index?

Question 2:
We see the title, description and cache of the linked_to site - BUT - isn't it possible that in actuality, what we're seeing is what our user agent (like IE) is delivering, but in reality it is really cloaked pages being served to Yahoo?

Couldn't the "phony" cache be just a tactic to throw us off from knowing that there's been custom delivery suited to Yahoo's taste, assuming that the offending site is ranking for their actual targeted keywords - which we won't even know about?
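One rough way to test the idea in Question 2 is to request the same URL under a browser User-Agent and a spider User-Agent and compare what comes back. This is a sketch under stated assumptions: the helper names `fetch_as` and `differs_by_user_agent` are hypothetical, the agent strings passed in are whatever the caller chooses, and the check can only catch delivery that keys on the User-Agent header. Delivery keyed on the spider's IP address would not show up this way, and pages that change on every request will differ for innocent reasons.

```python
import hashlib
import urllib.request


def fetch_as(url, user_agent, timeout=10):
    """Fetch url while presenting the given User-Agent string; return the raw body."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def differs_by_user_agent(url, agents, fetch=fetch_as):
    """Return True if the body served at url is not identical for every agent.

    `fetch` is injectable so the comparison can be exercised without a network.
    """
    digests = {hashlib.sha256(fetch(url, ua)).hexdigest() for ua in agents}
    return len(digests) > 1
```

A `True` result only says the bodies differ between the agents tried; it does not by itself show intent.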

Question 3:
This may be the most important of all. There's a growing number of site owners terribly upset because for no reason they can figure out, their sites are disappearing altogether from the Yahoo index. Some may not be, but let's assume that the majority of those it's happened to really are clean sites, as the owners claim. Is there any recourse for them?

Question 4:
Would Yahoo remove, or can they be removing, clean sites in their entirety in cases like this, blaming them when they really aren't at fault at all, while hijackers get to stay in? Could that happen, and how much of a problem could all this turn out to be for a search engine's search quality?

Last edited by Marcia : 10-12-2004 at 06:57 AM.
Old 10-12-2004   #2
kenpomachine
lurker
 
Join Date: Sep 2004
Location: Spain
Posts: 35
I think part of the problem resides in Yahoo directory as well. One of our domains is in the directory, but it's not getting titles or content updated in Yahoo. I suppose you can call this a mirror site even though it must be feeding from the same database and server as the current one, but the IT department isn't able to redirect it (they use a windows server, and I guess they don't want to do the task).

So if on the one hand they're not updating sites in their index and on the other you have some guys redirecting pages, you end up with a pretty mess.

And then we have the main site, which is still showing in the SERP, but has had a page "disappear" from its previous #1 spot in the results. It's not the whole site which has disappeared, but only that page as far as I know, and it's still indexed. As far as I can tell, this page isn't a mirror of any other and isn't cloaked. So as far as question #4 goes, I don't think they're removing whole sites.

But then, these things may be only showing in yahoo.es and could be completely unrelated to what you're asking. And I'm also lost there. Sorry for the rant.
Old 10-12-2004   #3
Marcia
 
 
Join Date: Jun 2004
Location: Los Angeles, CA
Posts: 5,476
I'm not 100% clear on these points:

Quote:
I think part of the problem resides in Yahoo directory as well. One of our domains is in the directory, but it's not getting titles or content updated in Yahoo.
Sites with Directory listings usually show the Directory title and description so those don't change even if the "real" title or page changes - main Yahoo here in the U.S. also. Is that what's not changing, or is it the site itself that's not being freshly indexed with updates, as shown in the cache?

Quote:
I suppose you can call this a mirror site even though it must be feeding from the same database and server as the current one, but the IT department isn't able to redirect it (they use a windows server, and I guess they don't want to do the task).
Are there two different sites, one of which is in the Directory and one that isn't?

Sites can be redirected with Windows servers, it's just done differently than with *nix/Apache. For that info you can ask over in the forum for dynamic sites; it's very simple.

This morning I happened to come across this post from back in July; it must have been when I first started noticing and paying attention to the redirects:

http://forums.searchenginewatch.com/...read.php?t=658
Old 10-12-2004   #4
kenpomachine
lurker
 
Join Date: Sep 2004
Location: Spain
Posts: 35
Quote:
Originally Posted by Marcia
Sites with Directory listings usually show the Directory title and description so those don't change even if the "real" title or page changes - main Yahoo here in the U.S. also. Is that what's not changing, or is it the site itself that's not being freshly indexed with updates, as shown in the cache?
Ok, it should be that then, because the cache is like a week old. Funny thing is that MSN is showing almost the same result, but it has taken the title from the page.

Quote:
Originally Posted by Marcia
Are there two different sites, one of which is in the Directory and one that isn't?
It's only one site, with two domains. At least that's what I think from how it works. Actually one is a domain and the other is a subdomain under the company domain, i.e. www.domain.es and www.supermarket.company.es, with "company" being different from "domain".
Only the domain is in the directory; the subdomain isn't, and it isn't indexed either.

Quote:
Originally Posted by Marcia
Sites can be redirected with Windows servers, it's just done differently than with *nix/Apache. For that info you can ask over in the forum for dynamic sites; it's very simple.
When I asked the IT manager for the redirection he told me it couldn't be done, but the engineer who was with him said otherwise. That's why I believe he (the IT manager) doesn't want to bother with this. Problem is, the name of the domain is not being used in any advertising either on or offline and we're not getting as many inbound links as we should.
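For reference, the redirect being asked for amounts to answering every request on the secondary hostname with a permanent (301) redirect to the same path on the main one. The sketch below shows that at the application level as a WSGI app, not as Windows server configuration; `make_redirect_app` is a hypothetical name, and which of the two hostnames should be the target is an assumption left to the caller.

```python
def make_redirect_app(canonical_host):
    """Build a WSGI app that 301-redirects every request to canonical_host.

    The path and query string are preserved so deep links keep working.
    """
    def app(environ, start_response):
        path = environ.get("PATH_INFO", "/") or "/"
        query = environ.get("QUERY_STRING", "")
        target = f"http://{canonical_host}{path}"
        if query:
            target += "?" + query
        start_response(
            "301 Moved Permanently",
            [("Location", target), ("Content-Length", "0")],
        )
        return [b""]

    return app
```

For a quick local trial, an app built with `make_redirect_app("www.domain.es")` can be served through `wsgiref.simple_server.make_server` from the standard library.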
Old 10-13-2004   #5
Phoenix
Member
 
Join Date: Jun 2004
Location: Austin, Texas
Posts: 97
I am not all that familiar with this issue. I have read up on it, but have not encountered it yet. Can you provide an example search query so we could take a look? But in regards to this question:

Quote:
Some may not be, but let's assume that the majority of those it's happened to really are clean sites, as the owners claim. Is there any recourse for them?
Use the "Help us improve your search experience. Send us feedback." link at the bottom of the page for the specific query, to inform Yahoo about the particular problem. If the url is missing, you can also try to submit it this way.

The other way I can think of to fix this, but it's just a guess, would be to have Yahoo completely remove the site from the index (if it's being hijacked), and then submit the URL again for spidering.
Old 10-13-2004   #6
Brad
A Usual Suspect
 
Join Date: Jun 2004
Posts: 111
I have seen this happen in an innocent context. The problem started in Yahoo when they rolled out their post-Inktomi crawler and algo. Prior to that there never seemed to be an issue.

I am seeing it happen with several directory scripts which seem to be PHP based, where the directory redirect link becomes the URL for the site being listed in Yahoo.

I have a site that this has happened to, and the link pop drops because Yahoo starts thinking ...link.php?id=XXXX is the real URL, which of course nobody is linking to. This despite the site in question being listed in the Yahoo directory.

Deleting the listing from the directory did not solve the problem, because then some PhpNuke site's redirect link started (innocently) showing up as the URL on the next crawl. It is like there is a queue of redirects waiting for their turn. All the sites using the redirects are innocently linking to my site and not trying to hijack.
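For anyone who hasn't looked inside one of these directory scripts, what they do is roughly this: look up the listing by its id and answer with a 302 and a Location header pointing at the listed site. The scripts in question are PHP; the sketch below is a Python stand-in written only to show the mechanism, and `link_redirect` and the `listings` mapping are hypothetical names, not taken from any real directory package.

```python
from urllib.parse import parse_qs


def link_redirect(query_string, listings):
    """Return (status_line, headers) for a request such as link.php?id=42.

    `listings` maps listing ids (as strings) to the URL of the listed site.
    """
    ids = parse_qs(query_string).get("id", [])
    target = listings.get(ids[0]) if ids else None
    if target is None:
        return "404 Not Found", []
    # A temporary redirect: the spider is told the content lives at `target`
    # for now, while the address it requested remains the link script's URL.
    return "302 Found", [("Location", target)]
```

Nothing in such a script is deceptive by itself; the odd listings come from how the crawler decides which URL to show for the content it ends up fetching.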

The problem does not seem consistent. Most sites listed in the above directories are not having the problem. This leads me to speculate that there might be another factor at work.

Marcia, I'm not sure if this is answering your question, but the redirect does affect link pop for the hijacked site. The above was reported to Yahoo many months ago with no response.
Old 10-13-2004   #7
fantomaster
Industrial-strength cloaker
 
Join Date: Sep 2004
Location: Belgium
Posts: 70
Industrial-strength cloaking explained

Just to set a few points straight since you're using our promo
string "Industrial-Strength Cloaking" within what is essentially an unrelated
context:

1. "Industrial-Strength Cloaking" (or IP Delivery) in the strict sense of the
term (as we use it, having invented the phrase) is conducted on separate
sites: one "Shadow Domain" (SD) and one "Core (or: Main) Domain" (CD),
the latter being the one you'd direct your human visitors to.
Mixing cloaked and non-cloaked pages on the same domain (and IP, for that
matter), while technically possible, is not recommended in view of the
engines' declared anti-cloaking policy.
It's only the search engine spiders that will be fed the cloaked domain's
content - and, hence, only the SD's ranking will be affected by the cloaking
efforts. There's ways and means to garner incoming links to SDs (albeit
obviously somewhat artificially), but as that's not the issue at hand I won't
go into it any further here.
What's important to note is that because the spiders won't be redirected,
there'll be no redirection for them to detect and, hence, no penalization.

2. "Hijacked" domains don't relate to cloaking at all and, frankly, the rationale
of your mentioning them within this context simply beats me. "Hijacking
domains" (a very rare occurrence in any case) may be many things but it's
certainly not "cloaking"!

3. Ever and again you'll find orphaned domains which used to be ranked (well
or not) when still active being taken over by a new proprietor. Typically, the
traffic they continue to generate will be redirected to some other target site,
be it infinitely (in which case it's a sort of parasitic traffic gobbling behavior
on the part of the new registrants or buyers but quite legit) or for a limited
time until the new domain owners have put their own proprietary content up.
This may account for at least part of the phenomenon you're describing.

Note that there's lots of inactive domains still being blithely listed with
Yahoo! (and in ODP) and hunting them out to pick up their traffic via
re-registration has turned into quite a cottage industry in its own right ...

4. Domains can disappear from search engine indices for tons of reasons, but
when analyzing the actual causes it's certainly not helpful to always
assume "cloaking" as the first and prime suspect. (Not saying that you did,
but it's a fairly common gut level response, usually with people who don't
know the least bit about cloaking in the first place. This is very much akin to
the "Here be dragons" maps of yore and more often than not just as
mythical ...)
The chances of losing a professionally crafted cloaked domain in a search
engine index, while technically quite real, are, in actual practice, extremely
remote. There's lots of reasons for this I won't bore you with here, but if
you'd like to discuss it at any length, perhaps you'd want to set up a new
thread. Thus, any professional analysis will assign the possibility of cloaking
being the cause a fairly low priority.

5. "Phony cache": sorry, but there simply ain't no such animal! A search
engine's cache is a snapshot of what the search engine spider has actually
crawled ("seen and stored", you might say). And the cache is what the
search engine ranking mechanism will normally take for a basis, though many
other factors will be added on top of the onsite criteria.
The only way a webmaster or -mistress can change cached content is by
setting up new content and having the spider crawl and reindex it - in
which case, again, it wouldn't be "phony".

However, there's methods of redirecting searchers who take a peek at the
SERP's cache.
This can be (and is being) done for lots of reasons, all of them quite
legitimate. Here's only three of them:
a) caching tends to break up relative links on a web page which may botch
up navigation, proper display of graphics, etc.;
b) the page content is copyright protected so the search engine has no right
in the world to store and use (display) it within another environment (frames,
SE tag in the header, etc.) without the copyright holder's express
permission;
c) a cache may be quite dated, depending on the search engine's indexing
cycle, so you wouldn't want the searcher to be confused by obsolete
information, etc. etc.

Last edited by fantomaster : 10-14-2004 at 01:24 AM. Reason: Edited some typos + ommissions.
Old 10-14-2004   #8
Phoenix
Member
 
Join Date: Jun 2004
Location: Austin, Texas
Posts: 97
Quote:
3) a cache may be quite dated, depending on the search engine's indexing
cycle, so you wouldn't want the searcher to be confused by obsolete
information, etc. etc.
I can attest to this. I just found an example of it tonight in Google. The following site has been live for around 3-4 months, but due to duplicate content, Google has reverted the cache to what it was before the site went live, even though a month ago they displayed the "real" cache. Look at the cache of the site (index page), and then visit the live site. See the difference? Kinda odd, but it's one way they deal with this.

http://www.google.com/search?hl=en&q...estments.co m