|
#1
Cloaking 101 - Questions and Answers
Fantomaster, our resident "industrial strength cloaker", very kindly agreed to let me take a post of his out of its somewhat buried context and post it here for some questions. I found it through seobook's blog; thanks to both of you.
Quote:
|
|
#2
>>There's ways and means to garner incoming links to SDs (albeit obviously somewhat artificially)
What kind of methods? - Setting up networks of your own sites under your own control? - blog comments? - any clarification on this would be most interesting... Quote:
Quote:
Quote:
Thanks again, fascinating subject. I dabble a little myself, but I'm so far from fanto's league that it'd be great to get an informed discussion going on this without all the right/wrong stuff getting in the way...
Cheers
Nick
Last edited by Nick W : 10-15-2004 at 05:53 PM.
|
#3
Phew, that's quite a set of questions you've uploaded for me here!
But ok, let's get started. Quote:
however much degraded and slandered they may have been by orthodox wisdom, can do the trick nicely. Obviously, you won't normally get many links from human edited directories, which is why some cloakers will resort to bait-and-switch once their sites have been indexed. We don't ourselves recommend doing this (it can create a lot of hard feelings and may be highly detrimental to your public image), but that's not to say that it won't work technically.
Another good trick is installing the Google toolbar in spy, er, "enhanced" mode (i.e. activate PageRank) and surfing your SDs: this will send information to the Google servers and they'll send a spider along to check it out. While technically this approach doesn't give you any incoming links, we've seen pages getting ranked that way nevertheless. Don't hold your breath, though: getting good linkage is always a smart move, so don't rely solely on this technique in lieu of it!
Quote:
go for pertinent names for your SDs. Say your Core Domain is "bestwidgets.com". (Now haven't we heard that one before ...) If your SDs are named, say, "allwidgets.com" and "widgetsgalore.com", this will be some help when people are searching for "widgets" and see your SDs displayed in the SERPs. Since Google's SERPs will usually display pure concatenated gibberish for site descriptions, based on their trashy proximity algo, your title tag content and your domain name will count for everything.
Note that this isn't the old and worn technique of including keywords in your domain names to achieve better rankings, because they won't. But for users looking for widgets, finding their keyword in a domain name will simply be one notch more of an incentive to actually click on it. Then, when they arrive on "allwidgets.com" instead of the SDs proper, the irritation factor will be a lot less than if you were to redirect them to "1234567.com" or similar. So this is really more about searcher psychology than ranking.
Quote:
and unless you have to pay through the nose to have such domains transferred to you if they're not free and to be had for the asking (aka new registration), this can give you just that bit of additional leverage you may require.
Quote:
Let's stick to just a few points: detecting cloaking reliably typically requires quite a bit of manpower. Sure, you can pre-filter cloaking indicators automatically, e.g. by accessing and saving the site first as a search engine spider and following this visit up via some unknown IP not identified as a spider. However, simply comparing the content automatically isn't reliable: there are so many sites out there displaying dynamic content that it can be a real nightmare discerning what is actually legit by the SEs' standards and what isn't. Moreover, there's browser specific content delivery, printer-friendly pages, content delivery management systems in general, etc. etc. All this will incur massive overhead in the personnel department and, hence, will blow up costs.
Considering that you'd have to cover billions of pages to weed out all cloakers, it only stands to reason that the search engines, while certainly not endorsing the practice, prefer to view cloaked sites as just so much tolerable white noise in their search results. So while the risk of detection is indeed real, to all practical purposes it's more of an academic scenario. In any case, no need to wax paranoid about it.
Of course, not everyone opts for "industrial-strength" cloaking - you'll still find tons of stupid outfits pretending to IP delivery via simplistic UserAgent redirection, low grade JavaScript redirects, refresh tag jokers, cheapos employing dated, incomplete or amateurish spider lists, etc. Those are quite easy to liquidate and good riddance, too!
(I'm sure Mikkel could contribute a lot more regarding this issue from the search engines' point of view as that's exactly one of the fields he was focused on when working for that Scandinavian search engine of his.)
Quote:
about cloaking or IP delivery: people getting all worked up about the "ethics" purportedly involved, trying desperately to be "white guys/gals" - and of course, expecting their due reward for the effort ... But as a client once responded when I put this issue to him: "What do you mean, 'ethics'? This is bloody war, man!" And so it is.
Last edited by fantomaster : 10-16-2004 at 11:59 PM. Reason: Corrected some typos and omissions.
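The automatic pre-filter mentioned in this post (save a page once as a spider, once from an IP not identified as a spider, then compare) can be sketched roughly as follows. This is only an illustration in Python: the function names, the crude tag stripping and the 0.5 threshold are assumptions, not anything a search engine is known to use, and the two HTML copies are assumed to have been fetched already. As the post points out, a low score only flags a page for human review, since dynamic content produces differences too.

```python
import difflib
import re


def visible_text(html: str) -> str:
    """Crudely strip tags and collapse whitespace (illustrative only)."""
    text = re.sub(r"<[^>]+>", " ", html)
    return " ".join(text.split()).lower()


def similarity(spider_html: str, visitor_html: str) -> float:
    """Return a 0..1 word-level similarity ratio between the two copies."""
    a = visible_text(spider_html).split()
    b = visible_text(visitor_html).split()
    return difflib.SequenceMatcher(None, a, b).ratio()


def needs_human_review(spider_html: str, visitor_html: str,
                       threshold: float = 0.5) -> bool:
    """Flag a page when the copies differ strongly.

    The threshold is an arbitrary assumption; a flag is a hint for a
    human reviewer, not proof of cloaking.
    """
    return similarity(spider_html, visitor_html) < threshold
```

Two identical copies score 1.0 and are not flagged; two copies sharing no words score 0.0 and are.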
|
#4
Thank you for your time, fantomaster; your answers were very educational. Here are a couple more questions for you:
(1) There are many reasons to use cloaking. But besides not wanting your competitors to rip off your hard SEO work, not wanting the search engines to cache your pages, and wanting to serve the same graphical copy of your pages to the SEs in text format --- what would be other reasons? Keeping ethics aside, can you give us some other reasons why SEOs would deploy cloaking, outside of the reasons I listed above? In other words, what makes a "black hat SEO" really black?
(2) You said in your response to NickW:
Quote:
Thanks again for your time.
|
#5
Quote:
Quote:
conducive to business, at the end of the day all it does is add even more mystique to an already highly over-mystified topic. Let's remember that what we are dealing with here is a purely technical procedure. And it doesn't do a lot to help with having to cater to a fundamentally uneducated public full of false expectations and irrational hangups about what is, after all, just another technique of serving customized content to varying targets. IMV, imposing a Dungeons & Dragons type of language on this process constitutes little more than a severe case of confusing essentially unrelated paradigms. (And yes, I did indeed note your putting the term in quotes - thanks for that!)
Obviously, search engine representatives will never tire of slamming cloaking (or again, much more precisely, IP delivery) as "trickery" and "unfair practice" as it allows boosting your SERP rankings without their express consent. What's more, cloaking is still the most effective and cost efficient approach to SEO extant. So yes, SEO is certainly a prime reason to go for cloaking for most if not all webmasters opting for this technology.
What it all really boils down to is an issue of control: search engines - quite understandably - want to retain full control over their indices and search results. On the other hand, webmasters, web designers, layouters, marketing people, usability consultants etc. are - equally understandably - fairly disinclined to turn themselves into mere SE serfs by subjugating their highly expensive setups to the search engines' obscure ranking requirements and indexing antics.
While there have admittedly been some improvements in recent years regarding the indexing of Flash, multimedia streams, dynamic pages, graphics etc., when push comes to shove you are still well advised to stick to HTML 1.2 for best ranking results if you're a "white hat" SEO or a plain webmaster desperately trying to make traffic and design ends meet without having to delve into the "rocket science of SEO" (another blatant misnomer, to be sure, but still a very virulent popular misconception). Session ids still giving search engine spiders the hiccups are just one very common case in point. When all is said and done, web entrepreneurs are focused on doing business with actual people rather than having to please, monitor, reverse engineer and cuddle some dumb ranking bots. So working around the search engines' confoundingly less-than-state-of-the-art tech is yet another pressing reason to make use of IP delivery. Quote:
currently domiciled in the German speaking part of Belgium. Quote:
If not, I'd have to beg Dirk, our CTO, to hunt it down in the logs. But knowing his ever-current workload and busy schedule, I'd hate distracting him so it might take a year or two. So the best I can tell you off the cuff is that we took notice of it several months ago. Surely, I wouldn't want to overrate this phenomenon, but it's definitely for real - so much so that we've resorted to making good use of it fairly regularly now.
And when you come to think of it, it certainly makes a lot of sense: Google is, if anything, a giant data mining hog because that's where the real money lies these days. So doling out that free toolbar of theirs and using it for tracking people's surfing habits is certainly shrewd policy - and if it helps you update your "real life index" as a sideline by being pointed to new stuff you haven't checked out yet, so much the better. In a way, it's like a primary election: still undecided who'll actually make it into the index in the end, but a pretty good indication of what's being viewed out there, hence probably some bonus system to set these links off from merely submitted (mechanically or otherwise) sites.
(While this is still mere surmise on my part, here's a scenario you may be interested in: it seems more than plausible to envisage a time when search engines will start favoring web sites that are actually being visited by real people rather than lumbering along with the steadily increasing deadweight of sites nobody cares to visit in the first place. While linkage may not be discarded summarily as it would still serve as a useful fallback factor, seeing the extent to which Google has been demoting its much hyped but logically flawed PageRank algo for the better part of two years now, I'd be truly surprised if we saw it regaining its previous glory anytime soon, if ever. Remember the days when hit ranking was deemed the best invention since sliced bread? (DirectHit, wasn't it?)
Well, maybe the time is ripe for resuscitating a brushed up version of that tech. As it would merge ever so nicely with the currently prevailing PPC philosophy, I guess I may be forgiven for expecting this to happen rather sooner than later. But again: for now, that's pure speculation, so do pardon my rambling.)
Last edited by fantomaster : 10-17-2004 at 01:39 AM. Reason: Corrected some typos.
|
#6
Quote:
Quote:
In my mind, cloaking provides a method to target an 'excessive' number of keyword combinations. By excessive I mean keyword combinations that you might think are 'spammy' to cover. Example: style + keyword phrase OR color + keyword phrase etc. Of course this can be done without deploying cloaking, but it makes sense to hide these pages from the users. Where am I going with this? Hoping to see if this is done often with cloaking and hoping to see what else is done.
Quote:
Quote:
Quote:
I am not too sure if it would be fair to use the Google Toolbar for ranking purposes. At least not until Google takes over Microsoft's control of the world. CTR works for PPC, but barely, because of the click fraud. I wonder how that will alter Google AdWords' ranking of ads.
Anyway, this leads me to more uses for cloaking... In one of the many 101 threads being started in this forum, one is named Block Level Analysis 101, which describes Microsoft's research on attempts to look at a page's layout and determine which "blocks" on the page are more 'important' than other blocks. So naturally, header, footer, side nav and advertisements on the page would be worth less than the content in the middle. The goal is to combat the issues with PageRank and anchor text, i.e. the selling of text links to artificially inflate the rankings of certain pages.
Now if this block level analysis is deployed, and I think it might come sooner than later, I can see cloaking come into play as a way around this. Keep in mind, it's not a matter of the source of the page, it's a matter of what the search engine sees (it takes a picture of the page layout). So if cloaked pages are deployed with text ads placed in the content area of the page, then text ads are now worth more. Of course, SEs solve most of the problems, but cloaked pages (the few out there as compared to the number of pages on the Web) take advantage of it.
Last edited by rustybrick : 10-17-2004 at 01:22 AM. Reason: removed some stuff, to make it easier to read
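To make the block-weighting idea in this post concrete, here is a toy Python sketch. The block names and every weight in it are invented for illustration; the thread gives no numbers and nothing here reflects how any search engine actually scores links. It only shows why moving a text link from the footer into the content block would raise its value under such a scheme.

```python
# Hypothetical weights per page block; purely illustrative values.
BLOCK_WEIGHTS = {
    "content": 1.0,
    "sidebar": 0.3,
    "header": 0.2,
    "footer": 0.1,
    "ads": 0.0,
}
DEFAULT_WEIGHT = 0.5  # assumed fallback for blocks not listed above


def link_value(block: str, base_value: float = 1.0) -> float:
    """Scale a link's base value by the weight of the block it sits in."""
    return base_value * BLOCK_WEIGHTS.get(block, DEFAULT_WEIGHT)


def rank_links(links):
    """Take (url, block) pairs and return (url, value) pairs, highest first."""
    scored = [(url, link_value(block)) for url, block in links]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With these made-up weights, the same link counts ten times as much in the content area as in the footer.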
|
#7
Firstly, thanks fanto! - Really nice posts and quite inspirational, I might add.... hehe...
Quote:
Example: "this is a paragraph in the center of the page with nice kw rich content yada yada yada".
Now if we were to target 'center', the word 'center' would become a link to wherever... Cloak that filter for SEs only and you have your nice outgoing links...
Back to Fanto: I have a question to ask (don't groan!..) - a technicalish one. What is the fastest way to identify a spider? Currently my little home made doodah just runs a foreach loop over the contents of 'ips.txt' and compares each one to the REMOTE_ADDR. Is there a better way?
Nick
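The filter described in this post could look something like the Python sketch below. It is a guess at the idea, not Nick's actual code: the function names, the `is_spider` flag and the example URL are all hypothetical, and the input is assumed to be plain text (on real HTML a regex like this could also match inside tags or attributes).

```python
import html
import re


def link_first_occurrence(text: str, word: str, url: str) -> str:
    """Wrap the first whole-word occurrence of `word` in an anchor tag."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)

    def to_link(match):
        # Escape the URL for use inside a double-quoted attribute.
        return '<a href="{}">{}</a>'.format(
            html.escape(url, quote=True), match.group(0)
        )

    return pattern.sub(to_link, text, count=1)


def render(text: str, word: str, url: str, is_spider: bool) -> str:
    """Apply the link filter only for requests flagged as spiders."""
    return link_first_occurrence(text, word, url) if is_spider else text
```

Human visitors get the paragraph untouched; only the copy served to a request flagged as a spider carries the extra link.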
|
#8
Quote:
If SEs try to do this by looking at the source code only, then I don't think cloaking is needed.
|
#9
Quote:
Plus, considering the inner mechanics of large corporations (say Fortune 1000 upward), more often than not it's simply not feasible - or so they say. For them, it may take ages till everybody relevant in the food chain has agreed on issues like overall marketing strategies, branding, corporate identity, online PR, etc. Mind you, all this is basically pre-design stuff - it's what will spill down to the web site designer crowd or agency who will have to make do with what's deemed permissible. In most large companies SEO is usually a mere backburner issue which can easily turn into a case of wanting to eat one's cake and having it too. That's another field where cloaking will come in: as it allows you to leave the Core Domain's code untouched, management people favor it as a solution offering the best of two worlds.
Small enterprises may be a different matter, though they in turn are notoriously short of personnel and more often than not can't even afford outsourcing the task at hand. For them, a time saving automated solution is usually the best option.
In any case, I've always asserted that cloaking definitely isn't for everyone. It's not an easy alternative to conventional SEO, it's just one additional tool out of many, albeit an extremely powerful one. There's lots of scenarios where cloaking would constitute a blatant overkill. However, what with PPC becoming an increasingly dominant - and costlier! - only alternative, it's no surprise that more and more people are beginning to view cloaking as a viable approach to cut costs and shun click fraud issues in the process.
Quote:
least not to any great extent with the obvious exception of being able to optimize pages by filling them with content unfit for human consumption but beloved by the robots. In other words: if keyword stuffing won't work in conventional SEO, neither will it do in cloaking. You're right, of course, that cloaking makes this possible - but whether it's the wise thing to do is quite another matter. Quote:
Quote:
And what does 'fair' have to do with it? The other day I had a phone conference with some clients of ours and the first thing they asked straight away was: "So what's your position on cloaking for Google?" To which I replied: "Well, first - they don't like it. And second: it works." So guess what they opted for?
Quote:
haven't focused a lot on page architecture so much but implementing it isn't a big issue - another case of the solution being there and ready even before the problem has surfaced. Did I hear someone say "arms race"?
Quote:
Last edited by fantomaster : 10-17-2004 at 12:23 PM. Reason: The usual: typos removed etc. |
|
#10
>>I am not sure I understand your post
Yep, you did. My example was just less radical than showing a different page entirely. I assumed the page already had block content in the right areas and all you wanted to do was throw out some links from it.
>>arms race
LMFAO! - very funny
Nick
|
#11
Quote:
Quote:
Now my head is spinning again with ideas.
|
#12
Quote:
But if you propose just putting text ads embedded in the content of the middle text for both SEs and Web visitors (true contextual advertising, dynamically changing the text on the page to take the form of a link to an ad), then that can just tick off your readers and also tick the writer off.
Last edited by rustybrick : 10-17-2004 at 12:17 PM. Reason: added several words to clarify
|
#13
Quote:
Nick
|
#14
Quote:
hat imagination run loose again. Boyo, will I have to pay a price in hell for that when the final reckoning's up ...
And of course you're right - this does work well indeed. At least, I don't see any reason why it shouldn't.
Quote:
|
#15
Quote:
I'll defer to Dirk to give you the nitty gritty of spider identification as he's our resident spider wizard. So you'll get it straight from the black hatted horse's mouth ...
|
#16
Quote:
Using a foreach loop is fast enough. To speed it up you could build a hash where the keys are simply the IPs, so you only have to check whether the key exists. The fastest way is an SQL solution, but I think this would only be necessary if you get more than 1 million hits per day.
Dirk
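As a rough illustration of the difference Dirk describes, here is a minimal Python sketch (not the code either poster actually runs). The list-file format with blank lines and `#` comments is an assumption, the function names are made up, and the addresses in the usage note come from the reserved documentation ranges rather than any real spider list.

```python
def load_spider_ips(lines):
    """Build a set of IPs from lines of text, skipping blanks and # comments."""
    ips = set()
    for line in lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            ips.add(entry)
    return ips


def is_spider_linear(remote_addr, ip_list):
    """The foreach approach: compare against every entry in turn, O(n)."""
    for ip in ip_list:
        if ip == remote_addr:
            return True
    return False


def is_spider_hashed(remote_addr, ip_set):
    """The hash approach: one membership test, O(1) on average."""
    return remote_addr in ip_set
```

In practice the set would be built once from the list file, e.g. `load_spider_ips(open("ips.txt"))`, and reused for every request, so each check against `REMOTE_ADDR` no longer depends on the length of the list.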
|
#17
Ah... thanks very much Dirk, that's lovely. I rarely if ever get an opportunity to talk about these things so just wanted to get that one irritating little doubt out of my mind.
I sincerely hope that block analysis comes sooner than later, it'll seriously knock a whole bunch of potential competitors out of the loop for some time. When I'm done with my current projects, I may very well give you guys a call. I can see that getting a little more serious about cloaking will be a major advantage next year...
Cheers
Nick
|
#18
On second thought... If you really think of it (of course, I am no expert on cloaking), cloaking would be deployed mostly for the on-page elements? Getting links to these cloaked pages goes by way of pretty much 'spammy' techniques, since no natural linking can be conducted on pages invisible to Web users.
In such a case, cloaking deploys a form of redirection. So links within a page, no matter where they are found by the end user, if changed by form of cloaking (i.e. redirection), are worth less - unless links are obtained to those cloaked page URLs. I know you can dynamically change the content of a page based on IP or user agent. But there goes the whole concept of ShadowDomains. I guess you can still benefit from cloaked pages, when thinking in the future about block level weights. However, it can/will cause more comment spam attempts in my blog.
Any ideas?
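For readers wondering what "change the content of a page based on IP or user agent" amounts to in code, here is a minimal Python sketch. Everything in it is an assumption for illustration: the network range is a reserved documentation block standing in for a real spider list, and the function names are made up. The decision rests on the IP alone, in line with the earlier remark in this thread that simplistic UserAgent checks are the weak variant, since that header is trivially spoofed.

```python
import ipaddress

# Placeholder range (reserved for documentation); a real list would be
# maintained separately and is not part of this sketch.
SPIDER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_spider_ip(remote_addr: str) -> bool:
    """Return True if the address falls inside any listed network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # malformed address: treat as an ordinary visitor
    return any(addr in net for net in SPIDER_NETWORKS)


def choose_variant(remote_addr: str, user_agent: str) -> str:
    """Pick which page variant to serve.

    The User-Agent is accepted for logging or comparison but is
    deliberately ignored for the decision, because anyone can fake it.
    """
    return "spider" if is_spider_ip(remote_addr) else "visitor"
```

A request claiming to be a spider in its User-Agent but coming from an unlisted address still gets the visitor variant.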
|
#19
Quote:
linkage being a playing field in its own right. But let's bear in mind that internal, on site linkage can be pretty important, too, so that's where the two will merge to some extent. (Yes, in case this can still surprise anyone: Google's PageRank algo is as much about on site links as it is about external linkage.) Also, IMV focusing more or less exclusively on external linkage and off site criteria, while immensely popular these days, doesn't truly reflect real life SEO requirements ("rules", if you like). Content may not be "king" in the sense of an absolutist monarch any longer as it used to be for a very short period in the past, but after all it's what web sites are about.
There's a hierarchy of generation to this - it's quite a trivial, obvious mechanism, really, but if SEOs ever lose sight of it they'll have hell to pay in the long run.
First comes web sites' presentation of textual content.
Second comes search engines' amassing databases by storing web sites' content in one way or another, be it snapshots of actual content presented, be it in a more abstract manner (mathematical formulae), be it both (the typical approach).
Third comes search engines' processing all this data and offering searchers to find specific, targeted sections of content or, more exactly, the sites and pages on which it's located. This, of course, is a highly complex procedure and parameters for search optimization are being tweaked (for better or for worse) all the time to handle the sheer mass of data within a viable time frame and present (ideally) good, i.e. highly relevant results.
Search engines have gone out of their way to compensate for webmasters' tricks to influence their ranking algorithms: keyword spamming, meta tag stuffing, invisible text, etc. etc. - they've all been more or less successfully demoted as viable SEO techniques in the course of the years.
However, the one thing search engines can't (and will never) get around is that they are by their very definition strictly second tier ("parasites", really): FIRST comes the web sites' presentation ... (see above). So whatever their antics, be it resorting to linguistic analysis and information retrieval algos, be it link analysis, be it c-indices, techniques culled from artificial intelligence research, vector analysis, context, themes, network theory, whatever: ON SITE is still where the initial action trigger is, period.
So if we include some elements not naturally associated with the "content" concept (which, in the naive sense, simply means "natural language strings"), such as: page architecture, on site cross linking (the spiders have got to find the content first before they can fiddle with analyzing its relevance, remember), keyword density and distribution (that's a language asset, true, but not necessarily a "natural language" one, so let's call it the "meta language" factor), meta data (think XML, for instance), embedded non-language data strings (graphics, multi media streams, etc.) - then (and only then!) can we reasonably say - again - that "content is king". And that, of course, is exactly where manipulation starts, whether it's run-of-the-mill conventional "white hat" SEO or cloaking/IP delivery. (It may not be where it ends, of course - think link farms, PR based advertising, etc., but it's where the cloaking effort is essentially focused.)
So while I'm certainly not denying the importance of off site criteria such as link popularity, "bad neighborhood" policy issues, etc., I'd strongly recommend not to fall for passing fads that are mainly conspicuous by their ignorance of the process's basics - such as the policy of more or less ignoring on site factors in favor of off site criteria so many SEOs currently seem to subscribe to. (Not saying that you did, of course - I wouldn't dare insult your intelligence and SEO expertise that way.
But you'll probably agree that this tendency abounds elsewhere, and since a forum such as this one in general and a "101" thread in particular isn't just about experts talking shop within a public context, I'm not above restating the immensely obvious for those who may not yet be as familiar with the subject.)
|
#20
Quote:
These are the most effective. They look and feel like a regular site, but behind the scenes it could have thousands of pages you don't see that build the page mass and are used to optimize for as many themes as needed.