Google, filters and penalties
Marcia
12-13-2004, 05:23 AM
The discussion about whether or not there is such a thing as "filters" being applied by Google has come up several times, so it seems it would be a good idea to examine the issue and get a clear picture of how filters operate.
It's come up again in a thread this week about the word "filter" being used:
About the only places you will see it used incorrectly is forums like this one.
Unoptimized pages ranking higher (http://forums.searchenginewatch.com/showthread.php?t=3244)
So it may help to take a look and see how someone from Google has used the expression filter. This is from a discussion about the expired domains filter that was being rolled out in Spring of 2003. Post from March, 2003 in a WebmasterWorld thread about the expired domains penalty (http://www.webmasterworld.com/forum3/10325-2-10.htm):
GoogleGuy in msg #16
Definitely send in detailed reports if you think you see a problem. The expired domain filter is in the process of rolling out, and won't be fully deployed for 2-3 months.
Hope this helps,
GoogleGuy
So there we have it; Google uses filters and that's what they call it. But how do filters work?
The easiest way I find to interpret the operation of a filter is this: what is "filtered" out is not the site itself, as would happen if it were removed or banned from the index. Instead, an element for which some over-use or abuse is detected is filtered out and therefore is not counted, either partially or fully, toward scoring for the affected site(s). That way, though the site remains in the index, it cannot rank as it would have had that element not been filtered out.
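One way to picture that reading is the toy scoring sketch below. Everything in it is made up for illustration (the element names, the values, and the `apply_filter` helper); it is an assumption-laden model of the idea, not a description of how Google actually scores pages.

```python
# Toy model: a page's score is the sum of its scoring elements.
# Element names and values are hypothetical, for illustration only.

def score(elements):
    """Sum every element that still counts toward ranking."""
    return sum(elements.values())


def apply_filter(elements, filtered, keep_fraction=0.0):
    """Return a copy of elements with one element discounted.

    keep_fraction=0.0 filters the element out fully;
    a value such as 0.5 filters it out partially.
    """
    adjusted = dict(elements)
    if filtered in adjusted:
        adjusted[filtered] = adjusted[filtered] * keep_fraction
    return adjusted


page = {"on_page_text": 3.0, "anchor_text": 4.0, "pagerank": 5.0}
filtered_page = apply_filter(page, "anchor_text")

print(score(page))           # every element counted
print(score(filtered_page))  # anchor text no longer counted
```

The page is still present in both cases; only the abused element stops contributing, which is the distinction from a ban.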
Anyone have any clarification or insights on how filters and/or penalties work?
Dave Hawley
12-13-2004, 06:33 AM
A filter will hide the page from being shown, or remove it entirely, i.e. filter it out.
A "penalty" (not the right word IMO) would mean no credit is being passed for certain elements. If these "elements" were being included in any credit before, the page (not site) will likely fall. This is likely one (of many) of the things that happened with Florida. Millions screamed "penalty", "I've been penalized..." etc., but it is more likely a change in the algo.
However, one must always keep in mind that Google's index is constantly changing order due to hundreds, if not thousands, of other reasons.
Marcia
12-13-2004, 06:38 AM
A filter will hide the page from being shown, or remove it entirely, i.e. filter it out.
Are you saying it will hide the page from being found in the index at all by domain, or for searches on keyphrases?
How would what you're saying work with an expired domain filter?
Dave Hawley
12-13-2004, 06:44 AM
Are you saying it will hide the page from being found in the index at all by domain, or for searches on keyphrases?
Depends entirely on the filter criteria.
How would what you're saying work with an expired domain filter?
I would say that an expired domain filter would remove the page(s) eventually and, in the interim, hide it/them.
DaveN
12-13-2004, 06:47 AM
A "penalty" (not the right word IMO)
GoogleGuy Said
No worries, It's easy to make a mistake, and that's why we put an expiration on this set of penalties. You should be fine in a while.
That was when a site I was working on went from #2 to #52 ... it had a penalty on it!
DaveN
Dave Hawley
12-13-2004, 06:54 AM
Seems like all one needs to do is email/ask GG and there is no further need for any forums/discussions.
I'm astounded that anyone would think a Google employee would be totally open and honest about its ways and means. Perhaps you should simply request the algo be made open source :rolleyes: Oh that's right, we only believe what we want to hear and GG would never tell us what we want to hear :) He's on the side of the SEOs, isn't he :rolleyes:
DaveN
12-13-2004, 07:18 AM
Seems like all one needs to do is email/ask GG and there is no further need for any forums/discussions.
That's what the Big Boys do ;) and when the SEs have a problem they can speak to us on how to help them; back scratching, I think it's called.
Anyway, can we keep this on track?
There are 2 main penalties IMO:
1) Automated
2) Hand Review
1) The automated ones are easy to overcome: work out what you did wrong, correct it and leave it alone.
2) For hand reviews you will need to contact Google and explain what was wrong and how you fixed the problems. Common hand-review causes: cloaking, keyword stuffing, etc.
DaveN
DaveN
12-13-2004, 08:00 AM
some google penalties
Banned
“Slow Death”
-30 ( ;) )
PR0
Guestbook / Links Pages
Site Has PageRank but will Not Pass it On
Prime Keyword Penalties ( oop )
Other Factors
Redirects, Duplicate Pages or very similar Pages
DaveN
Dave Hawley
12-13-2004, 09:45 AM
DaveN, I believe the point of the discussion is "Anyone have any clarification or insights on how filters and/or penalties work?" Parroting commonly used terms/myths that contain the word "Penalty" is not very productive IMO. But then again, I'm not a self-proclaimed Big Boy like you :rolleyes:
DaveN
12-13-2004, 01:10 PM
PR0 - well documented on the Internet. This is when Google adjusts your PageRank to 0 while leaving all the links in the link: results in place; it is used when people are selling links. From that came the "will not pass PR" penalty, which is much more discreet. I think the first "will not pass" was the famous weather site ;)
DaveN
Marcia
12-13-2004, 01:16 PM
Parroting commonly used terms/myths that contain the word "Penalty" is not very productive IMO
What is the myth? Are you saying there is no such thing as a penalty?
Joseph Morin
12-13-2004, 01:35 PM
But then again, I'm not a self proclaimed Big Boy like you :rolleyes:
Then perhaps I should mention that DaveN IS one of the big boys :)
I definitely pay attention when he speaks.
hugo guzman
12-13-2004, 01:48 PM
There's way too much ego driven discussion here...
Deciding who is and who isn't a big boy isn't going to do much of anything...
As for some of the assertions of "fact" on this thread...
PR0 - well documented on the Internet. This is when Google adjusts your PageRank to 0 while leaving all the links in the link: results in place; it is used when people are selling links. From that came the "will not pass PR" penalty, which is much more discreet. I think the first "will not pass" was the famous weather site
-Neither of these assertions is based on documented fact. In addition, many "experts" and even some SE reps are encouraging the purchase of relevant links (including but not limited to paid inclusion directory listings) and are advising against tried and true methods such as reciprocal linking.
Newbies- be careful not to mistake fact with unsubstantiated theory...there are definitely various "filters" or "penalties" in place, but because of the nature of search engine technology it is easy for even the "experts" to get it wrong sometimes. Understanding the nature of search engines is a bit like deciphering a card trick. There's often more than meets the eye.
DaveN
12-13-2004, 02:08 PM
Hugo, didn't you read all the case studies on SearchKing v. Google, when Google took SK's PageRank to zero?
Google freely admits that it demoted SearchKing's page ranks in response to SearchKing's actions.
They were selling PageRank from a PR8 site, so Google took it to a PR0 site.
http://research.yale.edu/lawmeme/modules.php?name=News&file=article&sid=807
DaveN
hugo guzman
12-13-2004, 03:57 PM
...a perfect example of the card trick analogy!
The key to this particular case is that the domain in question WAS EXPLICITLY SELLING LINKS FOR THE PURPOSE OF IMPROVING PAGERANK.
They even made mention of it on their site.
That is a far cry from a static banner (or text link) on a relevant site.
If I have a site that deals with flower sales and I pay for a static banner or text link on a site for botany enthusiasts I will not be penalized in this way. Why?
Because the botany enthusiast site is not explicitly selling links for the purposes of increasing PR. They are selling advertising. The backlink that I will gain is simply a byproduct of this type of advertising. Another byproduct (you guessed it)...traffic!
The truth of the matter is this...webmasters routinely acquire backlinks for the IMPLICIT purpose of increasing their PageRank and SERPs (that's what reciprocal links are for). The reason the reciprocal links are "acceptable" (by Google's and other search engines' standards) is that they, in theory, help relevance because similar sites tend to link to one another. The other reason is that link exchanges "technically" are a means of driving traffic to sites.
The only real difference between a text link ad and a reciprocal link is that the payment comes in the form of cash instead of a link.
The mere existence (not to mention the fact that they're thriving) of sites like Linkadage.com, text-link-ads.com, and other spin offs would hint at the fact that Google (and other engines) are not against text link advertising.
If they were, then the previously mentioned sites would all be banned, etc... and there wouldn't be so many legitimate "white hat" webmasters engaging in this form of promotion.
I'll go as far as to say that text advertising, when done in a responsible and legitimate manner (relevance is key), will supersede reciprocal linking as the preferred form of "acquiring" inbound links for the purposes of ranking well in the search engines.
I think that this idea might scare some members of the established seo community, who are entrenched in top positions for big keywords, but realize that relative newcomers may overcome them in the SERPs by utilizing this powerful form of seo promotion.
In the meantime, I'll keep working on increasing the amount of fresh content that my sites can pump out per day. As far as I'm concerned, content is the only tried and true method for building a site that will stand the test of time (and make a lot of money doing it!).
Dave Hawley
12-13-2004, 07:41 PM
I definitely pay attention when he speaks.
I would too if he didn't blow his own trumpet so much.
Dave Hawley
12-14-2004, 12:18 AM
What is the myth? Are you saying there is no such thing as a penalty?
The myths are all the explanations that usually go along with such terms. One doesn't have to read too much about SEO to see that the "professionals" and "experts" (Big Boys :) contradict each other on many aspects of "filters" and "penalties". They cannot all be true.
These myths are often perpetuated by those that have a vested interest in them. That is, they have it written on their site, they have written about it in books and/or have/continue to charge customers for it.
To me the words "penalty" and "filter" have only one true meaning. Just because the words are being used incorrectly over and over it doesn't change their true meanings.
In the meantime, I'll keep working on increasing the amount of fresh content that my sites can pump out per day. As far as I'm concerned, content is the only tried and true method for building a site that will stand the test of time (and make a lot of money doing it!).
Ain't that the truth! You won't find this fact being perpetuated compared to the more profitable myths.
glengara
12-14-2004, 07:01 AM
* “Slow Death”, Site Has PageRank but will Not Pass it On*
DaveN, any insights in the causations for these two particular ones?
DaveN
12-14-2004, 07:18 AM
glengara, check your PM; I sent you some examples.
I, Brian
12-14-2004, 08:10 AM
*Site Has PageRank but will Not Pass it On*
The open source forum project www.phpbb.com (http://www.phpbb.com) was a pretty famous example of this. They were taking on advertising based on the site's high PageRank, to help fund the project. Google then took action to prevent phpbb.com's pages passing PR to the advertisers.
glengara
12-14-2004, 08:55 AM
*Site Has PageRank but will Not Pass it On*
I asked about that as I seem to be coming across increasing examples of PR/anchor text not being passed on, but as there is no mention of PR, much less an explicit selling of it......
SEbasic
12-14-2004, 10:55 AM
I asked about that as I seem to be coming across increasing examples of PR/anchor text not being passed on, but as there is no mention of PR, much less an explicit selling of it......
Could you PM me examples of these sites please.
I have seen sites where the PR is not passed on, but only when they have been selling PR as opposed to "advertising space".
What are the rules here in terms of what can and can't be done by Google?
Are there any?
If I had advertising running on my sites, and then the PR was taken away as a result of that advertising, I would certainly not be best pleased.
hugo guzman
12-14-2004, 01:51 PM
Here's an interesting little tidbit...
textlinkbrokers.com used to publish a list of sites that supposedly had their PR "blocked". They put this list on a domain named blockedpr.com
They essentially assured folks that there was indisputable proof that these sites had their PR blocked. I have always maintained that this is not the case, and that there are other (more legitimate) reasons for why a particular link may not increase the PR of another site (such as too many other outbound links draining the web pages ability to pass PR).
However, this domain became the bible for folks that preached the "blocked pr" gospel.
Lo and behold, if you visit that domain now, you will find that the list of sites is gone and the authors have retracted their assertions until they can find more substantial "proof".
I find this little incident extremely interesting for several reasons:
1) One of the original proponents of the "blocked pr" theory has had to retract their statements due to lack of evidence.
2) This organization (textlinkbrokers) engages in the buying and selling of text links, and originally came up with this list as a means of making their service more reputable (i.e. our links don't have their PR blocked).
The real question here is "Why would an organization that sells text ads propagate the theory that google blocks PR passing on some sites?" The answer is because as long as text link advertising is done in a discreet and responsible manner and is explicitly and specifically intended to increase PR, there is nothing wrong with it (in Google's eyes).
That's why sites like textlinkbrokers.com and linkadage.com still exist and are thriving...forgive me for being redundant and restating the obvious.
DaveN
12-14-2004, 01:55 PM
yer i guess links don't hurt you after all and google would not add a penalty to the site because it's still in the index.
DaveN
Marcia
12-14-2004, 02:18 PM
The real question here is "Why would an organization that sells text ads propagate the theory that google blocks PR passing on some sites?"
The answer is because as long as text link advertising is done in a discreet and responsible manner and is explicitly and specifically intended to increase PR, there is nothing wrong with it (in Google's eyes).
We really can't know how things look in Google's eyes except by their public actions.
But a pretty universal answer as to why any statements are made by those engaged in marketing something is that whatever works to make you look better than competitors is valid to communicate. If they were passing PR and competitors weren't, then that's a possible "why" and a good one.
So now the questions arise - why it was pulled and are they still passing PR?
hugo guzman
12-14-2004, 03:01 PM
So now the questions arise - why it was pulled and are they still passing PR?
Marcia,
The first question has already been answered...they pulled it because they came to the realization that they could not substantiate the claim that the sites on their list did not pass PR...they didn't want to publish a list that would be considered inaccurate...I applaud them for that.
The second question is a bit of a moot point...they are text link brokers...they broker links from sites that are not on their domain (i.e. broker). Those domains (as well as the ones being offered on linkadage and other similar sites) do pass PR etc...Passing PR or giving a SERPs boost is just a nice byproduct of a static ad (whether it's a text link or a banner). That's what makes this relatively new form of advertising so attractive. It's like a reciprocal link on steroids. You get more traffic, on better content pages, with fewer outbound links, and you don't have to link back.
The key (as with any form of SEO promotion) is to be responsible. Don't purchase ads on unrelated sites, don't use the same exact anchor text on every ad, don't link to the same exact url every time, and above all don't treat this form of advertising as a "get rich quick" scheme to artificially boost your PR and SERPs. Text link advertising is not a magic pill that will automatically ensure the success of your website. It should be treated as just another facet of your overall seo promotion strategy.
glengara
12-14-2004, 04:21 PM
Somewhat surprisingly, I couldn't agree more Hugo ;-)
Trouble is, few bother to do it properly, which is where the problems arise.
*Could you PM me examples of these sites please.*
Here's an example of the links not "working" in the public domain, bluefind.com.
This is an SEO-inspired directory that " text link advertised" from the get-go, achieving within a short time a home page PR 8, which of course encouraged people to submit to it.
On G today it's not in the first 100 for its main KW "web directory", though showing 12,800 links on G, 93,500 links on Y!, and not just "ordinary" links mind, the majority include targeted link text.
In comparison, at no.30 on G for "web directory" I have hotvsnot.com, with G showing 4190 links, and Y! 16,700.
So what happened to the 76,800 MIA links/anchor text for Bluefind?
I, Brian
12-14-2004, 05:26 PM
Jugo, I'm curious - what do you think of the phpbb.com situation - was there really a block on it passing PR, or would you suggest that it was misinterpreted? Or if there was a block, that it was a very rare exception?
hugo guzman
12-14-2004, 06:14 PM
That's funny! I've got some buddies down here in Miami that call me Jugo ("Jug-O").
To be honest I'm not sure. Here is what I do know:
It seems like the newest text link advertiser on the homepage is http://ppcuniverse.com. They only have 23 backlinks in Google (phpbb doesn't show up) and 71 backlinks in Yahoo (phpbb does show up).
They have a PR of 5, but since the last PR update came on October 7th and there was a "lag" of 1 to 10 weeks (by "lag" I mean the difference between when the PR update occurred and when new backlinks stopped being counted towards the PR update), it is possible that they have not received any PR "credit" yet for their backlinks from phpbb (unless they purchased the advertising prior to July of this year).
Another factor to consider is that the homepage for phpbb has roughly 60 outbound links. I checked this using the Webmasterworld googlebot simulator:
http://www.searchengineworld.com/cgi-bin/sim_spider.cgi
60 outbound links would considerably "drain" the PR passing ability of their homepage.
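As a rough illustration of the "drain" idea: in the originally published PageRank formula, a page's PageRank is divided evenly among its outbound links and scaled by a damping factor. The sketch below assumes that published formula and the commonly cited damping value of 0.85; the function name is made up, the input is the raw PageRank value rather than the 0-10 toolbar figure, and whatever Google runs today may differ.

```python
def pr_passed_per_link(page_pr, outbound_links, damping=0.85):
    """Share of a page's raw PageRank handed to each outbound link.

    Based on the published formula PR(A) = (1 - d) + d * sum(PR(T) / C(T)),
    where C(T) is the number of outbound links on page T.
    """
    if outbound_links <= 0:
        raise ValueError("a page needs at least one outbound link to pass PR")
    return damping * page_pr / outbound_links


# Same page, different numbers of outbound links.
few = pr_passed_per_link(page_pr=1.0, outbound_links=10)
many = pr_passed_per_link(page_pr=1.0, outbound_links=60)

print(few, many)  # each link on the 60-link page receives one sixth as much
```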
The important thing to me is whether or not the sites that have acquired text link ads on the homepage are benefiting from a SERPs boost (regardless of their "visual" PR ranking). The 5th advertiser, which is a gambling site (some of their phpbb backlinks do show up in Google's backlink data), ranks #1 for their main target search term "online sportsbook" and #5 for the keyword "sportsbook".
Now those rankings could be attributed to the other (1000s) of potent backlinks from other domains, but I will tell you this, those links from phpbb definitely ain't hurting them!!!
We've already had a heated discussion about whether or not inbound links can hurt your SERPs, etc...so I'm not going to get into that again, but I think that the proof is in the pudding on this one.
I hope this helps...
kirkvan
12-14-2004, 06:53 PM
So what happened to the 76,800 MIA links/anchor text for Bluefind?
I'd also love to hear theories/answers to the bluefind.com page drop mystery. (At least it's currently a mystery to me.) I really liked the directory and purchased some listings there.
Aloha,
Kirk
Marcia
12-14-2004, 07:32 PM
kirkvan, we're not discussing a specific, that's beyond the scope of talking about filters, penalties and how they work.
Revisiting the expired domains penalty referred to earlier, which established the fact there is such a thing as penalties, we have a clue here from the same discussion on how they work. In response to this question:
Will links be counted that are gained AFTER a site is re-registered (expired domain)?
Ex. I register an expired domain today. Tomorrow (or 2 weeks from now) I start trying to get appropriate links. Will those links count JUST AS IF THE DOMAIN WAS NEW(not previously registered)?
We have the response:
GoogleGuy
Sure, Loki99. After the rollout is complete (I expect 2-3 months), links made after a domain expired will be counted. That applies retroactively too, so if someone registered an expired domain 4 months ago, all links less than 4 months old would count toward PageRank. However, until the rollout is fully completed, I wouldn't be surprised if not all links are counted/reported for expired domains. It's a byproduct of the rollout method to see that behavior. Hope that helps explain things, but the short answer to your question is "yes".
So links prior to an expired domain's acquisition were not to be "counted" or "included" and were therefore missing for benefit for the expired domain that had been purchased. The older links were filtered out but the sites did not disappear, nor did their currently acquired links obtained after the cut-off period.
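A minimal sketch of that reading, assuming each link carries the date it was first seen and that a single cut-off date separates old links from new ones. The field names, the `counted_links` helper and the sample dates are all hypothetical; GoogleGuy's post does not say how the cut-off is stored or applied.

```python
from datetime import date


def counted_links(links, cutoff):
    """Keep only links first seen on or after the cut-off date.

    Older links are filtered out of the count; the site itself
    and its newer links are left alone.
    """
    return [link for link in links if link["first_seen"] >= cutoff]


links = [
    {"source": "old-fan-site.example", "first_seen": date(2002, 5, 1)},
    {"source": "new-partner.example", "first_seen": date(2003, 1, 15)},
]

kept = counted_links(links, cutoff=date(2002, 11, 1))
print([link["source"] for link in kept])
```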
I interpret that to mean that it's the excluded factor that "disappears" and no longer counts toward scoring; it's not the site itself that's removed. If there's another reason for the site to be excluded from the index that's another matter - but it's a different issue from the penalty that explanation relates to.
That's how I interpret how a penalty functions - something is removed, or filtered out, that would otherwise have helped the site rank better. I could be wrong, but haven't heard a better explanation that gives clearer understanding. It's nothing arcane, but perhaps there is a more succinct way to see it.
With the firmly established PR0 penalty, same principle as in the case of links not being counted for expired domains because links to expired domains were the abused factor, I think it's safe to assume that since PR abuse of some kind was the offending factor, then PR was omitted for counting toward ranking for the penalized sites for those that got hit with the PR0 penalty. The sites were not dumped from the index, but without any PR they couldn't get listed on their own refrigerator doors without a magnet.
glengara
12-14-2004, 07:53 PM
* ..that's beyond the scope of talking about filters, penalties and how they work.*
Don't quite follow, would investigating this specific not throw light on the general?
kirkvan
12-14-2004, 09:23 PM
Aloha, Marcia ~
I respect your role in moderating--keeping things on track. It just seemed to me that including a specific case study of a penalized/filtered site might enrich the conversation.
Best,
Kirk out.
hugo guzman
12-14-2004, 09:34 PM
I agree with you Kirk, and I appreciate your contribution.
Give me some time to analyze that example. I'll see what I can come up with...
Marcia
12-14-2004, 10:11 PM
Aloha, Marcia ~
I respect your role in moderating--keeping things on track. It just seemed to me that including a specific case study of a penalized/filtered site might enrich the conversation.
Best,
Kirk out.
Thank you Kirk, and it will stay on track.
hugo guzman
I agree with you Kirk, and I appreciate your contribution.
Give me some time to analyze that example. I'll see what I can come up with...
Thank you too, hugo. We look forward to seeing the new thread you will be starting on a new topic.
If we are going to have a better understanding of filters and penalties then we should think about the mechanisms by which they are applied.
Filters or penalties are often thought of as a byproduct of a search engine's algorithm, but I believe you have to define the algorithm as the mathematical/logical function that is used at ranking time to determine the relevancy of a set of pages to a particular search query, and I am of the opinion that these filters or penalties are not applied at ranking time and thus are not a part of the algorithm.
As Marcia mentioned, there are two sorts of filters or penalties, hand and automated. The way that hand penalties are placed seems rather obvious.
It seems to me that it is neither feasible nor logical to apply the automated filters at ranking time and that it is much more likely that these automated filters are programs which are run over the search engine's index as and when needed.
There have been several examples of this, such as the hidden text blitz which penalized a bunch of pages (but which does not seem to have been run too often lately if you judge from the amount of hidden text still to be found) and the duplicate pages blitz of a few months ago.
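The batch-versus-ranking-time distinction can be sketched as follows. This is only a toy under stated assumptions: the index layout, the colour-matching test for hidden text, and both function names are invented for illustration and are not known details of Google's systems.

```python
def run_hidden_text_pass(index):
    """Batch job: walk the whole index once and flag offending pages."""
    flagged = set()
    for url, page in index.items():
        # Hypothetical test: text drawn in the background colour is hidden.
        if page["text_color"] == page["background_color"]:
            flagged.add(url)
    return flagged


def rank(query, index, flagged):
    """Query time: consult the precomputed flags, never re-inspect pages.

    Alphabetical order stands in for real ranking here.
    """
    hits = [
        url
        for url, page in index.items()
        if query in page["words"] and url not in flagged
    ]
    return sorted(hits)


index = {
    "a.example/widgets": {
        "words": {"blue", "widgets"},
        "text_color": "#000000",
        "background_color": "#ffffff",
    },
    "b.example/widgets": {
        "words": {"blue", "widgets"},
        "text_color": "#ffffff",
        "background_color": "#ffffff",
    },
}

flagged = run_hidden_text_pass(index)
print(rank("widgets", index, flagged))
```

If the batch pass is rarely rerun, newly added hidden text goes unflagged until the next run, which would fit the observation about how much hidden text is still to be found.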
Opinions?
JohnScott
12-14-2004, 11:03 PM
So what happened to the 76,800 MIA links/anchor text for Bluefind?
I really have no clue what happened to BlueFind. I somehow doubt it's a manual-review penalty. I've seen a lot of those, even some in the recent PR update, but they usually involve a domain-wide PR drop. BlueFind is still showing PR8.
I suspect it has to do with one of two things. First thing that comes to mind is that BlueFind had some ROS links. They weren't the best quality links. That on top of the fact that BlueFind may be linking to penalized websites (when we review submissions, PageRank isn't considered at all, so I don't have any idea how many penalized websites we may be linking to).
The second thing that comes to mind is a duplicate content filter. SevenSeek started out using roughly half of the same category structure as BlueFind. Since then both directories have changed in many ways and I'd think that less than 25% of the categories are the same, but it may have been close enough to trigger a filter.
All we can do is do our best. We do not control Google. We do our thing; Google does theirs.
I doubt it is a long term penalty. I'm guessing it's just a snafu, but who knows.
WilliamC
12-15-2004, 04:15 AM
It was a manual review penalty. It happened last weekend along with some other directories that were on the short list for selling placement for PR purposes. As a matter of fact, it was posted at webworkshop as soon as it started, with all of John's listed pages slowly losing their cache and description, and ended with almost every page in bluefind being placed in the supplemental index. From what I hear, it does not just stop there either.
Going to be a fun ride :)
Dave Hawley
12-15-2004, 04:26 AM
Oh dear! I guess that means SevenSeek, Uncover the net and of course Text-link-Ads are in Google's sights.
BTW, didn't Text-link-Ads homepage use to be PR7 or something? It's at PR 0 now.
WilliamC
12-15-2004, 04:31 AM
I still see it at 7
Dave Hawley
12-15-2004, 04:36 AM
Then something is likely cooking as I have tried a few times now and always get zero.
Let's face it though, it's a blatant selling of PR and Google don't like that one bit! If it's not hit today...it's likely only a matter of time.
glengara
12-15-2004, 05:37 AM
*..it's a blatant selling of PR and Google don't like that one bit!*
Was it blatant though?
I may be wrong, but it seemed more nudge-nudge, wink-wink with little if any mention of the P word itself.
In other words, in the buying/selling links area, G may be presuming intent, which could make things quite interesting.
Dave Hawley
12-15-2004, 05:50 AM
The mere fact that the price goes up with PR shouts something to me.
This is some of what I read as blatant selling of PR.
"our unique program that will drive traffic and link popularity to your website."
"Our ads have raised our client’s search engine rankings because all of our text link ads are static html links that are picked up by the major search engines as a link back to your website"
"Link popularity is a major factor in top search engine rankings. Securing links from top websites can be difficult but our program makes it simple."
"Text Links By PR"
All ads show the PR of the site you buy your link on.
"A clean PR 7 computer themed site!"
From FAQ
"Is the PR of the ad I purchase guaranteed?
PR is guaranteed at time of placement only. If there is a PR update during your pay period (up or down) the price will remain the same until the start of the next billing period."
"Will these ads help my search engine rankings?
Aggressive link popularity campaigns are still the key to top search engine rankings. Our clients have experienced outstanding results in terms of increased traffic, link popularity and search engine rankings."
"Search engines view a static link to your website as a “vote” of popularity for your website. The common term for this is link popularity. Simply put, link popularity is a measure of the quality and or quantity of web pages linking BACK to your webpage"
glengara
12-15-2004, 06:08 AM
Better get back to the general...
*..neither feasible or logical to apply the automated filters at ranking time..*
When is the ranking time?
Since most filters would have a bearing on ranking, it would strike me as the ideal time, easier to filter a jug of water than the whole lake...
Dave Hawley
12-15-2004, 06:14 AM
2 words I once placed together in a post on WebProWorld. Funny how these things perpetuate isn't it :)
DaveN
12-15-2004, 06:25 AM
But Google do have pre and post algo filters.
They used geotargeting and then filtered out Stormfront for German and French IPs, before removing them completely from the .de and .fr index.
DaveN
Better get back to the general...
*..neither feasible or logical to apply the automated filters at ranking time..*
When is the ranking time?
Since most filters would have a bearing on ranking, it would strike me as the ideal time, easier to filter a jug of water than the whole lake...
First of all I am assuming that pages are not pre-ranked and that no one will disagree with that concept; therefore I consider that ranking time is when you make a search for a keyword and Google returns a list of ranked results, i.e. at the time they rank the page.
If you look at the data available to Google when they do the rankings (the contents of the inverted barrels) it seems to me that:
The kind of data needed to make a judgement on the things that filters and penalties are triggered by is not available at ranking time.
If you were to try to determine somehow the effect of the various filters and penalties which Google have access to IMO it would slow down the search process to the point that it would be unacceptable.
In addition to the above considerations rankings are very specific to the search terms, and to the best of my knowledge filters and penalties are not.
DaveN : A good example of a post algo filter that we can get information on is the second Google duplicate results filter which operates after the pool of pages has been ranked and the SERPs results are compared to weed out duplicate pages based on the SERPs before the ranking results are shown to the searcher.
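A post-ranking duplicate filter of that kind could be pictured like this. It is a sketch that assumes duplicates are detected by comparing the title and snippet shown in the results; the actual comparison Google uses is not stated in this thread, and the function and field names are hypothetical.

```python
def drop_duplicate_results(ranked_results):
    """Walk an already-ranked list and keep the first of each look-alike.

    Runs after ranking, so it never changes the order of what survives.
    """
    seen = set()
    kept = []
    for result in ranked_results:
        # Hypothetical duplicate signature: normalised title plus snippet.
        signature = (
            result["title"].strip().lower(),
            result["snippet"].strip().lower(),
        )
        if signature in seen:
            continue
        seen.add(signature)
        kept.append(result)
    return kept


ranked = [
    {"url": "a.example/page", "title": "Blue Widgets", "snippet": "Cheap blue widgets."},
    {"url": "b.example/copy", "title": "blue widgets", "snippet": "Cheap blue widgets. "},
    {"url": "c.example/other", "title": "Widget Guide", "snippet": "How to pick a widget."},
]

print([result["url"] for result in drop_duplicate_results(ranked)])
```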
glengara
12-15-2004, 08:04 AM
*the contents of the inverted barrels*
Fell asleep in that class. Assuming these contain the various ranking elements (PR, document relevance score, anchor text?), it would seem to make more sense to have the elements pre-filtered.
LOL Glengara you probably forgot more than I know, as I recall seeing you posting around the forums for at least the past four years.
At any rate the inverted barrels only contain hitlists with the scores for hits for a particular word on a particular page. Some of this is anchor text, some is page titles and some is onpage stuff, but it is not the entire contents of the page and does not contain PR information (which is added in the next stage). To a certain extent this is precomputed in that there is a score for each hit which varies with the type of hit, etc.
But the point is that all you have to work with at ranking time is these abbreviated barrels, and not the full contents of the 40,000 or so pages selected for ranking.
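As a rough illustration of the point above, here is a minimal sketch of a barrel holding per-word hitlists with a score that depends on the type of hit. The names build_barrel and word_score and the weights in HIT_WEIGHTS are made up for the example; Google's real values and data layout are not public.

```python
from collections import defaultdict

# Hypothetical weights per hit type; the real values are not public.
HIT_WEIGHTS = {"title": 3.0, "anchor": 2.0, "body": 1.0}


def build_barrel(hits):
    """Group (word, doc_id, hit_type) tuples into word -> doc -> hitlist."""
    barrel = defaultdict(lambda: defaultdict(list))
    for word, doc_id, hit_type in hits:
        barrel[word][doc_id].append(hit_type)
    return barrel


def word_score(barrel, word, doc_id):
    """Sum the per-hit weights for one word on one page.

    Only the hitlist is consulted: the full page text and PR are not
    available at this stage, which is the point made in the post above.
    """
    hitlist = barrel.get(word, {}).get(doc_id, [])
    return sum(HIT_WEIGHTS[hit_type] for hit_type in hitlist)


hits = [
    ("widgets", "page1", "title"),
    ("widgets", "page1", "body"),
    ("widgets", "page2", "anchor"),
]
barrel = build_barrel(hits)
print(word_score(barrel, "widgets", "page1"))
print(word_score(barrel, "widgets", "page2"))
```

Nothing in this structure could answer a question like "is this page a duplicate of another one", which is why that kind of judgement would have to be made before ranking time.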
redandwhite
12-15-2004, 09:52 AM
DaveN, what do you mean by "Redirects"? Is Google penalising redirects, and if so which type exactly?
ogletree
12-15-2004, 03:07 PM
A filter affects only certain searches; a penalty affects all searches. A ban is when your site is completely removed from the index. You can tell this by the fact that no search, no matter how obscure, will bring up your site, including site:domain.com. G also does diminished values. Like 100 links from one site have less value than 100 links from 100 sites. They have value, just less value.
The main filter that most people know about is the florida/hilltop/sandbox/overop filter. There are many names for it. There is no question that G treats certain searches differently. G also has a dupe filter.
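The "diminished values" idea above (100 links from one site counting for less than 100 links from 100 sites) can be sketched as follows. The logarithmic damping is purely an assumption chosen for illustration, and link_value is a hypothetical name; nobody outside Google knows the actual formula.

```python
import math
from collections import Counter


def link_value(linking_domains):
    """Total link value with diminishing returns per linking domain.

    The first link from a domain counts in full; further links from the
    same domain add progressively less (assumed 1 + ln(n) damping).
    """
    per_domain = Counter(linking_domains)
    return sum(1 + math.log(count) for count in per_domain.values())


one_site = ["bigsite.example"] * 100
many_sites = [f"site{i}.example" for i in range(100)]

print(link_value(one_site))    # still has value, just less
print(link_value(many_sites))  # every link counts in full
```

Note that the repeated links are never zeroed out here, which matches "they have value, just less value".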
Marcia
12-16-2004, 04:47 AM
ogletree:
A filter affects only certain searches; a penalty affects all searches. A ban is when your site is completely removed from the index. You can
Thanks, ogletree - big help there in clarifying the difference in the scope of effect.
Following two quotes taken from this other discussion:
http://forums.searchenginewatch.com/showthread.php?t=3244&page=2
DaveN, from the other thread:
Quote:
If Google 'took out' all of which they consider spam and which artificially boost rankings, they would not be left with a very large database of pages to choose from.
Yep, that's why they use filters
added
filter : A program that processes an input data stream into an output data stream in some well-defined way
Google's Document Servers -> "filter" -> GWS
DaveN
Dave Hawley, from the other thread, responding to DaveN:
Quote:
filter : A program that processes an input data stream into an output data stream in some well-defined way
Quoted in part from: http://www.hyperdictionary.com/dictionary/filter
The missing part of the quote is: "and does no I/O to anywhere else except possibly on error conditions", with I/O being Input/Output.
So as I keep saying.....A filter will remove or hide/show based on a criterion.
I'm no "search engineer" or IR scientist by any means, but as I understand it words can be used in a slightly different manner within the context of different environments.
It's my understanding that the term filter isn't used quite the same way in Information Retrieval as the way filters operate when they are used, for example, in filtering out patients in a medical environment for a study, or in filtering out spam email so it won't be delivered to the recipient at all. In those other cases the filtering out process totally eliminates. It's a fine point to be sure, but it's an important one - at least to help me understand.
In a sense they can "filter out" because when a condition is met pages can be selectively filtered out during processing time from appearing for certain search terms, for example - but that won't necessarily exclude a site from a search index altogether the same way unsolicited mail is excluded from reaching its intended destination.
I've seen it happen (unfortunately) and know the effect, but can someone please correct me if I'm wrong, as to exactly how the word applies in an IR environment.
This is something that needs clearing up for comprehension - the timing of the application of filters. I don't know how else to put it.
Marcia, I really don't think that there is a particular timing within which a filter has to operate. I think it has more to do with what the filter is designed to achieve, which determines when it is best to run it.
As an example Google has two duplicate page filters, one of them operates on the SERPs after they have been processed for rankings, filtering out pages with more or less duplicate SERP titles and descriptions before delivering the results to users.
The other appears to first of all look at all the pages in the index and assign "fingerprints" to both the page as a whole and portions of the page, so this is obviously done prior to ranking time. Google may then choose to apply either a filter or a penalty to pages which it deems to be duplicates. In the past we have seen certain periods of time when this filter seems to have been run and lots of duplicate pages were affected, but it does not seem to be on a scheduled basis.
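To make the "fingerprints" idea more concrete, here is a minimal sketch using a whole-page hash plus word shingles for portions of the page. This is a generic near-duplicate technique, not a description of what Google actually runs; the function names and the shingle size k=4 are assumptions for the example.

```python
import hashlib


def page_fingerprint(text):
    """Fingerprint of the page as a whole (case and spacing normalised)."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha1(normalised.encode("utf-8")).hexdigest()


def shingles(text, k=4):
    """Overlapping k-word portions of the page."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def resemblance(text_a, text_b, k=4):
    """Share of portions the two pages have in common (Jaccard overlap)."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


original = "the quick brown fox jumps over the lazy dog"
reworded = "the quick brown fox jumps over the sleepy cat"

print(page_fingerprint(original) == page_fingerprint("The  quick brown fox jumps over the lazy dog"))
print(resemblance(original, reworded))
```

All of this can be computed once per page at index time, which fits the observation that this second filter runs before ranking rather than on the SERPs.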
Dave Hawley
12-16-2004, 07:02 AM
The most common misuse of the word "filter" IMO, is when users say things like, 'I used to be on page 1 for <Search Term Here> now I'm on page x. Looks like I've copped some sort of a filter'.
A filter will not send a web page to another SERP, it will either hide it completely, or filter it out. Anything in-between means the "filter" has failed for whatever reason.
A good, easy-to-understand filter is Google's adult content filter. It attempts to stop adult content pages showing. It makes NO attempt to move them to page x; it simply stops the pages being shown in the SERPs.
It is possible that the Link: command also has some sort of filter to not show pages that meet a specified criterion.
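The adult content example can be sketched like this. It assumes each ranked page already carries an "adult" flag (a made-up field for the example) and shows that the filter only hides pages; the order of whatever remains is left alone.

```python
def adult_filter(ranked_pages, filter_on=True):
    """Hide flagged pages without touching the order of the rest."""
    if not filter_on:
        return list(ranked_pages)
    return [page for page in ranked_pages if not page["adult"]]


ranked = [
    {"url": "http://a.example/", "adult": False},
    {"url": "http://b.example/", "adult": True},
    {"url": "http://c.example/", "adult": False},
]

print([p["url"] for p in adult_filter(ranked)])
print([p["url"] for p in adult_filter(ranked, filter_on=False)])
```

Under this definition a page is either shown in its ranked position or not shown at all; there is no in-between.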
DaveN
12-16-2004, 08:34 AM
Dave, I think that the problem isn't the what is and what is not a "Filter" but the way it is used in the search engine industry...
Example: SPAM is UCE. Google decided to coin the phrase and use it as SE spam, but when I say "I spam Google", Joe Public thinks I send email.
Filter and Penalties :
My spin on this is that if you do something Google doesn't like, they penalise you (impose a penalty on; inflict punishment on) in some form, whether it's that you're banned altogether or that they add a certain filter. Like the adult filter you pointed out: if I run an adult site they mark it as adult, and the public can decide whether or not to show those results, as with allinurl:, allintext:, link:, site: and many more. But I'm sure that they have filters for other things. One that I know Yahoo does is that we can ban a site from the SERPs for a keyword, and I'm pretty sure Google can too. Do you remember how you could tell when Google hand-banned people? Well, they changed that to still show the URL, so the filter allowed the URL but nothing else.
if you have ever worked on the google API ..
Filter :
Activates or deactivates automatic results filtering, which hides very similar results and results that all come from the same Web host. Filtering tends to improve the end user experience on Google, but for your application you may prefer to turn it off.
so again Google use the word filter :
Google use the terms Penalty and Filter in reference to the SE and we should not try to confuse webmasters by calling them something else.. imo
DaveN
Nice post DaveN, I admire your restraint.
I look at it this way: a penalty is something that needs lifting; it is sort of off or on. A filter is something that is adjustable, more like a slider.
DaveN
12-16-2004, 12:37 PM
Nffc, I see it the other way lol: a filter is either on or off
adult filter on / off
Now a penalty in my eyes is when:
When the index servers consult an inverted index, they map each query word to a matching list of documents and create a hit list. Then the index servers determine a set of relevant documents by intersecting the hit lists of the individual query words, and they compute a relevance score for each document. This relevance score determines the order of results in the SERPs. Before passing the results to the GWS, that's when a penalty is added IMO, reducing the page score, before producing the HTML of the SERPs.
DaveN
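The sequence described in the post above can be sketched roughly as below. It is a toy model only: the index is a plain dict of word -> {doc: score}, the relevance score is a simple sum, and the penalty is an assumed per-document multiplier applied just before the results would be handed on. All names are hypothetical.

```python
def search(index, query, penalties=None):
    """Intersect hit lists, score the matches, then apply any penalties.

    index:     word -> {doc_id: hit score for that word on that page}
    penalties: doc_id -> multiplier below 1.0 (assumed form of a penalty)
    """
    penalties = penalties or {}
    words = query.lower().split()
    if not words:
        return []

    # Map each query word to its hit list.
    hit_lists = [index.get(word, {}) for word in words]

    # Keep only documents that appear in every hit list.
    docs = set(hit_lists[0])
    for hit_list in hit_lists[1:]:
        docs &= set(hit_list)

    # Relevance score, reduced by a penalty where one applies.
    scored = []
    for doc in docs:
        score = sum(hit_list[doc] for hit_list in hit_lists)
        score *= penalties.get(doc, 1.0)
        scored.append((doc, score))

    # Highest score first; ties broken by doc id for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


index = {
    "blue": {"a": 2.0, "b": 1.0, "c": 3.0},
    "widgets": {"a": 2.0, "b": 1.5},
}
print(search(index, "blue widgets"))
print(search(index, "blue widgets", penalties={"a": 0.5}))
```

In this model the penalised page stays in the result set and simply drops in the order, which is the distinction being drawn against an on/off filter.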
ogletree
12-16-2004, 12:40 PM
A while back, during the Esmeralda update, GoogleGuy posted (http://www.webmasterworld.com/forum3/14443.htm) about filters and mentioned a Kalman Filter. He talked about knobs and how things move around. I found a good explanation of the Kalman Filter on another site (http://www.innovatia.com/software/papers/kalman.htm)
So it seems that G uses the word filter to mean things moving around and that they disappear. I could be wrong but that is how I see it.
One way to spot a filter is that your site does not show up for a normal search but does for an allin search or with &filter=0. The most common filter is that they will only show 2 results from one site on one page. You can use &filter=0 to get past that.
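The behaviour described above (two results per site unless the filter is switched off, plus hiding very similar results) can be sketched as follows. The function name, the result fields and the exact-match test on title and snippet are assumptions for the example; the filter_on flag simply plays the role of &filter=0.

```python
from urllib.parse import urlparse


def filter_results(ranked, filter_on=True, max_per_host=2):
    """Hide same-host overflow and near-identical listings from a ranked list."""
    if not filter_on:  # equivalent of adding &filter=0
        return list(ranked)

    shown = []
    per_host = {}
    seen_listings = set()
    for result in ranked:
        host = urlparse(result["url"]).netloc.lower()
        listing = (result["title"].strip().lower(), result["snippet"].strip().lower())
        if listing in seen_listings or per_host.get(host, 0) >= max_per_host:
            continue  # hidden, not re-ranked
        seen_listings.add(listing)
        per_host[host] = per_host.get(host, 0) + 1
        shown.append(result)
    return shown


ranked = [
    {"url": "http://a.example/1", "title": "Widgets", "snippet": "Buy widgets"},
    {"url": "http://a.example/2", "title": "Blue widgets", "snippet": "Blue ones"},
    {"url": "http://b.example/1", "title": "Widgets", "snippet": "Buy widgets"},
    {"url": "http://a.example/3", "title": "Red widgets", "snippet": "Red ones"},
    {"url": "http://c.example/1", "title": "Widget shop", "snippet": "All widgets"},
]

print([r["url"] for r in filter_results(ranked)])
print(len(filter_results(ranked, filter_on=False)))
```

Because this runs on the already ranked list, it needs no extra data from the index, only the URLs, titles and snippets that are about to be displayed.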
I am not a finesse SEO person. If you use brute force you don't have to worry about that stuff too much. All I worry about are outright bans.
>adult filter on / off
Nah, that's a switch. ;)
Dave Hawley
12-16-2004, 08:48 PM
Dave, I think that the problem isn't the what is and what is not a "Filter" but the way it is used in the search engine industry...
By definition, it can only be used to "filter". It would appear then that you basically agree with my definition of a Filter? That is, it will show, hide or remove based on a criterion.
A filter is something that is adjustable, more like a slider
Only the criteria would be adjustable. The Filter itself is on or off.
This Thread (http://forums.searchenginewatch.com/showthread.php?t=3279) seems to indicate that Google has placed a penalty of a drop in two PR points on the home page only on a site which has linked to bad neighborhoods, which is something new to me.
Granted, it was a year ago, but I removed a penalty for bad linking from a site, and at that time the entire site went to PR0.
DaveN
12-17-2004, 04:08 AM
they can also ban a home page or remove it, and leave all the internal pages intact... seen that many times
DaveN
glengara
12-17-2004, 05:12 AM
*penalty of a drop in two PR points on the home page only*
That's not much of a penalty is it?
In theory it mightn't even change its position in the SERPs unless they also discounted links/anchor text.
BTW Mel, that thread link doesn't seem to lead anywhere.
Ooopps, wrong URL. I will have to change it, thanks Glengara. Now that the correct URL is in place you can see that in addition to the PR drop the site also stopped ranking for all his keywords and the site pages seemed to be heading into the supplemental index.
glengara
12-17-2004, 07:57 AM
*in addition to the PR drop the site also stopped ranking for all his keywords and the site pages seemed to be heading into the supplemental index.*
Now that's more like it ;-)
glengara
12-17-2004, 11:52 AM
I suppose you can filter-in as well as out?
Hypothetically, Google could have decided a couple of years ago to widen the scope of determining link topicality from simply link anchor text to including, say, KWs in URLs or filenames in their allinanchor calculations.
If they had, I wonder where that might have led ;-)
orion
12-18-2004, 12:20 PM
A filter affects only certain searches; a penalty affects all searches. A ban is when your site is completely removed from the index.
I’m inclined to agree with ogletree’s differentiation of the scope of filters, penalties and bans when applied to web pages and searches. Well put, ogletree.
In a non-technical sense, filters remove or ignore something from something. Penalties do not necessarily involve removal or ignoring. They do require accountability and evaluation.
The general idea is that filters remove noisy items from a group of items. The group of items may/may not share commonalities.
Thus, a filter is a remover, while a penalty involves evaluation and accountability of something. These somethings are items.
Items can be objects, numbers, text, people, ideas, data, a signal, a response, functions, etc
Evaluation and accountability could involve points, weights, scorings, actions to be taken, etc.
Both have consequences. The expression “penalty filter” involves both; that a filter is in place and that it involves evaluation and accountability.
The expression “penalty function” is a special class of penalties.
All filters, penalties, penalty filters and penalty functions are meant to correct something.
DIFFERENTIATING THIS STUFF
The difference between the two is clear.
In IR, I prefer to use the expression
--“a stop words filter”
than
--“a stop words penalty”
In programming, instead of saying this
--“a find-and-replace penalty”
we say this
--“a find-and-replace filter”
In many sports, instead of saying this
--“he got a filter”
--“he was filtered out”
we say this
--“he got a penalty”
--“he was penalized”
(and misbehaviors or many penalties could amount to a ban)
In some instances the expression “penalty score” is sounder than “filter score”.
Now about Google’s and other search engines’ “filters”: I have a collection of such “secret” filters, debunked. A few of these are real, but most of the ones we read about on the Web exist only in the minds of netizens :)
Orion
AussieWebmaster
12-18-2004, 01:37 PM
It was a manual review penalty. It happened last weekend along with some other directories that were on the short list for selling placement for PR purposes. As a matter of fact it was posted at webworkshop as soon as it started, with all of John's listed pages slowly losing their cache and description, and it ended with almost every page in bluefind being placed in the supplemental index. From what I hear, it does not just stop there either.
Going to be a fun ride :)
I have BlueFind at 8 also....
AussieWebmaster
12-18-2004, 01:41 PM
The listings for web directory are really a poor example on Google's part... there are all sorts of small shopping directories and even sites that merely reference directories in the first 2 pages, and DMOZ comes in at 23rd!!!!
Marcia
12-19-2004, 04:36 AM
DMOZ comes in at 23rd
DMOZ isn't "optimized" to rank well. Maybe if people linked to specific categories using the right anchor text they would do better. ;)
ThouShaltSeo
12-19-2004, 09:45 PM
Being on page 13 is just as good as a ban, isn't it?
If you get zero users because of a Google 302 handling error yet you do great with &filter=0, a ban and a filter are essentially the same. Call it whatever you want, the end result is the same.
Actually, a ban would be better because you can clean up your act and ask to be re-included. Now, there's nothing we can do since chasing webmasters from Romania or Latvia is not that productive.
All I worry about are outright bans.
There does seem to be something funny going on at Bluefind.com. A site:www.bluefind.com search returns only 22 pages out of 19,700 before you have to click the link in this message:
In order to show you the most relevant results, we have omitted some entries very similar to the 22 already displayed.
If you like, you can repeat the search with the omitted results included.
And even then, all the pages that I have found so far seem to be listed as supplemental pages.
Marcia
12-21-2004, 04:33 AM
Mel, it came up in discussion about a month ago
http://forums.searchenginewatch.com/showthread.php?p=24631
Here's another thread that was started on it, trying to look into it further:
Google duplicate content, internal linking and topical content issues (http://forums.searchenginewatch.com/showthread.php?t=3004)
seobook
12-21-2004, 04:41 AM
Maybe if people linked to specific categories using the right anchor text they would do better. ;)
<a href="http://www.dmoz.org/.../topic">Corrupt Topic</a>
http://www.corruptdmozeditor.com/ :D
Marcia
12-21-2004, 05:04 AM
Funny, but I was being tongue in cheek about it. :)
The point is, that some sites are obviously SEO'd and some aren't (like DMOZ). If there's ever human review, I would imagine the degree of SEO that's evident could make a difference.
Mel, I'll try to dig out that research paper I came across, it may help shed some light.