MSN Says It Will Stop Search Spam -- Will It?
dannysullivan
06-10-2004, 02:14 PM
Saw this article mentioned (http://www.searchenginelowdown.com/2004/06/microsoft-claims-new-msn-search-will.html) in Andy Beal's blog: MSN Search claims to freeze out web spam (http://www.pcw.co.uk/News/1155758).
We'll see. Sadly, I think spam will always be a part of crawler-based results. Google is famous for having said, when it started out, that it didn't need to worry about spam. Today, it's a major problem for them -- just like everyone else.
I set up a poll in this thread, in case you want to vote on whether MSN will stop search spam in its tracks.
rustybrick
06-10-2004, 02:22 PM
That is funny, I think Paul Gardi or someone from Teoma at the NYC SES conference said that they don't really need to worry about spam either.
If MSN pulls it off, then they win. I doubt they can. No one solved the email spam problem, how can they solve search spam? People are happy with 90% spam filtering. 100% spam filtering, IMO, will never happen. (never say never)
St0n3y
06-10-2004, 02:30 PM
Pretty vague. Do they distinguish between legit and non-legit pages that try to get keyword links pointing to them? I fear that if spam filtering goes too far then lots of legit stuff will go along with it. Imagine if our air filters did that. Get rid of all the junk in the air, but OH, a lot of Oxygen doesn't get through either. :)
David Wallace
06-10-2004, 02:55 PM
I think the only way any search engine can effectively stop spam is to have an empty index! :D
Alavina
06-10-2004, 04:51 PM
>I think the only way any search engine can effectively stop spam is to have an empty index! :D
A bit less radical is a human edited index...
Back to the topic: why should MSN be any better than the others? Get rid of spam, you mean like they did with spam mail on Hotmail? :p
Nacho
06-10-2004, 05:07 PM
It's all about human resources, and all search engines have enough geniuses to figure them out. I don't think MSN has a better chance. It's like poker, everyone has a chance to beat the odds, but they are all in the same battle.
>Sadly, I think spam will always be a part of crawler-based results
Then why be sad, it's a part of the search landscape, accept it, learn to love it and move on with your life ;)
IMHO the problem with "spam" is real simple: the search engines fail to accept that people will try to "game" the system. That's a fault of their shortsighted view and not a fault of the www or the webmasters who control it.
MSN Search may have more of a chance of "stopping spam" than all those who have failed before; at least they seem to be crawling and indexing the real web. Testing a new search engine on the Stanford web isn't really good preparation for the real world.
I think it's time we stopped cutting slack for those Scooby Doo Engines [we would have gotten away with it if it wasn't for those pesky kids]. It's their algo, they choose to make money from our content, so let them bear the algo and financial costs. I don't recall getting the offer of a cut of the many IPO deals past and present.
My tolerance for what some call search engine spam is real simple: if it ever exceeds what I see in my home town's offline Yellow Pages [I'm a UK guy] then I quit the game. So far it hasn't even come close.
But the short answer is No, of course they won't.
K.S. Katz
06-10-2004, 06:19 PM
It's a pretty bold statement considering how bad MSN's spam filters are for email. I get more spam in my hotmail address than all of my other addresses combined.
>That is funny, I think Paul Gardi or someone from Teoma at the NYC SES conference said that they don't really need to worry about spam either.
The reason Teoma and some of the other search engines have less of an issue with spam than Google is the volume of traffic. The more search traffic you get, the more likely your index is to be targeted for spam.
Spam is here to stay. :(
Chris_D
06-11-2004, 09:45 AM
>whether MSN will stop search spam
More sizzle and FUD from the 'gunna' of Redmond.
One day, I'd just love a journalist to look Gates & Ballmer in the eyes and point out that they'd actually need to have an operational, launched search engine to make this story even half interesting......
"One day - I'm gunna....." well - that's the day I'd like to talk about it.
Sizzle and FUD. Gunna. Microsoft's been 'gunna' get serious about search since 1998..... it's time to just show me the steak - and stop telling me about the sizzle!! :)
pleeker
06-11-2004, 02:24 PM
They'll come up with some convoluted, Redmond-speak definition of "search engine spam" and use it to support their pre-launch claims.
Daria_Goetsch
06-11-2004, 02:37 PM
Since they tend to use other people's ideas to build products, I don't think they'll come up with something original to fight spam.
dannysullivan
06-14-2004, 01:40 PM
You can read the actual research here (http://research.microsoft.com/research/sv/PageTurner/), thanks to a link found (http://www.webmasterworld.com/forum97/88.htm) by msgraph over at WebmasterWorld.com.
Look in the Further Reading section for the article/paper called, Spam, Damn Spam, and Statistics: Using Statistical Analysis to Locate Spam Web Pages.
Having looked through it, it focuses on finding machine-generated spam through statistical analysis, discovering pages that all seem to fall in a particular type of group.
Really didn't feel like anything groundbreaking, and there are types of spam it probably won't stop. But the paper itself doesn't actually say it will stop spam in its tracks, only that the research would be useful in finding this particular type.
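To make the general idea of "statistical analysis" a bit more concrete, here is a minimal sketch of flagging pages whose value for one feature sits far away from everything else. It is not the paper's method. The feature (words per page), the host names, the numbers, and the function name flag_outliers are all made up for illustration, and the scoring rule is a generic median-based outlier score that I have swapped in.

```python
from statistics import median

def flag_outliers(values, threshold=3.5):
    """Return the keys whose value lies far from the median.

    Uses a median-based (modified z-score) rule. This is a generic
    outlier test, assumed here for illustration only.
    """
    nums = list(values.values())
    med = median(nums)
    # Median absolute deviation: a spread measure that one huge value cannot distort.
    mad = median([abs(v - med) for v in nums])
    if mad == 0:
        return set()  # All values (nearly) identical, so nothing stands out.
    return {key for key, v in values.items()
            if 0.6745 * abs(v - med) / mad > threshold}

# Hypothetical words-per-page figures for a handful of hosts.
word_counts = {
    "a.example": 420,
    "b.example": 510,
    "c.example": 380,
    "d.example": 460,
    "e.example": 445,
    "spam1.example": 9000,
}

suspects = flag_outliers(word_counts)
print(suspects)
```

A rule like this only catches pages that look unusual on the feature being measured, which fits the point that there are types of spam it probably won't stop.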
andrewgoodman
06-14-2004, 03:46 PM
I think we'd all agree that if you adopt paid inclusion of a certain type, you can either reduce spam greatly or at least reduce it a fair bit, and then redefine the rest of the spam in the index as "violating the TOS" of the paid submission and vow to punish those responsible.
But that's just it. If you change the degree of difficulty, or change the terms of the debate, you can always claim to be doing something like "stopping spam."
As someone else mentioned, the best way to stop spam is to employ professional human editors. Even this has its drawbacks, as we've seen.
Now a highly bureaucratic system of "belonging" to a new search index that was only open to accredited businesses paying a very high annual fee and subject to rigorous background checks and... etc.... would reduce spam almost to nil, presumably, because you'd be creating a more closed environment. But then that wouldn't be the Internet, would it?
MSN has been hinting loudly for some time about its impending threat to Google. Where's the beef?
Personally, I don't know how they can "stop spam."
They may be able to reduce it significantly, but if you consider that spammers are out to make money (like the rest of us), they are going to do whatever they can to circumvent the system.
I remember watching a program where they talked to an email spammer who freely admitted what he did. He could blast a million emails per day and only got like a 2 or 3 percent return, but from that 2 or 3% he got paid many thousands of dollars per month for his service.
So, if that isn't an incentive to circumvent a system I don't know what is.
Lance Housley
06-16-2004, 07:55 AM
Insofar as search engines use links to quantify the relevance of webpages to an enquiry, then there is at least one drastic possibility for blocking spam.
Websites that provide links to spam sites must do it either because somehow the webmaster has been fooled about the content and / or relevance of the site he/she links to, or because the webmaster of the linking site is complicit in the spam operation. For a good while now, routine advice has been not to sign up with link farms, so there can be some justification for penalising those webmasters who do opt in to link schemes.
That being so, once one has identified (by examining content) a substantial number of spam sites, it would be possible to block not only the identified sites, but also those other sites which link to them. A sizeable proportion of these linking sites may be assumed to be spam sites too, and many of the rest will be mastered by idiots - almost by definition, because they consistently ignore the advice not to sign up to link farms. Either way, the search user loses out only to a very limited extent (not having SERPs that contain spam sites or idiots' sites) even though some relevant results may also be blocked. :eek:
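The blocking rule described above can be sketched in a few lines. This is only an illustration under my own assumptions: the link graph is a plain dict, the site names are invented, and blocked_sites is a hypothetical helper, not anything a search engine is known to run. It goes one hop only, as in the description: identified spam sites plus the sites that link to them.

```python
def blocked_sites(links, known_spam):
    """Return known spam sites plus every site that links to one of them.

    links: dict mapping a site to the collection of sites it links to.
    known_spam: set of sites already identified as spam by examining content.
    """
    known_spam = set(known_spam)
    # A site is a "linker" if at least one of its outgoing links hits known spam.
    linkers = {site for site, targets in links.items()
               if set(targets) & known_spam}
    return known_spam | linkers

# Hypothetical link graph.
links = {
    "farm.example": {"pills.example", "casino.example"},
    "hobby.example": {"pills.example", "library.example"},
    "library.example": {"hobby.example"},
    "pills.example": set(),
}
known_spam = {"pills.example", "casino.example"}

blocked = blocked_sites(links, known_spam)
print(sorted(blocked))
```

In this made-up graph, hobby.example gets blocked for a single bad link, which is exactly the case where some relevant results may be lost along with the spam.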
However, the question remains - what is spam? The problem is very similar to the question of which plants are weeds. In gardening terms, weeds are simply plants which grow where they are not wanted. They are perfectly good plants in themselves, and valuable in their own place. This world needs biodiversity.
In the same way, many sites identified as spam are perfectly good sites in their own right, but are inappropriate in many of the SERPs they actually turn up in. Search engines strive to reduce this, but the SEO industry also needs to consider this. There are plenty of sites that really ought never to be top of the results lists, in the same way that some very useful books are never the first resource a reference librarian turns to. Such books normally don't even figure in small collections, but they are available when larger libraries are used, or when the enquirer turns to a specialist collection. Good publishers know this, and don't try to persuade small general libraries to purchase resources more appropriate for universities.
But the SEO industry seems to be dedicated to pushing every single site they can get their hands on into the top tranche of results. That's idiotic, but in view of this, why criticise the search engines when these sites do turn up in above-the-fold SERPs?
But it does suggest to me that there could be a very positive use for such devices as Google's "Supplementary results". Only, there are an awful lot of sites that ought to be in a "second-line" index, and one would need the option on a search page to select a button called "specialist results".
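As a rough sketch of what I mean by a "second-line" index with a "specialist results" button: the function below searches a main index by default and only adds specialist pages when asked. Everything in it is hypothetical (the search function, the include_specialist flag, the two toy indexes and their URLs), and it says nothing about how Google's Supplementary results actually work.

```python
def search(query, main_index, specialist_index, include_specialist=False):
    """Return matching URLs; specialist pages appear only on request.

    Each index is a dict mapping a URL to that page's text.
    A page matches when it contains every word of the query.
    """
    terms = query.lower().split()

    def matches(index):
        return [url for url, text in index.items()
                if all(term in text.lower() for term in terms)]

    results = matches(main_index)
    if include_specialist:
        # The "specialist results" button: append the second-line index.
        results += matches(specialist_index)
    return results

# Toy indexes with made-up pages.
main_index = {"guide.example/roses": "Growing roses for beginners"}
specialist_index = {"journal.example/rosa": "Roses rootstock grafting trial data"}

default_results = search("roses", main_index, specialist_index)
all_results = search("roses", main_index, specialist_index, include_specialist=True)
print(default_results)
print(all_results)
```

The specialist pages are still findable, in the same way the specialist books are still in the larger library; they just don't crowd the first shelf.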
What do I know about all this, though? Not much perhaps, and certainly I approach it from the point of view of the searcher (and as one who trains professional searchers). But I'm a reference librarian, and Google people have quite often talked about Google trying to emulate Reference Librarians, admitting that they're not there yet. Just take a look at Google's Man Behind the Curtain (http://news.com.com/2008-1024_3-5208228.html) to see what I mean. :)