Search Engine Watch
04-08-2005   #1
orion
Join Date: Jun 2004
Posts: 1,044
Deceiving Relevancy

One of the AIRWeb papers on spamming, courtesy of Garcia-Molina's group at Stanford (no stranger to Google), is
Web Spam Taxonomy

Here are some interesting lines:

ABOUT WHAT IS SPAMMING

"We use the term spamming (also, spamdexing) to refer
to any deliberate human action that is meant to
trigger an unjustifiably favorable relevance or importance
for some web page, considering the page’s true
value. We will use the adjective spam to mark all those
web objects (page content items or links) that are the
result of some form of spamming. People who perform
spamming are called spammers."

"One can locate on the World Wide Web a handful of
other definitions of web spamming. For instance, some
of the definitions (e.g., [13]) are close to ours, stating
that any modification done to a page solely because
search engines exist is spamming. Specific organizations
or web user groups (e.g., [9]) define spamming by
enumerating some of the techniques that we present in
Sections 3 and 4."

ABOUT THE PERCEPTION OF OUR SEO INDUSTRY


"An important voice in the web spam arena
is that of search engine optimizers (SEOs), such
as SEO Inc. (***//***.seoinc.com) or Bruce Clay
(****//***.bruceclay.com). The activity of some SEOs
benefits the whole web community, as they help authors
create well-structured, high-quality pages. However,
most SEOs engage in practices that we call spamming.
For instance, there are SEOs who define spamming
exclusively as increasing relevance for queries not
related to the topic(s) of the page. These SEOs endorse
and practice techniques that have an impact on importance
scores, to achieve what they call “ethical” web
page positioning or optimization. Please note that according
to our definition, all types of actions intended
to boost ranking (either relevance, or importance, or
both), without improving the true value of a page, are
considered spamming."

ABOUT DECEIVING SEARCH ENGINES THAT IGNORE IDF TERM VECTOR MODELS

(Mostly the poorly programmed ones.)

"With TFIDF scores in mind, spammers can have two
goals: either to make a page relevant for a large number
of queries (i.e., to receive a non-zero TFIDF score), or
to make a page very relevant for a specific query (i.e.,
to receive a high TFIDF score). The first goal can be
achieved by including a large number of distinct terms
in a document. The second goal can be achieved by repeating
some “targeted” terms. (We can assume that
spammers cannot have real control over the IDF scores
of terms. Moreover, some search engines ignore IDF
scores altogether. Thus, the primary way of increasing
the TFIDF scores is by increasing the frequency of
terms within specific text fields of a page.)"
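
The two goals in the quoted passage can be illustrated with a rough sketch. This is not the paper's scoring function or any search engine's: it assumes raw term counts for TF and a smoothed logarithmic IDF, and the names tfidf_score, corpus, honest, stuffed and broad are made up for the example. The use_idf flag mimics an engine that ignores IDF altogether.

```python
import math
from collections import Counter

def tfidf_score(query, doc, corpus, use_idf=True):
    """Sum of TF(term, doc) * IDF(term) over the query terms.

    Assumptions for illustration only: TF is the raw term count,
    IDF is a smoothed log ratio, and tokens are split on whitespace.
    """
    tf = Counter(doc.lower().split())
    n_docs = len(corpus)
    score = 0.0
    for term in query.lower().split():
        if use_idf:
            # Number of corpus documents that contain the term.
            df = sum(1 for d in corpus if term in d.lower().split())
            idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        else:
            # An engine that ignores IDF treats every term equally.
            idf = 1.0
        score += tf[term] * idf
    return score

# Hypothetical toy corpus and pages.
corpus = [
    "cheap flights to paris",
    "hotel deals in rome",
    "train tickets to berlin",
]
honest = "cheap flights to paris"
stuffed = honest + " cheap cheap cheap"        # repeat a targeted term
broad = honest + " hotel rome train berlin"    # add many distinct terms

print(tfidf_score("cheap", honest, corpus), tfidf_score("cheap", stuffed, corpus))
print(tfidf_score("hotel", honest, corpus), tfidf_score("hotel", broad, corpus))
```

Repeating "cheap" raises the score for that one query, while adding unrelated terms gives the page a non-zero score for queries it did not match before. In neither case does the page author touch the IDF side, which is the point the authors make.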

OTHER SPAM TACTICS RELATED TO...

BODY
TITLES
META TAGS
URLS
ANCHOR TEXT/TAGS
DRIVE-BY TERMS DROPPING

etc.

Enjoy it.

I have been in communication with Baeza-Yates and Brian Davidson. More outreach activities are coming. These may develop into a better understanding of the perceptions on both sides of the fence (IR and SEO colleagues). Sorry for the redundancy and wordiness.

Orion

Last edited by orion : 04-08-2005 at 02:33 PM.
04-08-2005   #2
orion

Forgot to mention deceiving PageRank, and link farms now renamed as "link building" strategies.

Now this Acknowledgment is hilarious:

"This paper is the result of many interesting discussions
with one of our collaborators at a major search engine
company, who wishes to remain anonymous. We would
like to thank this person for the explanations and examples
that helped us shape the presented taxonomy
of web spam."

HUM!, Stanford, Garcia-Molina, and some of the reference papers...?

C'mon, children.

Orion
04-08-2005   #3
NFFC
"One wants to have, you know, a little class." DianeV
Join Date: Jun 2004
Posts: 468
I'm not sure what you think of SEOs, but that paper is way old; it's been read and dissected at many meets.

>HUM!, Stanford, Garcia-Molina, and some of the reference papers...?

It was Yahoo.
04-08-2005   #4
orion

FYI, NFFC

The paper in question was taken from Gary Price

http://blog.searchenginewatch.com/blog/050407-190947

Gary writes

Updated Research Paper: A Taxonomy of Web Spam

A week ago, Chris blogged about the First International Workshop on Adversarial Information Retrieval on the Web that will be part of the WWW2005 Conference next month in Japan.

One of the papers that will be presented at the conference: Web Spam Taxonomy, by Zoltán Gyöngyi and Hector Garcia-Molina from the Stanford Database Group has been updated and is now available full text (9 pages; PDF) online.

It's a very interesting read.

From the abstract:


Web spamming refers to actions intended to mislead search engines into ranking some pages higher than they deserve. Recently, the amount of web spam has increased dramatically, leading to a degradation of search results. This paper presents a comprehensive taxonomy of current spamming techniques, which we believe can help in developing appropriate countermeasures.


Posted by Gary Price on Apr. 7, 2005 |


Apr 7, 2005 does not sound old to me. The material and issues at hand are indeed as old as the SEO industry.

Orion

PS

Furthermore, check the following reference dates in the paper.

Monica Bianchini, Marco Gori, and Franco
Scarselli. Inside PageRank. ACM Transactions
on Internet Technology, 5(1), 2005.

Zoltán Gyöngyi and Hector Garcia-Molina. Link
spam alliances. Technical report, Stanford University,
2005.

Last edited by orion : 04-08-2005 at 04:20 PM.
04-08-2005   #5
rcjordan
There are a lot of truths out there. Just choose one that suits you. -Wes Allison
Join Date: Jun 2004
Posts: 279
Page 7: The second data set (DS2) was the result of a single breadth-first search started at the Yahoo! home page, conducted between July and September 2002.
04-08-2005   #6
orion

This paper, like many to be presented at AIRWeb, is a recap of how IRs perceive SEOs and these issues, which are old, of course.

I have spent a good part of my time over the last few weeks discussing with AIRWeb folks some of the material to be presented at the event. They got stuck with IRs submitting old things to a new event. That explains everything.

Orion
04-08-2005   #7
NFFC
>Apr 7, 2005 does not sound old to me.

In the SEO world it's ancient history; tomorrow is all that counts.

Point taken though, it is a modern day rehash of any old paper, one I've read many times. Nothing in your summation suggested anything new.

>I have spent a good part of my time over the last few weeks discussing with AIRWeb folks some of the material to be presented at the event.

I've spent a good part of mine discussing similar things, although we are looking at what we *think* will be in the 2006 one. Chess and draughts?
04-08-2005   #8
orion

Ok, I let you save some face.

Going back to AIRWeb, the problem is those papers were supposed to address new issues. Nothing new has been presented yet. We agree on that.

One more thing, what makes you think the Acknowledgement part of the paper quoted by Gary Price refers to Yahoo and not Google or who knows? On the other hand, Jordan has a good point with the Yahoo data.

Orion

Last edited by orion : 04-08-2005 at 05:48 PM.
04-08-2005   #9
NFFC
>Ok, I let you save some face.

I have no face, I only have rank. That is what defines me; learn to love it, as that is the SEO mindset. I rank, therefore I am.

>One more thing, what makes you think the Acknowledgement part of the paper quoted by Gary Price refers to Yahoo and not Google

I just know.
04-08-2005   #10
orion

Ok.

Orion