Search Engine Watch
Old 03-23-2005   #1
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
Can Tagging Help Search?

I blogged yesterday about how there's a lot of excitement over "tagging" that's in use in places like Flickr and Technorati. Now that Yahoo owns Flickr, some are wondering if tagging -- labeling posts, images and so on into different categories -- might help web search. Yahoo's game of photo tag is a good article from News.com that looks at this more and caused me to kick off my post. And Tags & Folksonomies - What are they, and why should you care? from Threadwatch is a nice roundup of what tagging is, if you need to come up to speed. Also see our other forum discussion, Questions about Ontologies/Taxonomies and Use in Modern Search.

In my post, I argue that we've had tagging of web pages for years and that the search engines don't use that information because it's not trustworthy. My feeling is that tagging is not somehow going to become a solution to better search relevancy even if it is "community driven" for all the same reasons -- it will ultimately be untrustworthy. Agree? Disagree? Please chime in!
Old 03-23-2005   #2
Alan Perkins
Member
 
Join Date: Jun 2004
Location: UK
Posts: 155
Quote:
OK, but those are self-provided tags! What if we let a community do tagging. Hey, the community already does that through links. Links are a form of tagging pages. And what have we found? Links will get misused, if there's a possible financial gain involved.

Mark me extremely dubious that tagging will make major inroads in improving search. And if I'm wrong, I'll happily mea culpa. But after 10 years of tagging, the experience so far gives me good reason to be dubious.
I'm in complete agreement with your position on this.
Old 03-23-2005   #3
Sebastien Billard
French SEO blogger and consultant
 
Join Date: Oct 2004
Location: Lille, France
Posts: 39
Tagging works when people are honest. But honesty leaves when profit is involved, so tagging would be spammed like links are today.

Still, a little bit of tag analysis could be used, IMHO, but it should not be decisive. Just one criterion in a million (or 100).
Old 03-23-2005   #4
DarkMatter
Master Blaster
 
Join Date: Feb 2005
Location: New Jersey,USA
Posts: 137
authoritative tagging?

What if there was a reputable website/organization that would apply these tags according to certain criteria independent of the intentions of the blogger/content creator/website?

Imagine I create a blog, and I activate this "authoritative tagging" option (which could be a component of the blog software). Now, when I make a post, the RSS feed goes to the "tag authority" site and enters a queue for eventual evaluation by either a human editor or a piece of software, which then somehow transmits the unbiased tag entry to my blog and labels the post accordingly.

Hmmm the more I think about it, the logistics of that makes it nearly impossible to do (at least in a timely manner). It might work better if you just had some kind of software built into your blog that could evaluate and tag it (the same way adsense scans your page to discover the topic). So in this case you might choose to activate the "Google tagger" in your blog, which maybe would warrant a higher ranking in the search results.

This might actually work pretty well, just paste the code on your page (like adsense) and let Google label it. Or would this be redundant since basically G does that anyway when they spider your page?
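
A minimal sketch of the "software built into your blog" idea above, assuming nothing more than a plain word-frequency count. This is only an illustration, not how AdSense or Google actually label pages; the function name `suggest_tags` and the tiny stop-word list are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real tagger would need a much larger one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "on", "for", "with", "my", "at"}


def suggest_tags(text, max_tags=3):
    """Return the most frequent non-stop-words in `text` as candidate tags."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:max_tags]]


post = "My cat sat on the brick wall. The cat likes the wall, and the wall is warm."
print(suggest_tags(post))  # ['wall', 'cat', 'brick']
```

Because the tags are derived from the page text itself, this adds nothing a spider could not already work out, which is the redundancy raised in the question above.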
Old 03-23-2005   #5
Alan Perkins
Member
 
Join Date: Jun 2004
Location: UK
Posts: 155
Quote:
Originally Posted by DarkMatter
Hmmm the more I think about it, the logistics of that makes it nearly impossible to do (at least in a timely manner).
I agree with your self analysis.
Quote:
Or would this be redundant since basically G does that anyway when they spider your page?
Yep.

Taking your ideas as a whole, however, it would be possible for some AdSense-like organization to autogenerate tags for a third party. Tagging text is not where the real difficulties lie, though. Images, music, video and other binary formats are much more difficult.
Old 03-23-2005   #6
Webvisitor
Member
 
Join Date: Jun 2004
Location: NearYosemite
Posts: 107
Tagging could be the ultimate authority of organic search. Take the concept of Furl or del.icio.us: if I tag a page and a hundred other persons in my demographic do so, you have a consensus. I suggest data organization via tagging is the purest form of data aggregation, truly organic, more pertinent than PageRank, where bought or irrelevant links can dominate.
I believe advertisers will ask for tagged data from aggregators first when the data can be properly presented.
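
To make the consensus idea concrete, here is a rough sketch that accepts a tag for a page only once enough distinct users have applied it. The event format, the `consensus_tags` name and the `min_users` threshold are all assumptions made for illustration; neither Furl nor del.icio.us is known to work this way.

```python
from collections import defaultdict


def consensus_tags(events, min_users=3):
    """events: iterable of (user, url, tag) tuples.

    Return {url: sorted list of tags applied by at least `min_users`
    distinct users}. Repeated tagging by one user counts only once.
    """
    users_by_tag = defaultdict(set)
    for user, url, tag in events:
        users_by_tag[(url, tag.strip().lower())].add(user)

    accepted = defaultdict(list)
    for (url, tag), users in users_by_tag.items():
        if len(users) >= min_users:
            accepted[url].append(tag)
    return {url: sorted(tags) for url, tags in accepted.items()}


url = "http://example.com/cat.jpg"  # hypothetical page
events = [("alice", url, "cat"), ("bob", url, "Cat"), ("carol", url, "cat")]
# One account repeating the same tag fifty times still counts as one user.
events += [("spammer", url, "ringtones")] * 50
print(consensus_tags(events))  # {'http://example.com/cat.jpg': ['cat']}
```

Counting distinct users rather than raw tag events blunts the simplest form of spam, but it does nothing against someone who registers many accounts, which is why the size and policing of the community matter.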
Old 03-23-2005   #7
lots0
 
Posts: n/a
If the short history of the www shows us anything, it shows us that if any weight is given to these “tags” in any major search engine algo, these tags will be spammed into uselessness in a very short time.

It sounds like a great idea, label your content appropriately...

I don’t know about anyone else, but I think appropriately labeling your content is just a basic part of website design.
Old 03-23-2005   #8
DarkMatter
Master Blaster
 
Join Date: Feb 2005
Location: New Jersey,USA
Posts: 137
I think you're right webvisitor, community input is probably the best way to get accuracy in tags. But I would tend to think it would only work well if you had a large community actively tagging....too small a sample would be easy to manipulate artificially. Most websites probably don't have large active communities like that.

Take wikipedia for example: the pages with only one or two people working on them are more likely to be skewed to one person's opinion, but the pages with many active users will be able to maintain more of a general consensus.
Old 03-23-2005   #9
Alan Perkins
Member
 
Join Date: Jun 2004
Location: UK
Posts: 155
Hmm, if the community of people doing the tagging have to pay for the privilege, or jump some equally onerous membership barrier, and that community is policed to the extent that the burden of often being booted out and rejoining is too great to make it worth spamming, then the tagging may work.
Old 03-23-2005   #10
Webvisitor
Member
 
Join Date: Jun 2004
Location: NearYosemite
Posts: 107
Quote:
Originally Posted by DarkMatter
But I would tend to think it would only work well if you had a large community actively tagging....too small a sample would be easy to manipulate artificially. Most websites probably don't have large active communities like that.

I agree, DarkMatter: it will only work well if a large community, i.e. an SE, aggregates the tags, and yes, it would only work with a representative "larger" sampling.
I disagree with the assertion made that spamming would be a huge issue. Spammers don't work hard enough to spoil or seed a tagged subject to the degree it would alter the data.
I am watching Furl and how LookSmart will use the data from Furled pages. Del.icio.us is not aligned with an SE so it will be more difficult to measure how those tags/bookmarks are used.
Old 03-23-2005   #11
xan
Member
 
Join Date: Feb 2005
Posts: 238
I don't think anything easily manipulated would be used. This means the search engines have to index these in a chosen way. Classification again. It's already used in the backend. I think it could be extended though, just not manually or by the site/blog owners.

well that's my idea anyway
Old 03-23-2005   #12
lots0
 
Posts: n/a
Quote:
I disagree with the assertion made that spamming would be a huge issue.
To support my point that these tags would be spammed, all you have to do is look at how meta description and keyword tags have been abused and basically discounted by the search engines.

IMO, these "new" tags are nothing more than an extension of the meta description and keyword tags.
Old 03-23-2005   #13
Nacho
 
 
Join Date: Jun 2004
Location: La Jolla, CA
Posts: 1,382
I very much like Nick's description of tags:
Quote:
So What Makes Tags Important?
Simply put, tags are important because they allow your users to generate content and classify that content in their own unique way.
I see it as a way to increase semantic associations and connectivity between webpages by better organizing and defining the information on the page. Can this be abused, leading SEs to define it as "not trustworthy"? Yes, like pretty much everything that happens on the web already.

I agree with you, Danny, that tags are not ready for the web today, but hopefully they can be improved so that they become trustworthy in the future. I believe that it's in everyone's (search engines, webmasters and users) best interest to see search engine results improve for any given query through algorithmic definition that excludes manual editorial review.
Old 03-23-2005   #14
xan
Member
 
Join Date: Feb 2005
Posts: 238
This is all about the "semantic web" again. I hate the definition; it's silly.

In digital libraries everything is "tagged". There are lots of different ways to do this, like Dublin Core, OWL, RDF, ...

I can't see how it can be used any time soon on the public web, as it's not standardized for it, it's not even researched properly yet (there is still work), inference rules have to be imposed, ...

There's XML which has been a great idea, sowing seeds for progress, but it doesn't tell you anything about what the structure of the document means.

RDF (RDF/XML) solves this to an extent, as it is structured as one or more triples. A triple is: (1) the subject, (2) the property and (3) the actual value (each identified by a Uniform Resource Identifier (URI), although the value may also be a plain literal). With RDF a machine can recognise different sets of vocabulary as well.
URIs make sure that concepts are not words in a page but are linked to a unique definition so everyone can find it on the Web.
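
As a small illustration of the triple structure, the sketch below stores a few triples as plain Python tuples and looks up the values of one property. The photo URL is hypothetical; the property URIs are from the Dublin Core element set. It deliberately avoids any RDF library, so it shows the data model only, not RDF/XML syntax or inference.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI of the property
    obj: str        # the value (here, plain literals)


DC = "http://purl.org/dc/elements/1.1/"      # Dublin Core element namespace
page = "http://example.com/photos/cat.jpg"   # hypothetical resource

triples = [
    Triple(page, DC + "title", "Cat on a brick wall"),
    Triple(page, DC + "subject", "cat"),
    Triple(page, DC + "subject", "brick wall"),
]


def values(triples, subject, predicate):
    """Return every value recorded for (subject, predicate), in order."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


print(values(triples, page, DC + "subject"))  # ['cat', 'brick wall']
```

Because the property is a URI rather than a bare word like "subject", two sites using the same URI are unambiguously talking about the same kind of statement.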

To this you also have the subject of Ontologies, which I think there is a thread on already, or there's a lowdown on my blog.

So great, all these things exist but aren't practical because of manipulation, errors, no standards, etc...

People are talking about "the Semantic Web's unifying language" (which is the logical inferences made using rules and information such as those specified by ontologies). Things called proofs get exchanged between agents in order to make a decision on a result. Ontologies would exist for a large number of resources and would have to be brought together into a new model.
The semantic structure is just a foundation for complex A.I. techniques, which will decide what belongs where, what the relationships are, what's going on, who wants what and what is best, ...

It's possible, and it has been discussed, that digital signatures could be used to validate documents. In fact they could be parsed through a validation service between upload and loading. If markup like this is allowed, it will be difficult to manipulate, I think.

A lot of very famous faces like Susan Dumais for example do not believe in the "semantic web".

This is an excellent book, even though it was written in 1999 - how far have we really come?

Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor.
Tim Berners-Lee, with Mark Fischetti. Harper San Francisco, 1999.
Old 03-23-2005   #15
projectphp
What The World, Needs Now, Is Love, Sweet Love
 
Join Date: Jun 2004
Location: Sydney, Australia
Posts: 449
IMHO, tagging can, does and will help. The question is how much. When I see an ad for a restaurant proclaiming "best customer service ever", I am obviously sceptical. When I see a review in a newspaper proclaiming "best customer service ever", I am less sceptical and likely to believe it. When my best friend, whom I dine with regularly and who has similar expectations of customer service, says to me "best customer service ever", I probably believe it without question.

The level of trust in each case is important, but in all cases the fact the statement is made will have some impact on my decision to believe or not.

SE tagging could play a similar role. Use the info from tags, just don't trust it much. This would mean tags only have much influence when there is little else to go on. So, if I wanted a photo of the Concorde, tagging shouldn't be relied on (as there are better methods). If, as the old Steven Wright joke goes, I wanted a "... rare picture of Norman Rockwell beating up a child", then trusting tags is as good a way as any to source such a photo that probably doesn't exist.
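
One way to picture "use it, just don't trust it much" is a trust-weighted average of whatever evidence is available. The sketch below is purely illustrative: the `combined_score` name, the 0 to 1 scales and the example numbers are assumptions, not anything a search engine is known to use.

```python
def combined_score(signals):
    """signals: list of (score, trust) pairs, each value between 0 and 1.

    Return the trust-weighted average score, or 0.0 if there is no evidence.
    """
    total_trust = sum(trust for _, trust in signals)
    if total_trust == 0:
        return 0.0
    return sum(score * trust for score, trust in signals) / total_trust


tag_match = (1.0, 0.1)      # the tag says "perfect match", but tags get little trust
other_evidence = (0.2, 0.9)  # links and page text say "weak match", with high trust

# Plenty of other evidence: the tag barely moves the result.
print(round(combined_score([tag_match, other_evidence]), 2))

# Nothing else to go on: the tag decides the result on its own.
print(round(combined_score([tag_match]), 2))
```

With both signals present the low-trust tag is largely outvoted; when the tag is the only signal, it carries the whole score, which matches the rare-photo case described above.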

What would be great though, in terms of tagging, is more negative tags: noimageindex, nonewsindex, etc. That will help the indexing of stuff, because it will ensure that copyrighted material isn't searchable, and people are kept out of inappropriate indexes.
Old 03-25-2005   #16
andrewgoodman
 
 
Join Date: Jun 2004
Location: Toronto
Posts: 637
I love Flickr (what little I know of it as a newcomer to the service). I believe that tagging works well there but relies on the fact that it's a community of reasonable people.

Clearly SE's are coming back to "workarounds" that amount to reintegrating metadata & categorization into search where it failed before. Per my recent blog about wcities.com, they power Yahoo Travel > Restaurants and have some very cool info in their database, such as neighborhood, which is a pretty localized concept.

Google Local Business Center strikes me as a quiet entry into this realm as well. Having sites/businesses enter categorized info about themselves... it relies on them being trustworthy but if the format of the listings is different from general search, I'd be optimistic that it would be less spammable.

Certainly I doubt that many businesses would find it helpful to tag themselves as being located in "Parkdale" when in fact they are in "Cabbagetown."

So on the whole, I think that some ad hoc tagging is going to be helpful. It's going to be necessary to go in that direction because a unified metadata protocol or universal naming system just doesn't seem to be on the table. If you've taken a photo involving a cat standing near a brick wall, I think there is probably some good will there, you will probably tag your photo with cat and brickwall, and that you may not have a lot of incentive to also put j-lo ringtones cash viagra etc. in there. Plus, if the community is of reasonable size, I would think the organizers of something like Flickr could just bounce you for putting spam in tags. So much for purely automated search.
Old 03-29-2005   #17
NetinsertGuy
Organize the web - www.Netinsert.com
 
Join Date: Jun 2004
Posts: 11
We have believed in the concept of "tagging" since 1999, when we first sat down at the drawing board with the intention of designing a web directory that is self-organized by the webmasters themselves. Early on we concluded that the only feasible way of deferring categorization on a large scale to the web community is to use "tagging" in some form. The tag in question lets the webmaster label the web page with a value which refers to a subject category in a taxonomy. The tag can be retrieved by a robot and used by a "web directory engine" to organize the web in a fully automated process. We decided to use the meta tag as a data vehicle since it is unobtrusive, light-weight, and has a standardized name-value pair format.
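
For readers who want to see what "retrieved by a robot" could look like, here is a minimal sketch using Python's standard `html.parser`. The meta name `category` and the taxonomy value are hypothetical placeholders, since the post does not give the actual tag name or format.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collect the content of every <meta> tag with a given name."""

    def __init__(self, wanted_name):
        super().__init__()
        self.wanted_name = wanted_name.lower()
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name == self.wanted_name and attr.get("content"):
            self.values.append(attr["content"].strip())


def read_category(html, meta_name="category"):
    """Return the category values a page declares about itself."""
    reader = MetaTagReader(meta_name)
    reader.feed(html)
    return reader.values


# Hypothetical page carrying a hypothetical category tag.
page = '<html><head><meta name="category" content="Recreation/Pets/Cats"></head><body></body></html>'
print(read_category(page))  # ['Recreation/Pets/Cats']
```

A directory engine would then map each returned value onto its taxonomy and hand anything it cannot place to an administrator.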

So far our experiences with a community-driven, tag-based categorization of the web have not only met our expectations but far exceeded them:

1) 90-95% of all submissions are correctly classified
2) submissions that require administrator intervention mostly involve assigning a different classification. They rarely involve a delete, and almost never a block.
3) The general acceptance for tag based categorization is increasing. From an initial acceptance rate of less than 10% the acceptance rate is now over 50% (acceptance rate is measured as the number of listed web pages divided by the number of submissions per day)
4) the number of candidate categories is increasing (user suggested subcategories when a category is full)
5) Spam has not been as big a problem as one might have expected. Somehow the tag concept in combination with administrative supervision seems to prevent the worst forms of spam.
6) People want to be a part of the organization of the web

In our experience there is no doubt that a community driven tag based paradigm really does work and, to answer Danny's question, yes it can help web search. A bottom up user driven categorization scheme in combination with a top down key word algorithmic search may become a "meet in the middle" search tool that gives SERPs of unexpected quality to the end user.
Old 04-06-2005   #18
claus
It is not necessary to change. Survival is not mandatory.
 
Join Date: Dec 2004
Location: Copenhagen, Denmark
Posts: 62
Quote:
Originally Posted by dannysullivan
In my post, I argue that we've had tagging of web pages for years and that the search engines don't use that information because it's not trustworthy. My feeling is that tagging is not somehow going to become a solution to better search relevancy even if it is "community driven" for all the same reasons -- it will ultimately be untrustworthy. Agree? Disagree? Please chime in!
Meta tags are set by the webmaster and are as such "tainted", while tags used on, say, del.icio.us are set by a body of surfers more or less independent of the webmaster. So the latter is akin to linking while the former is akin to on-page factors, IMHO.

As for "trust" - well, that depends on who's doing the tagging, and how.

Try looking at some controversial self-tagging, say "flickr > porn" or something. Even with an adult filter turned way up, i wouldn't call this pr0n. Just like the days when people were putting the word "sex" in every meta tag to get traffic (i still see this on sites you wouldn't believe - last example i saw was a gardener web site).

Still, i believe there are some SEO benefits in all this nonsense, but that's another story. And besides being off-topic based on that, it doesn't really relate to the tagging itself either, more to the fact that a lot of people think they're fun to play around with (sort of like blogs, but different)
Old 04-12-2005   #19
michaelb
 
Posts: n/a
There is a community (of sorts) that uses meta tags

A number of posts in this thread talk about communities creating meta tags. If you can describe a bureaucracy as a community, then the Australian Government already does it. All Aus Gov websites have their high-level pages tagged with a Dublin Core set of meta tags. This doesn't help public search engines generally, but it does dramatically simplify maintenance of what are called the entry points: their content and search engines use the meta tags from .gov.au domains.
The entry point is www.australia.gov.au and the standards are at www.agls.gov.au. AGLS stands for Australian Government Locator Service. The driving force behind it was the government archives, whose record-keeping mandate was being seriously threatened by online-only information.
There is another entry point www.business.gov.au which pre-dates agls. It used a different voluntary standard to bring together government web sites dealing with business in Oz.
The thing is though, that if the bureaucrats didn't tag the page, you'll never find it from the entry point - but Google will if you look hard enough.
Old 04-14-2005   #20
hjalli
Founder of Spurl.net and Zniff.com
 
Join Date: Aug 2004
Location: Iceland
Posts: 2
An article about tagging and human information in search

Some of you might be interested in this article I posted on my blog yesterday on the effects of tagging, meta tags and human information on search:

Coming to terms with tags: folksonomies, tagging systems and human information