Search Engine Watch
SEO News

Old 03-01-2005   #1
rustybrick
 
Join Date: Jun 2004
Location: New York, USA
Posts: 2,810
Indexing Summit - SES NYC 05

This is Danny Sullivan's pet session. He introduced the session as covering the issues with link spam and other types of spam. Danny said he had wanted a noindex tag for specific sections of a page instead of the nofollow tag. Matt Cutts spearheaded the nofollow tag. He discussed the forum thread on this. On the panel are Ask Jeeves, MSN, Google and Yahoo!. By the way, I have Kim Krause & Bill S. on my right from Cre8asite, and randfish, orion, Mike Grehan and Christine Churchill on my left.

Matt Cutts from Google was up first and showed a slide of guest-book spam. He explained that this kind of link is not a true vote. What they needed was a way for webmasters to mark up links on their site to say "I did not vouch for this link." Danny then had an indexing summit article, and then they contacted weblog companies, then asked Yahoo, MSN and Ask Jeeves for support (MSN & Yahoo supported it). It has only been 6 weeks since it was implemented and they have already seen a positive impact. He then showed the nofollow tag, which looks like <a href="http://www.example.com/" rel="nofollow">discount pharmacy</a>. He then showed about 20+ companies (search and blogs) that support this tag. It's better than not having it, he said. Spammers hate it, he said, just like werewolves hate silver bullets (I believe he made a comment towards Nick Wilson about his blog and spammer followers hating it - Nick, eat that up please). Spammers are shifting towards different types of spam and moving toward smaller blog packages. There are now better lines of communication between software makers and search engines. Yahoo hosted a web spam "squashing" summit last week. "We're open to future cooperation," he said.
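
For anyone who wants to see what the markup means on the publishing side, here is a minimal sketch in Python, assuming a site builds its comment links through one helper. The function name `render_untrusted_link` is made up for illustration and does not come from the session or from any blog package.

```python
from html import escape


def render_untrusted_link(url: str, text: str) -> str:
    """Render a user-submitted link that the site does not vouch for.

    Hypothetical helper: both values are HTML-escaped, and rel="nofollow"
    is always added, matching the markup shown on Matt's slide.
    """
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True), escape(text)
    )


link = render_untrusted_link("http://www.example.com/", "discount pharmacy")
print(link)
```

A site that polices its own comments could just as easily skip the attribute for trusted users; that is a policy choice, not something the tag itself decides.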

Tim Mayer from Yahoo! was up next with his "Comment Spam Proposal." He said Yahoo! came up with a slightly different proposal than Google. Yahoo! just rolled out support for the nofollow tag LAST NIGHT, so expect to see changes in the index shortly. He talked about the summit they held at Yahoo! and said it was weird having Matt Cutts on the Yahoo! campus. The key thing is to solve the exploitation of publicly modifiable areas on prominent sites. He says nofollow is not a semantic tag; it's not descriptive of the content. Yahoo! recommends marking off certain components of the page. They are proposing <div class='content-public'>...</div>, where content within the tag is publicly contributed by anyone. So he showed that you should put this tag around blog entries. Additional ones are <div class='content-nav'>...</div> and <div class='content-default'>...</div>. He then highlighted the nav and ads on the SEW site and said you would block out those. He said there is also the possibility of using link-level tags (more granular control), <a href="..." rel="content-public">. That is the Yahoo! proposal.
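
To make the proposal concrete, here is a rough Python sketch of how a consumer of such pages might separate text by block type. The three class names are the ones from Tim's slides; everything else (the class `BlockTextCollector`, the function `split_blocks`, the sample page) is my own assumption and says nothing about how Yahoo! would actually process these tags.

```python
from html.parser import HTMLParser

# Class names as reported from the Yahoo! proposal.
BLOCK_CLASSES = {"content-public", "content-nav", "content-default"}


class BlockTextCollector(HTMLParser):
    """Collect visible text per proposed block class (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.stack = []  # one entry per open <div>: its block class or None
        self.text = {name: [] for name in BLOCK_CLASSES}

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            label = next((c for c in classes if c in BLOCK_CLASSES), None)
            self.stack.append(label)

    def handle_endtag(self, tag):
        if tag == "div" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Attribute text to the innermost enclosing labelled block, if any.
        label = next((item for item in reversed(self.stack) if item), None)
        if label and data.strip():
            self.text[label].append(data.strip())


def split_blocks(html):
    """Return a dict mapping each block class to the text found inside it."""
    collector = BlockTextCollector()
    collector.feed(html)
    collector.close()
    return {name: " ".join(parts) for name, parts in collector.text.items()}


page = (
    "<div class='content-nav'><a href='/'>Home</a> <a href='/forums'>Forums</a></div>"
    "<div class='content-default'><p>Session notes go here.</p></div>"
    "<div class='content-public'><p>Great post!</p></div>"
)
blocks = split_blocks(page)
print(blocks["content-default"])
```

The point of the sketch is only that the tags describe the content; what an engine does with each block is left to the engine.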

Kaushal Kurapati from Ask Jeeves was next up; remember, Ask did not join forces with Google, Yahoo, and MSN. He gave a brief overview of Jeeves and how Ask Jeeves works. Crawler goals: (1) follow the robots.txt standard; (2) politeness: crawl delay, noarchive, noindex, nofollow; (3) efficiency: use compression methods and do not crawl duplicate pages. Indexing overview: they index html, pdf, flash, ms-office, etc.; freshness comes through date stamping content; and completeness is important (site maps help). He gave some generic tips on how to use links and content. Challenges include JavaScript, dynamic pages, and long pages. They removed paid site submission. They say, don't buy links, it won't help. Do parked domains help? Nope. They want unique content. The trends for Jeeves: personal indexing with My Jeeves, which is a personal crawl (in a sense), and social tagging, meaning how people collectively refer to a page, which gives more fodder for indexing.
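
As a small illustration of the first two crawler goals (robots.txt and crawl delay), here is a sketch using Python's standard `urllib.robotparser`. The robots.txt content and the bot name `ExampleBot` are invented for the example; this is not how Ask Jeeves' crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; "ExampleBot" is a made-up crawler name.
ROBOTS_TXT = """\
User-agent: ExampleBot
Crawl-delay: 10
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A polite crawler checks permission before each fetch...
allowed = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")
blocked = parser.can_fetch("ExampleBot", "http://www.example.com/private/a.html")
# ...and waits at least this many seconds between requests to the site.
delay = parser.crawl_delay("ExampleBot")

print(allowed, blocked, delay)
```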

Eytan Seidman from MSN Search was last up. He was not asked to bring slides, so he was working from some notes. They started supporting nofollow about 2 weeks ago. They have full support for robots.txt and crawl delay. They first think about "discovering content" and whether they can leverage RSS to better discover new content. Once they have the content, how do they do a better job of interpreting that content? He said that with email spam there is a community approach to blocking it; can we do the same with web spam? The last thing is that people in forums have been asking for more tools to see what was indexed and not indexed and why. Please send feedback via the results page or contact page and keep it coming, they are reading....

Q & A:
Danny asked Qs to audience (percentages are all my estimates, I wonder if Danny got the same numbers):
- 40% in the room said they want better support of 301s
- Most said they want more feedback about their site, support.
- 10% Express indexing
- 10% Many tools are stripping out referral info (toolbars)
- 90% Duplicate content handling
- 20% Domain identity, I have 50 domain names all pointing to the same place
- 40% Weather reports, tell us when you're changing the algorithms
- 0% robots.txt more standardized
- 0% on finding search result pages on the search results
- 5% nofollow stuff
- 2% on dynamic url issues
- 0% trusted dates (page date stamping)
- 50% feel meta data should come back, is it coming back, should engines now support it more
- 40% are in favor of web spam reporting

Q: I asked a bit about block level link analysis based on Yahoo!'s proposal.
A: Tim Mayer said they are moving somewhat in that direction.

Q: Nacho asked, how do we authenticate your crawlers? Sometimes people spoof the crawler.
A: Tim Mayer said that you can authenticate via the IP address. Ask Jeeves agreed. Um, hire fantomaster's IP list.
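
For what IP-based authentication could look like in practice, here is a minimal Python sketch. The bot names and address ranges are placeholders (the ranges are reserved documentation blocks, not real crawler addresses), and `is_known_crawler_ip` is a hypothetical helper; real ranges would have to come from a list you trust.

```python
import ipaddress

# Placeholder data for illustration only: these are reserved documentation
# ranges, not addresses used by any real search engine crawler.
CRAWLER_RANGES = {
    "ExampleBot": [ipaddress.ip_network("192.0.2.0/24")],
    "OtherBot": [ipaddress.ip_network("198.51.100.0/24")],
}


def is_known_crawler_ip(bot_name, remote_ip):
    """Return True if remote_ip falls inside a range listed for bot_name."""
    address = ipaddress.ip_address(remote_ip)
    return any(address in network for network in CRAWLER_RANGES.get(bot_name, []))


# A request claiming to be ExampleBot is only trusted if its IP matches.
genuine = is_known_crawler_ip("ExampleBot", "192.0.2.17")
spoofed = is_known_crawler_ip("ExampleBot", "203.0.113.5")
print(genuine, spoofed)
```

The check is only as good as the list behind it, which is why the source of the IP ranges matters so much.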

Q: How about a relevance authority tag system? Like eBay reviews, etc. An independent score, authenticate. And then you want to quantitatively score that. And then a qualitative assessment.
A: Interesting ideas.

Q: Webby asked the next question. He thanked Google for the nofollow tag, but he still gets spam. Can you put a logo on a page that shows it's using the nofollow tag?
A: Matt said it will take time and people will learn.

Q: How do the crawlers actually treat the nofollow tag?
A: Matt said it's a good question. He said this is abstaining from a vote. Google specifically does not allow its crawlers to follow those links AT THIS TIME. Tim added that when the agreement was made, the engines did not decide on a common behavior. Tim didn't answer the question, I think he didn't know.
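
To picture the "abstain" idea, here is a Python sketch that splits a page's links into those that count as votes and those marked nofollow. This is only my reading of Matt's description; the class `LinkVoteCollector` and function `collect_votes` are hypothetical, and nothing here reflects how any engine really handles the tag.

```python
from html.parser import HTMLParser


class LinkVoteCollector(HTMLParser):
    """Separate ordinary links from links carrying rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.votes = []      # links the page vouches for
        self.abstained = []  # links marked nofollow

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.abstained.append(href)
        else:
            self.votes.append(href)


def collect_votes(html):
    """Return (votes, abstained) lists of hrefs found in the HTML."""
    collector = LinkVoteCollector()
    collector.feed(html)
    collector.close()
    return collector.votes, collector.abstained


sample = (
    '<a href="http://www.example.org/">a site I recommend</a>'
    '<a href="http://www.example.com/" rel="nofollow">discount pharmacy</a>'
)
votes, abstained = collect_votes(sample)
print(votes, abstained)
```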

Q: Danny Sullivan said it's sometimes easier to send you a plain text document of the page instead of tagging everything (hinting at cloaking).
A: Tim said it's a trust issue. MSN added that a programmatic method allows the engines, and not the publisher, to determine the content of the page.

Give me my "Link Love" - Matt Cutts's quote. Classic statement.

There were probably three questions about "commenting out the navigation". It's not commenting it out. It's basically telling the engines where your navigation is. So now it can be used to better determine the content on the page versus the crawling of the page (links).
Old 03-01-2005   #2
TheotherTim
Official Y! Rep
 
Join Date: Jun 2004
Location: Palo Alto, CA
Posts: 27
Hi Barry,
I do know the behavior that we implemented for the no follow tag. My proposal was about using semantic tags which describe content which allows the search engine to prescribe the appropriate behavior for each specific tag type. This behavior may or may not be different across the search engines. We just launched support for the no follow tag last night. We have a treatment we implemented but we may decide to change this as more people adopt the use of the tag. I would prefer that content publishers use the tag in a way that they are saying that this is untrusted or public content rather than trying to initiate a specific behavior by the search engine. That is why I felt it wasn't really consistent with my proposal to describe our implemented behavior.
Thanks for covering the sessions,
Tim
Old 03-01-2005   #3
Michael Martinez
Member
 
Join Date: Jul 2004
Posts: 336

The NOFOLLOW tag is a horrendously bad idea and it may eventually come back to bite these non-search companies that have implemented it in the rear if they don't allow their users to determine specifically where it is to be applied (or provide some other flexibility).

I have participated in a discussion about this tag in the VBulletin support forum. The consensus there is that the forum operators want to CONTROL where the tag is placed, and not simply for VBulletin to blast it into every link that is embedded in content.

SOME of us actually WANT those tags to be followed. Why? Because we police the bad spammers off our boards.

If the blog service companies only insert the tag into comment sections (some of them are not clear on where they are inserting it), then most people probably won't mind. However, if they just force all outbound links to use the tag, they are going to have a lot of unhappy bloggers. I am finding an increasing trend among first-time and non-technical personal page creators toward using commercial blog servers as the foundations for their homepages. They believe they will have a voice through their outbound links.

Was the system abused by bloggers in the past? Sure. But just because SOME people abused it doesn't mean that everyone should be forced to suffer. In fact, this is really about Google trying to save face over its originally stupid idea: link popularity. While link popularity can be a generally fair way to mediate ranking between otherwise equally relevant sites, it is inherently vulnerable to multiple forms of manipulation.

Blog services that provide their users with a measure of control over which links can be followed and which cannot will win out over any services that arbitrarily determine for their users that all outbound links will be tagged with NOFOLLOW.

The blog service providers need to state up front and in clear and precise language whether they are using the tag, where it will be used, and what control (if any) their (paying) users will have over the tag's implementation.

Any commercial service which implements this tag should place itself in the forefront of the ranks of the educators who teach the user community about this tag and the consequences it bears for everyone involved.

All Google has done is buy itself some breathing space before the next onslaught begins.
Old 03-01-2005   #4
DaveAtIFG
Highly experienced lurker
 
Join Date: Jul 2004
Location: Arizona
Posts: 48
Quote:
All Google has done is buy itself some breathing space
That's pretty much "business as usual" or the "history in a nutshell" between SEs and spammers, isn't it?
Old 03-01-2005   #5
Michael Martinez
Member
 
Join Date: Jul 2004
Posts: 336

Quote:
Originally Posted by DaveAtIFG
That's pretty much "business as usual" or the "history in a nutshell" between SEs and spammers, isn't it?
Yes.

Of course, with all the hoopla over the NOFOLLOW tag, the spammers have had plenty of time to change course. I would say it's a bit early for Google or anyone else to claim that they dealt the spammers a bloody nose or caught them by surprise. I can still find plenty of spam in the index.
Old 03-01-2005   #6
rustybrick
 
Join Date: Jun 2004
Location: New York, USA
Posts: 2,810
Quote:
Originally Posted by TheotherTim
Hi Barry,
I do know the behavior that we implemented for the no follow tag. My proposal was about using semantic tags which describe content which allows the search engine to prescribe the appropriate behavior for each specific tag type. This behavior may or may not be different across the search engines. We just launched support for the no follow tag last night. We have a treatment we implemented but we may decide to change this as more people adopt the use of the tag. I would prefer that content publishers use the tag in a way that they are saying that this is untrusted or public content rather than trying to initiate a specific behavior by the search engine. That is why I felt it wasn't really consistent with my proposal to describe our implemented behavior.
Thanks for covering the sessions,
Tim
Thanks Tim for clarifying.

SEW Forum members, if you have direct questions please list them below. I think I have a scheduled meeting with the Yahoo! folks tomorrow at lunch. So, I can ask them anything you want. Please let me know your questions by 10:30AM (EST).
Old 03-01-2005   #7
rustybrick
 
Join Date: Jun 2004
Location: New York, USA
Posts: 2,810
I should stress, based on the comments left in this thread, that Google said over and over again, this is one small step in the right direction. Meaning, they have more things planned.

Hey, Yahoo! hosted a summit on this topic and invited the other engines to discuss. So they are serious about fighting spam. Of course, spammers will always be one step ahead, IMO.
Old 03-02-2005   #8
projectphp
What The World, Needs Now, Is Love, Sweet Love
 
Join Date: Jun 2004
Location: Sydney, Australia
Posts: 449
Quote:
... that Google said over and over again, this is one small step in the right direction. Meaning, they have more things planned.
So does that mean Google et al will consult with the Webmaster community, and hopefully the W3C, in the future when coming up with new crawler initiatives? Or is this a one way street with the "more things planned" developed in house with little consultation?

I am a bit surprised that this session appears to be (and I wasn't there) the engines telling what their ideas are. I had the impression this was to be a chance for everyone to discuss future developments, and to start a dialogue on initiatives both parties can benefit from; a kind of robots 2.0 standard, with meta tags, robots.txt and other coding attributes thrown out there. The crawlers' ideas, while welcome, are only half of the concerns here.

Question: were any of the ideas from the other thread raised?

What I would like to see is some consensus on how all the engines will approach a new initiative, like the link nofollow attribute, in the future. IMHO, a bunch of engines implementing a bunch of proprietary, non-documented commands in a haphazard way is a recipe for disaster, and the opposite of making any of the stakeholders' lives easier or better.

The nofollow link attribute is a perfect example of this. It was hastily implemented, has no official, common definition I can find, and no official documentation anywhere, including on Google, the originator of the initiative, outside a Blog post (hardly an official source). So what does it do? Will a link not be followed? Will it just not count for PageRank (but still be followed)? Will this mean the page linked to should never be indexed unless another page links to it?

I would like to see some consensus on what exactly an SE should do with a nofollow link attribute, and see this documented. This could then be made a standard that is either followed or not followed.

Ditto the numerous non-official standards, like MSN's robots.txt crawl-delay:, the meta commands like Google and Yahoo's noarchive (still a de facto standard), and a range of old AltaVista-specific commands (noimageindex & noimageclick, remember them? Do they even still work or matter?).

I would hope that out of this session we (meaning those who provide crawlers with content) could start a dialogue with them (the crawlers) on what things we would all like to see come to fruition, and preferably some standards that all can agree upon and documented somewhere in some common format.

This session was a great start, and for the future I would ideally like to see a webmaster rep (from, as some ideas, SMA-* or SEMPO or Danny) represent the webmaster community at a pow-wow between the crawlers, webmasters and the W3C, to nut out some of these issues, and map a path for the future development of some standards and additional crawler commands.

Ideally
Old 03-02-2005   #9
DaveMcClure500hats
 
Posts: n/a
future upgrade to nofollow -- Relevance Authority concept

nofollow seems like a bit of a hack... but i guess baby steps are headed in the right direction.

i think a better long-term solution is to figure out a basic authentication / trust mechanism, along with some positive/negative scoring data and tagging, and then use this to create a Relevance Authority standard.

ideally, you'd like any online service to provide a proxy for relevance against an identity, and then enable feedback scoring mechanisms to flow from there.

to get started, a basic version of this might be provided by a hosted service like TypePad (which already has TypeKeys), and then build from there.

- dmc
Old 03-02-2005   #10
Nacho
 
Join Date: Jun 2004
Location: La Jolla, CA
Posts: 1,382
Quote:
Originally Posted by rustybrick
Q: Nacho asked, how do we authenticate your crawlers? Sometimes people spoof the crawler.
A: Tim Mayer said that you can authenticate via the IP address. Ask Jeeves agreed. Um, hire fantomaster's IP list.
Right! IPLists.com is also a great website to get search engine's ip addresses.

I was sitting next to Barbara Coll and we both looked at each other as.... uhhh? Both of us knew that this information is not available from the search engines, so she asked out loud (missing microphone) "Where do we get the IP addresses?"

Matt Cutts also responded that it would be a good idea for Google and other search engines to publish a list.

As part of this Indexing Summit, I hope it is clear that the search engines themselves should be the ones to publish a list of IP addresses, so that webmasters and site owners can truly authenticate web crawlers. We cannot and should not rely on 3rd party information.
Old 03-03-2005   #11
Mike
Member
 
Join Date: Jun 2004
Posts: 52
just want to log a vote heavily in favor of the content-nav, content-public, content-default proposed by yahoo. main reason I like it is for the content-nav div -- to flag nav as separate so it doesn't skew the keyword density of the page copy, etc.
Old 03-04-2005   #12
TheotherTim
Official Y! Rep
 
Join Date: Jun 2004
Location: Palo Alto, CA
Posts: 27
Thanks Mike. It would be good to get some more feedback on this proposal. I may try to post the proposal somewhere so people who didn't attend the conference can comment on it.
Tim
Old 03-04-2005   #13
Marcia
 
Join Date: Jun 2004
Location: Los Angeles, CA
Posts: 5,476
Quote:
to flag nav as separate so it doesn't skew the keyword density of the page copy, etc.
I thought that's what block-level analysis was for.

Quote:
I may try to post the proposal somewhere so people who didn't attend the conference can comment on it.
Tim
Tim, it could of course be posted here in our Yahoo forum altogether. Else if it's posted at the Yahoo blog you (or one of us) could start a thread with a link to it, and we can have a discussion on it here.
Old 03-04-2005   #14
Robert_Charlton
Member
 
Join Date: Jun 2004
Location: Oakland, CA
Posts: 743
Tim - I'd like to see the whole proposal too. I'd much rather respond to the whole thing, on a dedicated thread, than react to Mike's post here and pull this thread off topic.
Old 03-08-2005   #15
TheotherTim
Official Y! Rep
 
Join Date: Jun 2004
Location: Palo Alto, CA
Posts: 27
I will try to get the presentation up on the Yahoo Search blog in the next day or two.
Tim