Search Engine Watch
Old 04-13-2005   #1
rustybrick
 
 
Join Date: Jun 2004
Location: New York, USA
Posts: 2,810
rustybrick has much to be proud of
Coke vs. Pepsi Challenge for Search Engines

Have you ever participated in a taste test where you were supposed to guess whether a white cup of cola was Pepsi or Coke? The point of the test was to pull marketing and branding out of the equation and judge the raw product. Is Pepsi better than Coke in an unbiased taste test?

How about white-labeling search results from Google, Yahoo!, MSN, and Ask Jeeves and asking a group of people which results are better? They would not know which results came from which engine. All they would see are the results.

Wonder which search engine would be deemed most relevant?
rustybrick is offline  
Old 04-13-2005   #2
Qal
Clueless Newbie!
 
Join Date: Mar 2005
Location: /usr/bin
Posts: 86
Qal is on a distinguished road
It's a subjective matter, I believe. As we all know, the results depend entirely on the search term (keyword).

If you search for a corporate entity, a business, or a business-related term on MSN, you're likely to receive more relevant results than on Google, and roughly similar ones to Yahoo's.

If you're looking for detailed information on a general search term like 'warez', 'mcafee_v3.exe' or 'who moved my cheese ebook', Google is your best bet to supply all the junk, which usually isn't useful.

Yahoo, as always, delivers results that are neither clearly relevant nor irrelevant. Interestingly, I've found that Yahoo spiders newer websites much more quickly than Google or MSN.

Overall, I think it would be wise to use a metasearch engine that pulls results from these three engines. That way, the engines still get all the hits they deserve, either from you directly or via the metasearch engine you use.

Thus, there's no point in testing results the Coke/Pepsi way. Why not get the best of all of them?

But that's just my opinion. As I said, it's a subjective issue. ;]

Last edited by Qal : 04-13-2005 at 01:38 PM.
Qal is offline  
Old 04-13-2005   #3
St0n3y
The man who thinks he knows something does not yet know as he ought to know.
 
Join Date: Jun 2004
Location: Here. Right HERE.
Posts: 621
St0n3y is a name known to all
Rustybrick,

I think that is a great idea for relevance testing. You could do it a couple of ways:

1) have pre-determined search phrases that can have multiple meanings (or purposes) depending on the intent of the search, and see how each testee perceives relevance based on what they *think* they are looking for.

2) have pre-determined search phrases that can be both informational and commercial and, again, let the testee perceive relevance.

3) let the testee perform their own searches with side-by-side results (unbranded, of course) and let relevance be determined that way.

The results would have to be formatted to appear the same and not in any format that the search engines currently output (that might sway opinions, if the results looked a certain way).

It would be very interesting to see the outcome.
St0n3y is offline  
Old 04-13-2005   #4
rustybrick
 
 
Join Date: Jun 2004
Location: New York, USA
Posts: 2,810
rustybrick has much to be proud of
Of course it's subjective. But each search engine's prime goal is to make its results the most relevant for the searcher.

So why not test it somehow?

I would love to see the search engines come together and conduct this type of contest.
rustybrick is offline  
Old 04-13-2005   #5
St0n3y
The man who thinks he knows something does not yet know as he ought to know.
 
Join Date: Jun 2004
Location: Here. Right HERE.
Posts: 621
St0n3y is a name known to all
Why wait for the search engines to get involved?
St0n3y is offline  
Old 04-13-2005   #6
Qal
Clueless Newbie!
 
Join Date: Mar 2005
Location: /usr/bin
Posts: 86
Qal is on a distinguished road
Quote:
Originally Posted by rustybrick
Of course it's subjective. But each search engine's prime goal is to make its results the most relevant for the searcher.

So why not test it somehow?

I would love to see the search engines come together and conduct this type of contest.
No matter how much each engine works to increase result relevancy, no single engine would come out clearly better than the other two. After all, we're talking about three giants who are always at the top of what they do.

However, most of the time Yahoo and Google return almost the same results (links) with different descriptions, which would make it difficult for an examiner to tell a Google result from a Yahoo one; they would collide quite often. MSN is a bit different; I'm not sure how they sort results, but I like it over the other two.

But as a user, I still prefer a metasearch engine.
Qal is offline  
Old 04-13-2005   #7
rustybrick
 
 
Join Date: Jun 2004
Location: New York, USA
Posts: 2,810
rustybrick has much to be proud of
Quote:
Originally Posted by St0n3y
Why wait for the search engines to get involved?
I agree. Anyone want to develop software for this?
rustybrick is offline  
Old 04-13-2005   #8
Everyman
Member
 
Join Date: Jun 2004
Posts: 133
Everyman is a jewel in the rough
Yahoo and Google may seem like they return the same results, but the links are 80 percent different. This isn't speculation -- I kept track of the overlap between the top 100 results from Yahoo and the top 100 from Google. The average overlap was only 20 percent, and it rarely was more than 25 percent or less than 15 percent. This was from about 1,000 searches a day over nine months.
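For anyone who wants to reproduce that overlap measurement, the arithmetic is trivial. A rough Python sketch, assuming you already have the two ranked URL lists from your own scraping:

Code:
def overlap_pct(results_a, results_b):
    # share of links common to both top-N lists, as a percentage
    shared = set(results_a) & set(results_b)
    return 100.0 * len(shared) / max(len(results_a), 1)

# e.g. overlap_pct(google_top100, yahoo_top100) -> about 20.0 on my data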

The way to do this comparison test is to grab the top five to ten results from each engine, purge the duplicates (while keeping track of which links were duplicated by which engine), randomize the result set, and present ten results to the user in random order. Ask the user to rank the best two or three results based on the quality of the links. Keep score behind the scenes.
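A rough Python sketch of that blending step; the per-engine fetchers are placeholders for whatever API or scraper you use:

Code:
import random

def build_test_set(query, engines, per_engine=10, show=10):
    # engines: {"G": fetch_fn, ...} where fetch_fn(query) -> ranked URL list
    sources = {}                            # url -> engines that returned it
    for name, fetch in engines.items():
        for url in fetch(query)[:per_engine]:
            sources.setdefault(url, []).append(name)
    slate = list(sources)
    random.shuffle(slate)                   # strip away rank and branding
    return slate[:show], sources            # score engines from `sources` later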

The search engines won't touch this sort of thing. The money is in advertising and branding. They don't want to work at relevancy; all they want are some public relations buzzwords about relevancy. Besides, you can't trust the engines -- they'd rig the results.
Everyman is offline  
Old 04-14-2005   #9
Qal
Clueless Newbie!
 
Join Date: Mar 2005
Location: /usr/bin
Posts: 86
Qal is on a distinguished road
Quote:
Originally Posted by Everyman
Yahoo and Google may seem like they return the same results, but the links are 80 percent different. This isn't speculation -- I kept track of the overlap between the top 100 results from Yahoo and the top 100 from Google. The average overlap was only 20 percent, and it rarely was more than 25 percent or less than 15 percent. This was from about 1,000 searches a day over nine months.

The way to do this comparison test is to grab the top five to ten results from each engine, purge the duplicates (while keeping track of which links were duplicated by which engine), randomize the result set, and present ten results to the user in random order. Ask the user to rank the best two or three results based on the quality of the links. Keep score behind the scenes.

The search engines won't touch this sort of thing. The money is in advertising and branding. They don't want to work at relevancy; all they want are some public relations buzzwords about relevancy. Besides, you can't trust the engines -- they'd rig the results.
I believe that's what a metasearch engine does. Although there aren't many metasearch engines that provide Google, Yahoo and MSN results together, and even if they do, once their search volume increases they'll surely drop Google and Yahoo from their system.

I've been beta testing an engine since last month; they pull results from G, Y and MSN, purge dupes, and sort them intelligently, exactly like you said.

By default, they pull 10 results each from G, Y and M, 30 in total, purge dupes, which usually brings the number down to the low twenties (23 on average), and sort them intelligently. I'm clueless about their sorting method, but most of the time it displays super-relevant results that none of the three engines shows on its own, and sometimes (not usually) the best result is found at number 10 of the first page. But as I said, they're still working on it.

Sorting is the ONLY problem you'd face while building a tool like this. G, Y and M doubtless have huge indexes, but sorting the results is the key, as we all understand.

With a little more effort, they could hide or show the result source, or make it work the way we want. However, I'm still unsure whether they'd want me to disclose their URL publicly, as most things work but they're still under development.

Edited to add: the metasearch engine I'm talking about used to pull Teoma results as well, some time back. But the results were pretty outdated, probably from a six-month-old index, hence they ditched Teoma.

Ask pulls its secondary results from Teoma, so they might be able to include Ask Jeeves in this CPSE Test (Coke-Pepsi-Search-Engine Test) with a little more effort.

But the question is, WILL THEY DO IT? Well, if a senior member of this board, the thread starter (an awesome blogger), were to contact them personally, they should!

Last edited by Qal : 04-14-2005 at 12:32 AM.
Qal is offline  
Old 04-14-2005   #10
rustybrick
 
 
Join Date: Jun 2004
Location: New York, USA
Posts: 2,810
rustybrick has much to be proud of
The only way I can see Google and Yahoo participating is by allowing us to use their APIs. I am fairly confident MSN won't mind us using their RSS feed. Also, I am pretty sure Ask Jeeves is confident in its relevancy and will provide a feed, or permission to scrape the engine, for such a test.

So I guess I can build a white-labeled engine to do this. I just need the time. Possibly I can get one of my programmers started on it in two weeks.
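For an engine that exposes results as RSS, the fetcher is almost trivial. A rough Python sketch; the feed URL below is made up, since I'd have to confirm the real endpoint and terms of use first:

Code:
import urllib.parse, urllib.request
import xml.etree.ElementTree as ET

def fetch_rss_results(feed_template, query, count=10):
    # works for any engine that exposes its results as a plain RSS feed
    url = feed_template % urllib.parse.quote(query)
    with urllib.request.urlopen(url) as resp:
        tree = ET.parse(resp)
    return [item.findtext("link") for item in tree.iter("item")][:count]

# hypothetical endpoint -- substitute the engine's real feed URL
MSN_RSS = "http://search.example.com/results.aspx?format=rss&q=%s"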
rustybrick is offline  
Old 04-14-2005   #11
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
dannysullivan has much to be proud of
Quote:
But the question is, WILL THEY DO IT? Well, if a senior member of this board, the thread starter (an awesome blogger), were to contact them personally, they should!
Don't get your hopes up. My In Search Of The Relevancy Figure article from 2002 covers all the types of issues being raised here -- how do you measure, shouldn't the search engines help develop good testing and so on. If you haven't read it and are interested in this topic, honestly, I think it's a good starting place. It concludes:

Quote:
Ultimately, if the search engines fail to come up with an accepted means of measuring relevancy, they are going to continue to be measured by one-off "ego searchers" or rated anecdotally.
Again, this was 2002. Nearly three years later, none of them have jumped in. Google wouldn't want to, because I think it would show that it is nearly on par with Yahoo, either slightly higher or lower in relevancy. Yahoo might want to -- and in fact, in the guise of Inktomi, it has done so on its own before. Maybe it will again, but without the others signing on, you'd no doubt question results that seem to show it doing well. MSN might sign on down the line, but right now I think it would come in a distant last behind Google, Yahoo and Ask Jeeves.

Still, OK, I'll stay optimistic. These guys all want to attract people, and just rolling out new features without offering proof of relevancy that others can see is really lacking. I've said before that they should do it; maybe they will.

The other issue is that the focus on web search relevancy is short-sighted. At some point, we're going to hit vertical results for many, many more of our queries. So it's not just the relevancy of web search you want to measure -- what about the verticals as well?

What Is Relevancy is a past forum thread where we covered some of this stuff before. I noted the above In Search Of The Relevancy Figure in that one, but I also listed a lot of related reading that I'll repost here as well:

Quote:
AltaVista, Overture Speak Up About Perfect Page Test and one example of a very specific type of relevancy test we did, The Search Engine "Perfect Page" Test.

All those are from the end of 2002. Inktomi, Google Win In Recent Relevancy Test from 2003 looks at testing VeriTest did, and from the middle of last year, this write-up looks at testing Vividence did.

The Vividence findings, the more recent relevancy test we have out there, are also touched on in Delving Deep Inside the Searcher's Mind, Inside The Searcher's Mind - Live from SES San Jose and Others Close the Tech Gap with Google.
And here's the most recent substantial testing we've seen:

Survey: Google Still Leads, But Competitors Closing Gap

Last edited by dannysullivan : 04-14-2005 at 09:18 AM.
dannysullivan is offline  
Old 04-14-2005   #12
Qal
Clueless Newbie!
 
Join Date: Mar 2005
Location: /usr/bin
Posts: 86
Qal is on a distinguished road
Well, yes, those articles wiggle this theory a bit; however, doing it all over again as an SEW study would be fun, and perhaps more accurate, as most people here are SE techies, whereas the study Keynote conducted on 2,000 people might have included novice users who are simply amazed by Google and never bothered to pay much attention to other engines at all.

But that's just me; I'm not sure if rustybrick would want to work on it anymore.

Last edited by dannysullivan : 04-14-2005 at 09:19 AM. Reason: Fixed incorrect link mentioned in post above, thanks!
Qal is offline  
Old 04-14-2005   #13
rustybrick
 
 
Join Date: Jun 2004
Location: New York, USA
Posts: 2,810
rustybrick has much to be proud of
Quote:
Originally Posted by Qal
But thats just me, not sure if rustybrick would want work on it anymore.
I think it's still worthwhile. It would be interesting to see what SEMs think is more relevant, IMO.
rustybrick is offline  
Old 04-14-2005   #14
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
dannysullivan has much to be proud of
We could do our own test, but I can't say enough how difficult this is.

Subjectivity has already been mentioned. One reason we used to do the "Perfect Page" test was to try to simplify testing by picking queries where people would nearly universally agree that a particular page should appear. Search for "us patents," for example, and you expect the US Patent Office to be at least one of the top results. But tests like that have their own difficulties as well.

Now that MSN is finally out of the box, it's probably time for Chris, Gary and me to concoct some new testing to see where folks stand. But I know it still won't be a battery of tests with hundreds of queries that would satisfy everyone, much less ourselves. Then again, if the search engines refuse to get together on that type of testing, this is what they get stuck with -- sampling tests, ego searches, and the kind of anecdotal testing I covered in my article above as harmful.

FYI, our meta search winner for the recent SEW Awards was Jux2.com. Check it out for an easy way to see exactly what a particular search engine overlaps with -- and doesn't.
dannysullivan is offline  
Old 04-14-2005   #15
rustybrick
 
 
Join Date: Jun 2004
Location: New York, USA
Posts: 2,810
rustybrick has much to be proud of
Quote:
Originally Posted by dannysullivan
We could do our own test, but I can't say enough how difficult this is.
I am not saying it won't be difficult.

This is what I am planning:

(1) Build a white page with a search box and a "search" button.
(2) Someone enters a search query and clicks "search".
(3) Randomly fetch results from one of the four engines (G/Y/M/A) <-- or should I randomly mix results from all four and place 10 of them on the same screen?
(4) Next to each result, ask the user to rate its "relevancy" from 1 to 5 (or some scale). This is purely subjective, but I think that is fine.
(5) Collect data for all test cases.
(6) Report findings X days, weeks, or months later.

It is not too hard to do technically. And the subjectivity (or objectivity) of the answer to "which is the most relevant search engine" rests with the test subjects. I don't care if they are wrong about what is "relevant"; there is no right answer besides the answer the test subject gives.
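In code, steps (3) through (5) might look something like this rough Python sketch (the fetchers are stubs to be wired to whatever APIs or feeds we get permission for):

Code:
import csv, random

def fetch_stub(query):
    # placeholder -- replace with a real per-engine fetcher
    return ["http://example.com/result%d" % i for i in range(10)]

ENGINES = {"G": fetch_stub, "Y": fetch_stub, "M": fetch_stub, "A": fetch_stub}

def serve_test(query):
    # step 3: pick one engine at random; the user never learns which one
    name = random.choice(list(ENGINES))
    return name, ENGINES[name](query)[:10]  # keep `name` server-side only

def record_rating(engine, query, score):
    # steps 4-5: one row per rating; aggregate per engine at report time
    with open("ratings.csv", "a", newline="") as f:
        csv.writer(f).writerow([engine, query, score])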

Anyone see any issues with this?
rustybrick is offline  
Old 04-14-2005   #16
Qal
Clueless Newbie!
 
Join Date: Mar 2005
Location: /usr/bin
Posts: 86
Qal is on a distinguished road
Quote:
Subjectivity has already been mentioned. One reason we used to do the "Perfect Page" test was to try to simplify testing by picking queries where people would nearly universally agree that a particular page should appear. Search for "us patents," for example, and you expect the US Patent Office to be at least one of the top results. But tests like that have their own difficulties as well.
Yeah, although it's an arduous job, the Perfect Page Test seems more feasible.

Quote:
Then again, if the search engines refuse to get together on that type of testing, this is what they get stuck with -- sampling tests, ego searches, and the kind of anecdotal testing I covered in my article above as harmful.
So true. In this scheme, apart from sorting, this would be another issue to ponder, I presume.

Quote:
FYI, our meta search winner for the recent SEW Awards was Jux2.com. Check it out for an easy way to see exactly what a particular search engine overlaps with -- and doesn't.
Jux2 covers Google, Yahoo and Ask but excludes MSN, whereas the engine I'm talking about uses MSN primarily. I seriously believe that including MSN in this test would reveal surprising results.
---
Quote:
(3) Randomly fetch results from one of the four engines (G/Y/M/A) <-- or should I randomly mix results from all four and place 10 of them on the same screen?
But the point is, how would you deal with duplicates? For instance, if the most relevant result is a duplicate between Y and G, which engine would you mark as the winner? Both?

Thus, your method certainly won't work as-is. The Perfect Page test would be more feasible, as I said earlier.
Qal is offline  
Old 04-14-2005   #17
rustybrick
 
 
Join Date: Jun 2004
Location: New York, USA
Posts: 2,810
rustybrick has much to be proud of
Quote:
Originally Posted by Qal
But the point is, how would you deal with duplicates? For instance, if the most relevant result is a duplicate between Y and G, which engine would you mark as the winner? Both?
So I will fetch the top 10 results from all 4 engines. If the domain name matches across some or all results, I note it in the data collection and mark the ranking number associated with each engine. Then I plot the results based on the ranking of the page and the "relevancy score" given by the test users.
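A quick Python sketch of that bookkeeping, matching on the domain as described (names here are placeholders, nothing final):

Code:
from urllib.parse import urlparse

def rank_by_engine(all_results):
    # all_results: {"G": [url, ...], ...}, each list in rank order
    table = {}                                  # domain -> {engine: rank}
    for engine, urls in all_results.items():
        for rank, url in enumerate(urls, start=1):
            table.setdefault(urlparse(url).netloc, {})[engine] = rank
    return table

# a duplicate credits every engine that returned it, at its own rank, so
# user relevancy scores can later be joined against (engine, rank) pairs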
rustybrick is offline  
Old 04-14-2005   #18
DarkMatter
Master Blaster
 
Join Date: Feb 2005
Location: New Jersey,USA
Posts: 137
DarkMatter is on a distinguished road
You could just do a simple version of this without having to create any kind of new software.

Here's what I'm thinking: create a list of the most frequent types of searches that people do (local searches, information seeking, product/service-related searches, etc.), then use Wordtracker or your preferred keyword tools to pick out some of the most popular keyword searches in each category.

Once you have this data, simply do each search manually on each of the search engines and copy/paste the search results into a generic web page that gives no indication of where they came from (reformatting the results if necessary to remove clues). Add an index page that lists and organizes these pages, and a simple voting script on each of the generic SERP pages.

Not very elegant, and a lot of work, but just a thought.
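The voting script itself can be tiny. A throwaway Python CGI sketch, with arbitrary field names:

Code:
#!/usr/bin/env python
# minimal vote collector for the hand-built, unbranded SERP pages
import cgi, csv

form = cgi.FieldStorage()
page = form.getvalue("page", "")    # which anonymized SERP was rated
score = form.getvalue("score", "")  # the 1-5 relevancy vote

with open("votes.csv", "a", newline="") as f:
    csv.writer(f).writerow([page, score])

print("Content-Type: text/html\n")
print("<p>Thanks, your vote was recorded.</p>")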
DarkMatter is offline  
Old 04-14-2005   #19
St0n3y
The man who thinks he knows something does not yet know as he ought to know.
 
Join Date: Jun 2004
Location: Here. Right HERE.
Posts: 621
St0n3y is a name known to all
Quote:
However, most of the time Yahoo and Google return almost the same results (links) with different descriptions, which would make it difficult for an examiner to tell a Google result from a Yahoo one,
This is a good point, but one that goes to the heart not just of the relevancy of results, but of the appearance of relevance. Let's say Google, Yahoo and MSN all show the same site in the number one position. Each SERP, however, displays the listing differently. It would be interesting to see whether users judge one engine more relevant simply because of how the listing is presented in the search results, even though the site is the same.

Quote:
This is what I am planning:

(1) Build a white page with a search box and a "search" button.
(2) Someone enters a search query and clicks "search".
(3) Randomly fetch results from one of the four engines (G/Y/M/A) <-- or should I randomly mix results from all four and place 10 of them on the same screen?
(4) Next to each result, ask the user to rate its "relevancy" from 1 to 5 (or some scale). This is purely subjective, but I think that is fine.
(5) Collect data for all test cases.
(6) Report findings X days, weeks, or months later.
Here is what I envisioned:

(1) Build a white page with a search box and a "search" button.
(2) Someone enters a search query and clicks "search".
(3) The results page shows the #1 listing from each of the top four engines (side by side across the top of the page, without any engine branding).
(4) Below the results, ask the user to rate the "relevancy" of EACH result from 1 to 5.
(5) Submit the scores and repeat the process with the #2 listing from each engine, and so on.
(6) Collect data for all test cases.
(7) Report findings X days, weeks, or months later.

Reporting could show:

a) which engine received the highest relevance score overall
b) which engine received the highest relevance score for individual search positions.

Example: say Google was considered most relevant overall, but MSN was judged to have the most relevant #1 listing, and so on.

The lower-tech version by DarkMatter would also work: simply build the test with pre-determined searches, pull the results manually, and place them into the "test". Let the user then rate the results accordingly.
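The reporting then falls straight out of the collected data. A rough Python sketch of both (a) and (b), assuming the test pages log one CSV row per rating as (engine, position, score):

Code:
import csv
from collections import defaultdict

def report(path="ratings.csv"):
    totals = defaultdict(list)                  # engine -> all scores
    by_pos = defaultdict(list)                  # (engine, position) -> scores
    with open(path, newline="") as f:
        for engine, pos, score in csv.reader(f):
            totals[engine].append(int(score))
            by_pos[(engine, int(pos))].append(int(score))
    avg = lambda xs: sum(xs) / len(xs)
    overall = {e: avg(s) for e, s in totals.items()}        # report (a)
    per_pos = {k: avg(s) for k, s in by_pos.items()}        # report (b)
    return overall, per_pos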
St0n3y is offline  
Old 04-14-2005   #20
rustybrick
 
 
Join Date: Jun 2004
Location: New York, USA
Posts: 2,810
rustybrick has much to be proud of
Any other ideas? All good ideas so far...

I just want them all fleshed out before I begin working on anything.
rustybrick is offline  