View Full Version : Teoma's Search Technology Summed Up


rustybrick
06-02-2004, 12:52 AM
I thought I'd try to sum up Teoma's search technology with something I wrote back in December.

Teoma adds a new layer of "authority" to search results through something they call "Subject-Specific Popularity." Google's PageRank, simply explained, ranks pages based on the quality and the number of inbound links pointing to them. Teoma ranks sites based on related communities of sites that are "organically organized" and link to each other. It then determines which sites are most relevant based on an authority factor, and that is where Subject-Specific Popularity comes into play. Subject-Specific Popularity determines the authority of a page based on the number of pages within the same subject that link to it. Teoma provides a nice analogy for why this is important. They write, "picture yourself in your garage, in front of the opened hood of your severely out-of-commission pick-up truck. You need help with this major repair, and you can either ask your uncle, who owns two cars but has never held a wrench in his life and happens to be visiting (similar to using other leading search technologies) or you could phone your best friend, who has a degree in applied mechanics and builds automobiles from the ground up in his spare time (similar to Subject-Specific Popularity). The choice is quite clear."
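To make the idea concrete, here is a minimal sketch of counting only the inbound links that come from pages in the same subject. This is an illustration of the description above, not Teoma's actual algorithm: the function name, the link list and the subject labels are all made up, and it assumes every page already has a known subject label.

```python
from collections import defaultdict

def subject_specific_popularity(links, subject_of):
    """Count, for each page, inbound links from pages in the same subject.

    links      -- iterable of (source, target) pairs (hypothetical link graph)
    subject_of -- dict mapping page -> subject label (assumed to be known)
    """
    scores = defaultdict(int)
    for source, target in links:
        if source == target:
            continue  # ignore self-links
        source_subject = subject_of.get(source)
        # Only links from pages in the same subject count as a vote.
        if source_subject is not None and source_subject == subject_of.get(target):
            scores[target] += 1
    return dict(scores)

# Made-up sample data echoing the garage analogy.
links = [
    ("mechanic-blog", "engine-guide"),
    ("truck-forum", "engine-guide"),
    ("uncle-homepage", "engine-guide"),
    ("uncle-homepage", "recipes"),
]
subject_of = {
    "mechanic-blog": "auto-repair",
    "truck-forum": "auto-repair",
    "engine-guide": "auto-repair",
    "uncle-homepage": "family",
    "recipes": "cooking",
}
scores = subject_specific_popularity(links, subject_of)
print(scores)
```

In this toy data the "uncle" link to the engine guide is ignored because it comes from outside the subject, while the two links from auto-repair pages are counted.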

When Teoma 2.0 was released, it provided improved relevancy, more accurate communities, spell checking, "Dynamic Descriptions", more advanced search tools and an expanded index. Ask Jeeves reports an increased "user pick-rate" of 22% and a site abandonment decrease of 28% since the upgrade. In addition, Teoma received a relevancy grade of "A" from Search Engine Watch, adding it to the elite group of search engines that includes Google, Yahoo and MSN. By improving Teoma's analysis of "Communities" they were able to increase the relevancy of pages by better evaluating authoritative pages. In addition, the "refine" search option found on Teoma.com enables searchers to easily narrow down their search results. Many search engines have Web-based spell check; Teoma added this in its 2.0 version. Teoma 2.0 added other enhancements and features and increased its index by over 500 million URLs.

AussieWebmaster
06-05-2004, 09:25 PM
That was added to the bookmarks

AussieWebmaster
06-05-2004, 09:26 PM
Actually I think I may have to ask for reprints of the original to help some affiliates get the picture... well written mate.

seobook
06-06-2004, 10:59 AM
Mike Grehan did a cool 16-page report discussing the technologies used with Teoma.

colorful pictures and all...

http://www.searchguild.com/topic_distillation.pdf

rustybrick
06-06-2004, 11:39 AM
He has some of the best stuff. I always enjoy reading his work.

Igor
06-09-2004, 03:16 PM
I just noticed a big bug on Teoma. The short version is: Google has a system for ignoring the mirror sites that once gummed up its results; Teoma appears not to have a fix for this annoying practice. For example, search Teoma for the term "company names" (w/o quotes) and the "Resources" result column lists ten websites, seven of them being the identical mirror sites listed below:

ahundredmonkeys.com
naming-company.com
namingcompany.com
name-branding.com
companynamesbusiness.com
hundredmonkeys.com
onehundredmonkeys.com


http://www.snarkhunting.com/2004/06/teoma-company-names-corporate-identity.html
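For anyone wondering what a basic mirror check could look like, here is a rough sketch that groups domains whose page content is identical after normalizing case and whitespace. It is only one simple way to do it and says nothing about how Google or Teoma actually detect mirrors. The function names and the sample pages (on placeholder .example domains) are made up for illustration.

```python
import hashlib
from collections import defaultdict

def fingerprint(html):
    """Hash page text after collapsing whitespace and lowercasing."""
    normalized = " ".join(html.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def group_mirrors(pages):
    """Group domains whose content fingerprints match.

    pages -- dict mapping domain -> page content (made-up sample data)
    Returns a sorted list of sorted domain groups with more than one member.
    """
    groups = defaultdict(list)
    for domain, html in pages.items():
        groups[fingerprint(html)].append(domain)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

pages = {
    "mirror-a.example": "<h1>Naming Company</h1> We name things.",
    "mirror-b.example": "<h1>naming company</h1>   We name things.",
    "other.example": "<h1>Different site</h1>",
}
mirrors = group_mirrors(pages)
print(mirrors)
```

An exact hash like this only catches pages that are identical after normalization. Mirrors with small differences would need some kind of near-duplicate comparison instead.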

AussieWebmaster
06-09-2004, 03:20 PM
I will have to look at those a little... wonder if they have a robots.txt file to push the others away knowing teoma allows it...

AskJeevesRep
06-11-2004, 10:14 AM
Igor:

You mention that Teoma has a big bug w.r.t mirror sites and that "company names" (without quotes) query brings up 10 links under "Resources" in which 7 are mirror sites.

As AskJeevesRep let me try to clarify: Teoma brings up 6 links under "Resources", not 10 as you cited, for the query 'company names' (without quotes). Further, only 2 are mirror sites in there: www.naming-company.com and www.name-branding.com, not 7. The 'hundredmonkeys' sites you mention are not in the Teoma Resources links we present.

We do have mirror detection in place; further, we are constantly working towards improving our mirror detection and other algorithmic capabilities.

Thanks,
AskJeevesRep

David Wallace
06-11-2004, 02:29 PM
Welcome, AskJeevesRep!

This might be a good time to ask whether Ask Jeeves/Teoma offers a way for people to report spam, such as mirror sites in the results or domain spammers who hog the SERPs with different domains for the same company/service. If there is such a way, does AJ take an active role in investigating such occurrences, or do they rely only on automated filters and algorithms?

Terry Plank
06-11-2004, 03:24 PM
I'm interested in the reporting spam process as well. I haven't found much response for what I have reported to one search engine for what I thought was an obvious violation of their Guidelines.

Questions: Does AskJeeves/Teoma want reporting? If so, what process would AskJeeves/Teoma suggest we follow?

rustybrick
06-13-2004, 11:12 AM
Maybe these questions are not getting answered by the AskJeevesRep yet because it is slightly off the topic of the title of this thread.

"Teoma's Search Technology Summed Up"

Maybe we should start a new thread named "How Teoma/Ask Handles Spam"

Anyway, to get this back on topic, let's discuss how Teoma 2.0 uses hubs and authorities to build defined Web communities. In reality, it is Teoma's ability to discover these communities that makes them unique.

Let me just clarify my understanding, based on Mike Grehan's papers, as to what is an authority and hub in the case of search technology.

An authority, in simplistic terms, is a page that is linked to by many pages.
A hub, in simplistic terms, is a page that links to many authority pages.

By understanding the Web as groups of communities, and then defining hubs and authorities, Teoma is able to do a fairly good job returning relevant and interesting results. Not only that, Teoma allows you to drill down into a smaller subset of a specific "web community" through the "refine" option in the right column.
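The simplistic definitions of hub and authority above are commonly formalized as Kleinberg's HITS algorithm, where the two scores reinforce each other over repeated passes. Here is a small sketch of that iteration on a made-up link graph. Whether Teoma computes things this way is not confirmed here; the page names and the iteration count are assumptions for illustration.

```python
import math

def hits(links, iterations=20):
    """Iteratively score hubs and authorities on a small link graph.

    links -- list of (source, target) pairs (hypothetical data)
    """
    pages = {p for edge in links for p in edge}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for source, target in links:
            auth[target] += hub[source]
        # Hub: sum of the authority scores of the pages linked to.
        hub = {p: 0.0 for p in pages}
        for source, target in links:
            hub[source] += auth[target]
        # Normalize both score sets so they stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth

links = [
    ("directory", "site-a"),
    ("directory", "site-b"),
    ("directory", "site-c"),
    ("blog", "site-a"),
    ("blog", "site-b"),
    ("forum", "site-a"),
]
hub, auth = hits(links)
top_hub = max(hub, key=hub.get)
top_auth = max(auth, key=auth.get)
print(top_hub, top_auth)
```

In this toy graph the page that links out to the most well-linked pages ends up as the strongest hub, and the page linked to by every hub ends up as the strongest authority.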

Igor
06-13-2004, 06:36 PM
In response to the AJ rep, they took care of that specific result in the Resources results (none of the mirrors remain), but the problem exists on Ask.com in the main results as well. A search for "naming consultants" returns Namebase.com and its mirror sites as results 11 thru 16.

Dodger
06-13-2004, 07:10 PM
By understanding the Web as groups of communities, and then defining hubs and authorities, Teoma is able to do a fairly good job returning relevant and interesting results. Not only that, Teoma allows you to drill down into a smaller subset of a specific "web community" through the "refine" option in the right column.

Would this be similar to what TouchGraph (http://www.touchgraph.com/) is doing with their Google Browser (http://www.touchgraph.com/TGGoogleBrowser.html)? This will allow you to see linked communities visually.

rustybrick
06-14-2004, 12:58 AM
Would this be similar to what TouchGraph (http://www.touchgraph.com/) is doing with their Google Browser (http://www.touchgraph.com/TGGoogleBrowser.html)? This will allow you to see linked communities visually.

I took a quick look at the links you provided above; I have seen them before. It doesn't seem to work the way the "refine" option does at Teoma. This tool just seems to plot some "related" sites but not "community" sites.

But the idea is there.

Dodger
06-14-2004, 01:13 AM
Yep. I just mentioned that tool because of the "colorful pictures" in the pdf that SEObook mentioned http://www.searchguild.com/topic_distillation.pdf which reminded me of it. It seemed like the same principle -- yet probably arrived at by different means.

I am still at a loss as to what "related" really means to Google I guess. :confused:

rustybrick
06-14-2004, 08:00 AM
I am still at a loss as to what "related" really means to Google I guess. :confused:

Sounds like a good question in the Google Web Search forum. ;)

pleeker
06-16-2004, 02:05 AM
That was added to the bookmarks

No kidding, great stuff Rustybrick. Thanks for that.

Now, is there a "Bookmark this thread" function here in the Forum, or do I have to use my browser bookmarker? Hmmmm. Still looking...

littleman
06-29-2004, 02:21 PM
Except for some bumpy areas it seems to work well. One thing I'd like to see is associative words being worked into the link pop algo.

For instance, a page that talks about DC power would boost a page about batteries. Maybe that is in there, but I couldn't see any evidence of it.

FWIW, I don't believe Google has used raw PR in a while; I think it is also applying a type of "subject-specific popularity." I am sure this is the trend, with the SEs moving more and more in this direction and applying a type of theme/contextual filtering of links to counteract the growing 'link economy'.
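As a rough sketch of what theme/contextual filtering of links might look like, each inbound link could be weighted by how much the linking page's vocabulary overlaps with the target's, so a DC power page boosts a batteries page and an unrelated page does not. This is only a guess at the general idea, not any search engine's real formula. The term sets, page names and the choice of Jaccard overlap are all assumptions.

```python
def jaccard(a, b):
    """Overlap between two term sets, from 0.0 (disjoint) to 1.0 (equal)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def themed_link_score(target, links, terms):
    """Sum inbound links to target, each weighted by topical overlap."""
    return sum(
        jaccard(terms[source], terms[target])
        for source, dest in links
        if dest == target
    )

# Made-up term sets for three hypothetical pages.
terms = {
    "dc-power": {"voltage", "current", "battery", "charge"},
    "batteries": {"battery", "charge", "cell", "voltage"},
    "knitting": {"yarn", "needle", "pattern"},
}
links = [("dc-power", "batteries"), ("knitting", "batteries")]
score = themed_link_score("batteries", links, terms)
print(round(score, 2))
```

With raw link counting both inbound links would be worth the same. Here the link from the knitting page contributes nothing because the two pages share no terms.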

seobook
06-29-2004, 02:30 PM
FWIW, I don't believe Google has used raw PR in a while; I think it is also applying a type of "subject-specific popularity." I am sure this is the trend, with the SEs moving more and more in this direction and applying a type of theme/contextual filtering of links to counteract the growing 'link economy'.

I agree that Google will do some trending in that direction, but for the most part *any link goes* is currently rather effective for ranking well in Google. Link text is super important. Page theme is not yet, really.