
View Full Version : Method to Find "Related" Terms


randfish
12-17-2004, 08:07 PM
I'm wondering if there is a good method for helping to understand what a specific search engine would consider related keywords and related phrases given a particular keyword or phrase.

As humans, we can clearly come up with ways to relate subjects, etc. But I'm wondering if one could use Google, for example, to find out which phrases or words the search engine believes to be related.

For example, would it be possible to search Google for a given keyword phrase, say "african rhino", analyze the top 50 results pages for occurrences of other keyword phrases like "rhinoceros", and deduce that because "african rhino" and "rhinoceros" share all but 20 results pages, Google considers these terms related?

Could we create degrees of relatedness from this, for example:
"African Rhino" 7150 results | "African Rhino" Rhinoceros 7130 results = 99.7%
"African Rhino" 7150 results | "African Rhino" Conservation 997 results = 13.9%
"African Rhino" 7150 results | "African Rhino" Hunter 1260 results = 17.6%

Do these numbers have any meaning, or am I barking up the wrong tree?
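
The percentages above are just the combined-query count divided by the count for the base phrase. A minimal sketch of that arithmetic, using the result counts quoted in this post (the function name `overlap_ratio` is made up for illustration, and result counts reported by a search engine are only estimates):

```python
def overlap_ratio(base_count: int, combined_count: int) -> float:
    """Share of results for the base phrase that also match the extra term."""
    if base_count <= 0:
        raise ValueError("base_count must be positive")
    return combined_count / base_count


# Result counts quoted in the post above.
base = 7150  # "African Rhino"
combined = {"Rhinoceros": 7130, "Conservation": 997, "Hunter": 1260}

for term, count in combined.items():
    print(f"{term}: {overlap_ratio(base, count):.1%}")
```

Note that this ratio only looks at one direction: it says how often the extra term appears alongside "African Rhino", not how often "African Rhino" appears alongside the extra term.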

orion
12-18-2004, 12:05 AM
Hi, randfish


Related Searches

Google has a "~" (synonym) search operator and suggests the following:

"You may want to search not only for a particular keyword, but also for its synonyms. Indicate a search for both by placing the tilde sign ("~") immediately in front of the keyword."

"For example, to search for food facts as well as nutrition and cooking information, use"

~food ~facts

Note that the general query Q is of the form

Q = ~k1 ~k2 ... ~kn
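
As a small illustration of that general form, here is a sketch that builds such a query string from a list of keywords. The helper name `synonym_query` is hypothetical; it only formats the text of the query and does not contact any search engine.

```python
def synonym_query(keywords):
    """Prefix each keyword with "~" and join them: Q = ~k1 ~k2 ... ~kn."""
    return " ".join(f"~{keyword}" for keyword in keywords)


print(synonym_query(["food", "facts"]))
```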


C-Index Co-Occurrence Analysis

Let

k1 = "African Rhino" (with quotes; i.e. EXACT mode)
k2 = Rhinoceros (Q1), Conservation (Q2) or Hunter (Q3)

A c12-index analysis (in ppt, parts per thousand) for queries of the form Q = k1 + k2 follows:

For Q1, c12-index = 7.71 with a Salton Index = 0.0877
For Q2, c12-index = 0.03 with a Salton Index = 0.0019
For Q3, c12-index = 0.04 with a Salton Index = 0.0026

The terms are loosely co-occurring.
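
The post does not spell out the formulas, so the following is only a sketch under stated assumptions: that the c12-index is the Jaccard-style ratio n12 / (n1 + n2 - n12) scaled to parts per thousand, and that the Salton Index is the cosine-style ratio n12 / sqrt(n1 * n2). Here n1 and n2 are the result counts for each term alone and n12 is the count for the combined query. The value of n2 below is hypothetical, since the thread never gives the count for Rhinoceros on its own.

```python
import math


def c_index_ppt(n1: int, n2: int, n12: int) -> float:
    """Assumed c-index: Jaccard-style co-occurrence, in parts per thousand."""
    return 1000 * n12 / (n1 + n2 - n12)


def salton_index(n1: int, n2: int, n12: int) -> float:
    """Assumed Salton index: cosine-style co-occurrence ratio."""
    return n12 / math.sqrt(n1 * n2)


n1 = 7150      # "African Rhino", count from the first post
n12 = 7130     # "African Rhino" Rhinoceros, count from the first post
n2 = 925_000   # HYPOTHETICAL count for Rhinoceros alone (not given in the thread)

print(f"c12-index: {c_index_ppt(n1, n2, n12):.2f} ppt")
print(f"Salton index: {salton_index(n1, n2, n12):.4f}")
```

Unlike a one-sided percentage, both measures take the size of the second term's result set into account, which is why a very common term can overlap almost completely with a rare phrase and still score low.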

On-Topic Analysis

On-Topic Analysis of the top 50 results for "African Rhino" indicates that the corresponding on-topic data structure does not strongly favor any of the above k2 candidate terms, at least in Google.


I hope this helps.


Orion

glengara
12-19-2004, 06:13 AM
You might also try the AdWords keyword suggestion tool....

https://adwords.google.com/select/KeywordSandbox

mesadynamics
01-04-2005, 02:21 PM
For example, would it be possible to search Google for a given keyword phrase, say "african rhino", analyze the top 50 results pages for occurrences of other keyword phrases like "rhinoceros", and deduce that because "african rhino" and "rhinoceros" share all but 20 results pages, Google considers these terms related?

If you have (or have access to) a Macintosh, you can use theConcept (http://www.mesadynamics.com/theconcept.htm) to do a statistical analysis exactly as you describe: analyze the top results pages from Google and see what the most significant words and phrases are on those pages.

For example, according to theConcept, the most significant keyword pairs in the top 50 pages from Google searching "african rhino" are:

black rhino
white rhino
african rhino
rhino horn
rhino conservation
south africa
...

As the author of the program, I'm a bit biased about its usefulness for thematic keyword analysis, but we do have customers using it for exactly that.
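
For readers without a Mac, the general idea of counting word pairs across a set of pages can be sketched in a few lines. This is not theConcept's algorithm, which is not described in this thread; it is a plain frequency count of adjacent word pairs with a tiny, made-up stopword list, and the two sample "pages" are invented text rather than real Google results.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real analysis would need a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on"}


def top_word_pairs(pages, n=5):
    """Count adjacent word pairs across all pages, skipping stopwords."""
    counts = Counter()
    for text in pages:
        words = re.findall(r"[a-z]+", text.lower())
        for first, second in zip(words, words[1:]):
            if first in STOPWORDS or second in STOPWORDS:
                continue
            counts[f"{first} {second}"] += 1
    return counts.most_common(n)


# Invented sample text standing in for downloaded result pages.
pages = [
    "The black rhino and the white rhino are both found in South Africa.",
    "Rhino conservation groups protect the black rhino from rhino horn poaching.",
]

for pair, count in top_word_pairs(pages):
    print(count, pair)
```

Raw counts like these favor pairs that are common everywhere, so a statistical analysis would normally weight them against how often each pair appears in general text.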

orion
01-04-2005, 02:38 PM
Great! I only have Win XP, but I'd be happy to review an evaluation copy if one is available.


Orion

glengara
01-04-2005, 03:16 PM
Sounds interesting. Does it use the Google API?

randfish
01-04-2005, 05:41 PM
If it had a version for XP, I would definitely give it a shot.