Competitiveness of a Search Term
Dave Hawley
12-07-2004, 06:02 AM
I constantly see people referring to Google's "Results 1 - 10 of about x", where x is Google's estimate for the search term, as an indication of how competitive a term is.
IMO, this is totally flawed as not only are Google's estimates way off, it also only tells us how popular a search term is, not how competitive it is.
I'm curious how others judge the competitiveness of a search term?
glengara
12-07-2004, 06:16 AM
Keyphrase in " ", AdWords, Overture...
DaveN
12-07-2004, 06:40 AM
I think "ART" and "microsoft" are pretty hard keywords to crack...
DaveN
I agree that the number of search results only gives a very vague popularity figure, and really not much about the competitiveness of the search term.
My take on it is that if I am competing for a first page spot there are only ten other competitors, those currently on the top page for that term, one of whom I will have to knock off to get onto that page.
The only information available as to how hard it is going to be to gain that position can IMO only be found by researching the top pages.
seobook
12-07-2004, 08:42 AM
The only information available as to how hard it is going to be to gain that position can IMO only be found by researching the top pages.
Pretty much just research the anchor text and unique C class IP addresses of backlinks (using tools or an engine that shows more details than what Google is showing).
powerofeyes
12-07-2004, 10:20 AM
The way we track competitiveness:
1. Researching with AdWords, Overture and Wordtracker and comparing the results.
2. Checking the log files of the ranking sites to see which terms and keyword combinations are most popular, and researching based on that.
3. Checking the top pages ranking for that term: where they originate and how they rank for that query.
4. Using advanced searches like allintitle, intitle, allinanchor, allintext etc. to find the sites ranking for that term, and researching based on that.
randfish
12-07-2004, 12:55 PM
http://socengine.com/seo/tools/keyword-difficulty-tool.html
This tool I made measures:
- Top 3 Bids @ Overture
- # of Times Searched Last Month
- Strength of top 10 Competitors' Site PR
- Strength of top 10 Competitors' Page PR
- Strength of top 10 Competitors' Backlinks
- Strength of top 10 Competitors' Size
- # of words in phrase
- # of Search Results @ Google
- # of Results @ Google in Quotes
It then uses a scoring system to come up with a percentage answer (0% - easiest to 100% - most difficult)
Some examples:
new bmw 68.27%
mesothelioma 82.32%
university degree online 68.15%
barber shop seattle washington 38.64%
The scoring system isn't perfect yet, but maybe you can chip in and test - let me know what you think.
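A weighted-sum scoring system of the kind described above can be sketched roughly as follows. This is only an illustration: the weights, the signal names, and the assumption that every input has already been scaled to 0.0-1.0 (with 1.0 meaning "hardest") are all hypothetical and are not taken from the tool itself.

```python
# Hypothetical weights for the nine inputs listed in the post.
# These numbers are invented for illustration; the real tool's weights are unknown.
WEIGHTS = {
    "overture_top_bids": 0.10,
    "monthly_searches": 0.10,
    "competitor_site_pr": 0.15,
    "competitor_page_pr": 0.15,
    "competitor_backlinks": 0.20,
    "competitor_size": 0.10,
    "words_in_phrase": 0.05,
    "google_results": 0.05,
    "google_results_quoted": 0.10,
}


def difficulty_percent(signals):
    """Combine pre-scaled signals (0.0 = easiest, 1.0 = hardest) into 0-100%."""
    total_weight = sum(WEIGHTS.values())
    score = 0.0
    for name, weight in WEIGHTS.items():
        value = signals.get(name, 0.0)       # a missing signal counts as 0
        value = min(max(value, 0.0), 1.0)    # clamp out-of-range inputs
        score += weight * value
    return round(100.0 * score / total_weight, 2)
```

Dividing by the total weight keeps the result on a 0-100 scale even if the weights are later rebalanced, which is the part of the scoring the post asks for help with.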
iamrussell
12-07-2004, 01:41 PM
That's a great tool!
One thing I noticed is that "Times Searched Last Month" seems to be grabbing the number from Overture's keyword suggestion tool (which is what I use for ranking a keyword). You should double that number because the Overture network only gets roughly 50% of all searches.
KenEvoy
12-07-2004, 01:43 PM
Hi Dave,
Regarding your question...
I constantly see people referring to Google's "Results 1 - 10 of about x", where x is Google's estimate for the search term, as an indication of how competitive a term is.
IMO, this is totally flawed as not only are Google's estimates way off, it also only tells us how popular a search term is, not how competitive it is.
I'm curious how others judge the competitiveness of a search term?
I love lurking these forums and have been a huge fan of Danny and SEW since way back when.
As far as I know, we (SiteSell.com) were the very first company to introduce this concept, before any of the companies that now make this their business, and of whom WordTracker does it extremely well. WordTracker is the only company in this category in our short list of recommended resources to our Site Build It! customers, to supplement what our brainstorming part of Site Build it! does, because WT adds tremendous EXTRA value -- this is their bread and butter and they do it well.
The concept is this...
1) DEMAND is how often SURFERS search for a keyword. Surfers are the marketer's "PRE-customer," if you will. Therefore, if Web surfers search on "Caribbean" 100,000 times and only search on "Anguilla" 10,000 times, there is 10 times more DEMAND for "Caribbean" than for "Anguilla."
That is an important point for anyone planning to market about the Caribbean or Anguilla (or both) to know. Data like this already starts to shape, for example, the outline for a new site. Of course...
You also have to measure that AGAINST how hard it is to "rank" for each word. And that is where SUPPLY comes in. The more Web pages that exist about a certain keyword/phrase, the harder it will be to rank highly... more "competition." WordTracker calls it "Competition" and we call it "SUPPLY" in our courses and in Site Build It!, but it amounts to the same concept -- how hard it's going to be to score due to the number of pages that exist that already contain that keyword.
2) SUPPLY is, therefore, how many Web pages create content for a specific term. It's best to do that estimate with quotes around a multi-word keyword phrase because THAT is the single best way to estimate how many pages actually focus on that term.
For example, Google estimates that there are 255,000 pages about /anguilla beaches/ (/ = without quotes), but only 2,070 pages about "anguilla beaches" (WITH quotes). The latter number is a more realistic estimate since the former includes all kinds of pages that are far less targeted.
Dave, you're right when you say that the SUPPLY number is also a measure of how popular it is. It is a DIRECT indicator of how popular it is among Web site PUBLISHERS of content. But it is only an INDIRECT implication of how popular it MAY be amongst CONSUMERS of content (i.e., Web surfers).
And in fact, the two are not always perfectly aligned. There is nothing nicer than finding a great niche with lowish SUPPLY and solid DEMAND (yes, they still exist).
You can use this technique to find very profitable niches. And that brings me to the concept of "profitability."
3) PROFITABILITY
To oversimplify, the ratio of DEMAND to SUPPLY is a rough indicator of PROFITABILITY.
(We use a more complicated formula that weights demand and supply and adjusts the true profitability IN RELATION TO ALL OTHER KEYWORDS in a collection of keywords, but that's a wrinkle that is not important to the big picture of this discussion. I just mention it in case anyone says the simple ratio is overly simplistic -- it is -- I am only trying to address the central issue, which is Dave's doubts about what Google's SUPPLY data is worth to us as marketers, and how it fits into the bigger picture).
Dave, regarding your comment about "Google's estimates being way off," it does not really matter. What DOES matter is that their estimates are accurate, RELATIVE to all the other keywords that you are querying at Google. Once you start with an engine for a given set of keywords, stay with that engine.
You CAN get DEMAND from one source (ex., WordTracker via the feed it pulls from its sources) and SUPPLY from another source (ex., Google). As long as you do that consistently, the PROFITABILITY ratio of all words, relative to each other, will be fairly accurate. The ABSOLUTE numbers (as long as each is in a reasonable ballpark) are not as important as the ratios, especially in comparison to all the other keywords in a related set. So as long as your sources for DEMAND and then for SUPPLY are consistent, your PROFITABILITY results will be valid.
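The simple DEMAND/SUPPLY ratio, and the point that only the relative ordering matters, can be illustrated with a minimal sketch. The function name and the sample figures below are made up for illustration; this is the plain ratio, not the weighted formula mentioned earlier in this post.

```python
def rank_by_profitability(keywords):
    """Rank phrases by DEMAND / SUPPLY, best ratio first.

    keywords maps phrase -> (demand, supply). Phrases with zero supply are
    left out so they can be reviewed by hand instead of dividing by zero.
    """
    ratios = {}
    for phrase, (demand, supply) in keywords.items():
        if supply <= 0:
            continue
        ratios[phrase] = demand / supply
    return sorted(ratios, key=ratios.get, reverse=True)


# Invented sample figures, for illustration only.
sample = {
    "phrase a": (1000, 100),   # ratio 10
    "phrase b": (500, 10),     # ratio 50
    "phrase c": (200, 0),      # zero supply: skipped for manual review
}

ranking = rank_by_profitability(sample)

# If one SUPPLY source reports every figure three times higher,
# the ordering of the keywords does not change.
inflated = {p: (d, s * 3) for p, (d, s) in sample.items()}
same_order = rank_by_profitability(inflated) == ranking
```

Skipping zero-supply phrases rather than scoring them reflects the caution later in this post about strange words with high DEMAND and near-zero SUPPLY.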
In case you want to see something really neat, we've just added a tremendous free search to our free e-commerce search tool, Search It!. You can enter up to 1,000 keywords at one time and receive SUPPLY data back in a nice tabular format for all 1,000. It uses the Google API, so you'll need your own key, but we explain how to get that. Just go to searchit.sitesell.com, click to the Search tool itself after reading the online help and then...
1) Pick "Competition" in the "KEYWORDS" section of STEP 1.
2) Then select "Google Multiple Keyword SUPPLY" in STEP 2. Read the online help for details of how to enter your 1,000 words to get your SUPPLY data in one clean sweep into STEP 3.
Simply follow the online instructions and you'll have a single table with Google SUPPLY data for all your keywords, with ONE search.
Search It! itself is very slick, but this is my single favorite search -- it removes a ton of drudgery.
The Google API uses a different database and/or set of configurations -- so its supply data differs from what the actual "live" search results page delivers for each keyword. But that's OK... check it out and you will see that the supply numbers are quite close RELATIVE to all the other keywords in your related set of words. Since all your data is from the same source, your SUPPLY denominators are valid relative to each other and the RELATIVE PROFITABILITIES therefore are also valid.
So, as long as your DEMAND data is coming from one source (Overture, Wordtracker, etc.), you can build a fairly reliable set of keywords with SUPPLY, DEMAND and therefore PROFITABILITY.
The most important point? After you gather all this data for 200 keywords, say, you have to judge it all, using a human brain. Let me repeat that because I see too many people become slaves to the numbers. You have to use your human judgment to make the final decisions. If you don't...
There is definitely weirdness in some of the results you will pull back, likely generated by algorithms hitting your DEMAND resources for example, boosting them artificially. Some pretty strange words can appear with relatively high DEMAND and near-zero SUPPLY. You have to recognize that is not a REAL situation.
Another example -- If you use Overture's DEMAND tool (Keyword suggestion tool), you have to reorder some of their bizarre word orders within a multi-word keyword-phrase to make some sense out of the results, and base your SUPPLY research on that. You'll have to account for the fact that they don't pluralize (at least not most of the time). I'm wandering off a bit, now, so I'll bring it to a halt because you were seeking a "bigger picture answer."
Bottom line? As long as you apply the human filter at the end of this process, the SUPPLY-DEMAND-PROFITABILITY process is an excellent way to identify profitable niches about which to create content that ultimately turns into traffic and income.
Hope that helps.
All the best,
Ken Evoy
MOD EDIT: please, no 'sig files' per TOS (http://forums.searchenginewatch.com/faq.php) , see info on custom user titles.
randfish
12-07-2004, 02:40 PM
One thing I noticed is that "Times Searched Last Month" seems to be grabbing the number from Overture's keyword suggestion tool (which is what I use for ranking a keyword). You should double that number because the Overture network only gets roughly 50% of all searches.
Russell - I understand your point, but the tool's purpose is more about showing the relative difficulty rather than the actual number of searches. For this purpose, doubling the number would simply make me cut the scores in half for the percentage measurement - not particularly valuable.
I'm glad you like the tool - please let me know if any of you have other suggestions - especially in the area of percentage balancing of the scoring.
Marcia
12-07-2004, 02:41 PM
Hey, big welcome to SEW Forums, Ken!
Supply, demand and profitability aside, one of the key things I look at, that I consider very important and to a great extent relies on personal evaluation within any market, is feasibility.
Three of the factors related to whether a search term is feasible to pursue:
1. Timing
When a site is brand new, it's generally simpler and quicker to rank for lesser terms until more content and/or inbound links are developed, and/or the site's optimization is further refined. What's reasonably competitive for a brand new site may be far less than what its potential may be as it ages.
2. Who's on First?
Sometimes it isn't how much demand or supply, but who the competition is. That can only be discerned by examining the SERPs and evaluating what techniques are being used to rank. There can be a search term with high demand and little numerical competition, but there are very aggressive people operating in that space.
3. Pick your Poison
This relates to how far a person is willing to go and what methodologies they're willing to employ. Realistically speaking, garden-variety SEO is more than adequate for certain levels or markets, but for others it won't necessarily make the grade. This is where individual value judgment and personal choices need to come into play - is there a taste for sweet dill pickles, so to speak, or will hot jalapeños be tasty? That will determine how competitive something is for that person or company in particular.
What's competitive for one may not be considered so for another. Unless someone is willing and capable of becoming fairly aggressive, it's wise to be realistic, regardless of what the unusual indicators show, and assess competitiveness based on more than just empirical evidence and numerical factors.
KenEvoy
12-07-2004, 05:13 PM
Yes, excellent points, Marcia.
Your "timing" point is superb -- it's a great way to get the snowball rolling down the hill. It's exactly what we explain to our users. The really big/tough keywords are the last words that you'll rank highly for, not the first. Start at the fringes.
One thing... I doubt if we'd see eye to eye on SEO. At Site Build It!, we choose to leapfrog "advanced, nth degree SEO" and head straight towards reality. By giving the SEs enough on-page hooks to sink their teeth into, and by otherwise PREselling with excellent content and getting some key inbound links, human visitor behavior takes over and does indeed build the off-page criteria naturally and organically.
Overall, we show our users how to "engineer success" through superb content that hits the basic on-page criteria and that WOWS the human visitors. Ultimately, it is those delighted human visitors who generate all the off-page criteria (with a little help from the marketer of course, who must start the ball rolling by securing a few good inbound links). All of this to (hopefully) lead into a little controversy...
I find that heavy SEO emphasis is a little like chasing the Holy Grail. The engines get steadily more and more sophisticated at reaching the ultimate goal, which is simply to recognize reality the way humans do. SEOs have to chase this increasing sophistication constantly. Instead...
We choose to leapfrog the algorithm-chase and head straight to reality. And, all in all it's worked darn well for our tens of thousands of small business users. For example...
My own daughter started her anguilla-beaches.com site by eating away at the edges. She started when she was 14. She didn't rank anywhere in the Top 500 for her toughest word, anguilla. As she built more and more content and got more and more links in from Caribbean and Anguillian sites, and as people loved her site more and more, they naturally delivered the off-page criteria (we can only imagine what Google must track -- I doubt if more than 3 people have the COMPLETE picture, and their NDA is probably tighter than Coke's!).
The effect is like a boat with the tide coming in. Now she puts up a page about "anguilla wedding," is spidered and ranking in the top 10 within days. It's a slow, steady, tortoise-like process... but the tortoise wins in the end (she has averaged about an hour per week over a period of 2 years). And the only "work" I've done with Nori is on how to write more effectively for the human reader ("PREselling"), not SEO or anything like that.
At the opposite end of the spectrum, "geeks" like Marc Liron do the same thing, but in a bigger way as adults with more time and devotion and generate 15,000 pages per day at his updatexp.com site. Ask him what kind of advanced SEO he does and he'll just chuckle -- he abandoned those worries long ago.
Now, I know I'll get banged by a lot of SEO experts who love chasing the zillion variables down to the 4th decimal point or, worse, who still love "fooling" the engines. But you're over-engineering and doomed to chase the engines instead of delivering what humans want... and THAT actually is what the engines want, too.
We just like to keep it real.
But I digress (although that should stir things up a bit ;-) ). I hope we've answered Dave's question about how SUPPLY *and* DEMAND are quite useful parameters. Marcia added some excellent points, and taken all together I think it points the user in some solid directions.
Ken Evoy
P.S. to Elisabeth --- sorry for putting my URL after my name. It was relevant in the context of our Search It! tool. I must admit I didn't read the TOS and never dreamed that one would not be allowed to put a URL WITHOUT any promotional slogan or anything else, after one's name. But I won't do it again. My apologies.
Elisabeth
12-07-2004, 05:22 PM
P.S. to Elisabeth --- sorry for putting my URL after my name. It was relevant in the context of our Search It! tool. I must admit I didn't read the TOS and never dreamed that one would not be allowed to put a URL WITHOUT any promotional slogan or anything else, after one's name. But I won't do it again. My apologies.
no problem - sorry i posted it publicly, but you had PM's turned off. It was relevant in context, so I have no problem with that or the other examples you just cited, it's just that we've opted to use Custom User Titles instead of linked sigs, so just needed to be consistent.
welcome to SEW, ken! excellent post, too btw.
I, Brian
12-07-2004, 06:28 PM
I constantly see people referring to Google's "Results 1 - 10 of about x", where x is Google's estimate for the search term, as an indication of how competitive a term is.
IMO, this is totally flawed as not only are Google's estimates way off, it also only tells us how popular a search term is, not how competitive it is.
Absolutely right - and one of those silly mis-conceptions in SEOs.
You can struggle to get a no.1 ranking for a specialist niche phrase, out of just a few hundred thousand results, if mostly up against universities - but walk into a top spot where the search term involves a few tens of millions of pages.
In fact, just for the hell of it, I'm currently testing popularity vs competitiveness with a search term with around 250 million pages returned - just to see for myself how badly popularity factors into it compared to competition.
The popularity of a search term is pretty well suggested by the number of results returned, but working out the actual competitiveness is something of a task - KEI, traffic logs, and SERPs are required, along with some live testing.
St0n3y
12-07-2004, 07:35 PM
When determining competitiveness we look primarily at the sites already occupying the top positions for the keywords targeted. No one needs to know how many turtles are in the race, could be one or a hundred, it doesn't matter if you know you are faster. You want to know how many Hares you are competing against. Research the Hares and their weaknesses, then you can have a good idea of where your performance will be.
randfish
12-07-2004, 08:04 PM
Brian & Stoney -
Good points. That's why the tool (http://socengine.com/seo/tools/keyword-difficulty-tool.html) measures the top 10 competition for any given phrase and makes it a big part of the competitiveness score. The only thing I can't currently measure is traffic logs, but the results are fairly revealing still. I could definitely use your help refining the percentage of the total score for each input.
P.S. Ken - Great example about your daughter and the site - it's amazing what a little content every day and a lot of time can do.
I'm curious how others judge the competitiveness of a search term?
I wish I could take credit for this method, but it really belongs to Dan Thies and his team:
1) Number of searches on all engines over past 60 days
2) % Relevance of search term to your site (i.e. 50% relevant, 90% relevant)
3) Estimated number of daily searches in GG, MSN and YH based on market share
4) Estimated number of click-thrus based on top 10 position in GG, MSN and YH
5) Number of competing sites for search term
6) Number of competing sites for search term with optimized titles
7) Number of competing sites with b/w links utilizing anchor text optimized for search term
8) Link popularity of competing sites
9) Current bids for search term on AdWords and Overture
AussieWebmaster
12-07-2004, 09:12 PM
I wish I could take credit for this method, but it really belongs to Dan Thies and his team:
1) Number of searches on all engines over past 60 days
2) % Relevance of search term to your site (i.e. 50% relevant, 90% relevant)
3) Estimated number of daily searches in GG, MSN and YH based on market share
4) Estimated number of click-thrus based on top 10 position in GG, MSN and YH
5) Number of competing sites for search term
6) Number of competing sites for search term with optimized titles
7) Number of competing sites with b/w links utilizing anchor text optimized for search term
8) Link popularity of competing sites
9) Current bids for search term on AdWords and Overture
This is pretty close to covering all the bases. I would also factor in the number of ads that have pretty much the same copy - though it can reflect a large group too lazy to research... in most cases it is a large number all testing and finding the same answer on what gets the best CTR.
Dave Hawley
12-07-2004, 09:39 PM
WOW! Thanks all, this is some great reading all-around.
Just one point. There is a lot of talk about how to find out how popular a search term is with searchers. For the discussion, I was assuming that this had been decided already, that is, which term to use is already known.
Normally, as others have said, I look at page 1 of the Google results for the search term I'm going to target and start pulling apart the top 10. IMO, these are the only ones I have to beat.
Normally, as others have said, I look at page 1 of the Google results for the search term I'm going to target and start pulling apart the top 10. IMO, these are the only ones I have to beat.
I'd be spreading the love if I were you. IMO, to rely on results from a single engine is short-sighted and too risky if you have clients paying you for ROI.
randfish
12-07-2004, 10:52 PM
Dave - Agreed.
This tool is not for helping to select a term to target so much as discovering the 'difficulty' in ranking in the top 10 for the search term. As to the list above - it's funny you should mention it and Dan Thies, as he was just commenting on this tool over at HighRankings (he did not like it at all :( ).
I will be adding a suggestion of his - the number of allintitle results, as this directly pertains to how many pages are targeting the term directly.
Thanks for all your feedback and please contribute more ideas if you have them - I don't claim to be perfect by any means and your help will certainly influence the quality of the tool for the future.
Dave Hawley
12-07-2004, 11:14 PM
I'd be spreading the love if I were you. IMO, to rely on results from a single engine is short-sighted and too risky if you have clients paying you for ROI.
Don't get me wrong. I also do the same for Yahoo and MSN, but as Google is the most popular, sends most traffic and this is a Google forum :) I tend to speak of only Google. BTW, I don't do SEO for $$, I help out others for free but mostly I help myself :)
On the point of Yahoo and MSN, I'm in their twilight zone at present. We used to have 2 domains (one parked on the other); we dropped one, and Yahoo and MSN have not, as yet, gotten the one we kept. Although, MSN Beta has indexed most of our pages (wish they were live right now :( ). Yahoo, on the other hand, is simply not playing the game; we haven't seen it in our logs for over a year, despite the fact we have over 1000 links from other sites pointing at us. Go figure :confused:
I will be adding a suggestion of his - the number of allintitle results, as this directly pertains to how many pages are targeting the term directly.
Certainly should be included; however, keep in mind that there are many sites (for extremely competitive terms/words) that may not even use the term/word in the Title. For example, check out the top 2 Google results for Computers.
Although there's lots of good advice, I really still think the best way is to pull apart the top 10 for the specific term/word.
AussieWebmaster
12-07-2004, 11:48 PM
Although there's lots of good advice, I really still think the best way is to pull apart the top 10 for the specific term/word.
That may help you find certain elements of what they are doing to get there, but there are more factors that they have no influence on such as the other sites on the other pages and even in the PPC areas.
There are terms with 20 people fiercely competing and others where there are hundreds...
DanThies
12-07-2004, 11:59 PM
randfish,
It's not that I don't like it, it's that I'm not sure if you're using the right metrics, and in general, I don't see how you can come up with one number to rule your strategy.
Here are a few links on the subject...
Discussion of keyword metrics on our site:
http://www.seoresearchlabs.com/keyword-metrics.php
http://www.seoresearchlabs.com/keyword-reports.php
Pinned topic on keyword competition at HR forums:
http://www.highrankings.com/forum/index.php?showtopic=3216
The main thing, in terms of 'organic' competition, is that you want to look at who is actually optimizing. Looking at intitle: and inanchor: combined (and the number of results returned for that combined search) will give you something like an upper bound on the number of pages seriously competing.
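As a rough illustration of the combined intitle: and inanchor: search described above, the query string could be assembled like this. The helper name is hypothetical, and quoting multi-word terms as a phrase is an assumption about how one might enter them; the sketch only builds the text of the query and does not contact any search engine.

```python
def optimizer_query(term):
    """Build a combined intitle: + inanchor: query string for a search term.

    Multi-word terms are wrapped in quotes (an assumption made for this
    sketch) so that both operators apply to the whole phrase.
    """
    words = term.split()
    phrase = " ".join(words)          # collapses any repeated whitespace
    if len(words) > 1:
        phrase = '"' + phrase + '"'
    return "intitle:" + phrase + " inanchor:" + phrase


single_word = optimizer_query("hosting")
multi_word = optimizer_query("web hosting")
```

The result count reported for such a query would then serve as the rough upper bound on serious competitors mentioned above.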
The approach I recommend is to use relevance (as Kalena mentioned) to develop a "weighted popularity" for the candidate search terms. This gives you an idea of which search terms will best reach your target audience.
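One plausible reading of "weighted popularity" is search volume multiplied by a relevance percentage. The sketch below assumes exactly that; the function name and the sample numbers are invented for illustration, and the actual formula used in the keyword reports may differ.

```python
def weighted_popularity(candidates):
    """Return (phrase, searches * relevance) pairs, highest value first.

    candidates maps phrase -> (searches, relevance), where relevance is a
    fraction between 0.0 and 1.0 (for example 0.9 for "90% relevant").
    """
    scored = [
        (phrase, searches * relevance)
        for phrase, (searches, relevance) in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented figures: a popular but loosely relevant term can end up
# below a less popular term that matches the target audience closely.
example = weighted_popularity({
    "generic term": (10000, 0.25),    # 2500.0
    "targeted term": (4000, 0.75),    # 3000.0
})
```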
Once you know what the targets are, there are 6 tiers of strategy:
SEO - Easy: search terms that can be targeted with on-page content alone
SEO - Medium: search terms that can be targeted with on-page content and internal anchor text
SEO - Hard: search terms that must be targeted with on-page content, internal links, and some portion of the site's external profile (inbound link text)
PPC - some terms will be targeted with PPC - if it's profitable you do it.
Content required - some search terms may appear "too competitive" or "too generic" for SEO, but visitors are searching for these things, and you must have content to satisfy visitors. May as well optimize it while you're there. :D
Modifiers - keywords that frequently appear with a search term. Added to your content, they multiply the traffic you will get from your SEO efforts. For example, words like "PHP, MySQL, bandwidth, transfer, 100MB, plesk, cpanel" that will be used by searchers alongside major search terms like "web hosting."
Dave Hawley
12-08-2004, 12:01 AM
That may help you find certain elements of what they are doing to get there, but there are more factors that they have no influence on such as the other sites on the other pages and even in the PPC areas
Can you elaborate on that somewhat? Why does it matter if the site has no influence on any element that has contributed toward their SERP position? Also, what has PPC to do with organic SERP position?
DanThies
12-08-2004, 12:13 AM
I agree with you, Dave, that ultimately you have to look at the top ranking pages, and how they got there. But there's more than one way to the top, and what other sites have done does not necessarily dictate what you'll do. It does give you a very good idea of what the competition looks like though.
Dave Hawley
12-08-2004, 12:21 AM
But there's more than one way to the top, and what other sites have done does not necessarily dictate what you'll do
That is very true.
Perhaps the best/simplest way is to assume whatever term chosen is ultra competitive and put in place all the usual suspects (links, anchor text, on page stuff etc) wait, then see what happens.
DanThies
12-08-2004, 12:27 AM
Perhaps the best/simplest way is to assume whatever term chosen is ultra competitive and put in place all the usual suspects (links, anchor text, on page stuff etc) wait, then see what happens.
That sounds like a very expensive approach! Search engine marketing, especially when it comes to SEO strategy, is about managing resources effectively. I certainly wouldn't want to shoot all of my "external profile" bullets at random.
AussieWebmaster
12-08-2004, 01:05 AM
Quote:
That may help you find certain elements of what they are doing to get there, but there are more factors that they have no influence on such as the other sites on the other pages and even in the PPC areas
Can you elaborate on that somewhat? Why does it matter if the site has no influence on any element that has contributed toward their SERP position? Also, what has PPC to do with organic SERP position?
The topic is competitiveness of a search term....
The fact that people strive for a position and get there does not in itself mean that it is competitive... even those that have 10 million pages, or the reverse the term that has 2000 pages can be more competitive...
If you are looking for competitive terms the number of people also advertising in PPC for the term would suggest the competition is at a premium...
there are many elements and the suggestions of metrics to determine them have been outlined above... you need to have a lot of factors to look at to make the determination.
randfish
12-08-2004, 02:38 AM
Dan and everyone -
I appreciate very much your help and suggestions. I'd like to make the tool useful above all else, as I will personally be using it, as well as sharing it with the SEO community.
I will add the allintitle results as a factor and try out the results for intitle + inanchor as well. I can't see any part of the equation that you're currently against having in there though... Maybe it's just the percentage of each component in the final score.
Dan, I know that one score shouldn't 'rule' a user's choices, but a tool like this can certainly provide guidance, just as the Overture & Wordtracker tools do. I certainly cannot measure how pertinent or conversion prone a keyword phrase is in tool format, but I can extract as much information as possible to give users a good reference point and something to compare against. Again, the idea isn't to produce a single result, it's to have a relative measurement.
I'll look through the resources you listed, but I did read the forum thread at HighRankings and came away thinking you would approve of the tool (guess I need to read more carefully :D )
In any case, your help and criticism is more than welcome. Please do not take my defense of the tool as anything but an attempt to gain greater understanding and provide better value.
Dave Hawley
12-08-2004, 02:41 AM
Dan, only if you assume I'm doing this for clients on a paid basis. I just do it for my site, or one page of it.
The fact that people strive for a position and get there does not in itself mean that it is competitive...
No, of course not. I'm not sure what that is relevant to though.
even those that have 10 million pages, or the reverse the term that has 2000 pages can be more competitive.
Yes, this is what I said in my opening post.
you need to have a lot of factors to look at to make the determination.
Yes.
Still if you can outdo the efforts of those who have top ten rankings (if they have 99 relevant anchor text links, you get 120 relevant anchor text links, if they use the keyphrase starting in the second position of the page title you use it in the first, etc etc) you will rank better than them.
There may be other ways to get there and perhaps there are better ways to get there, but I don't know of a surer way.
RandFish, I like the concept of your keyword difficulty tool, and it is IMO a better concept than others I have seen, but it is not getting any of the factors anywhere near right.
Looking at data from one domain listed in your summary (and which I assume the difficulty results are based on):
Y backlinks: Tool 1120, Manual 478 (from Y manual search)
Y pages: Tool 59, Manual 20
G backlinks: Tool 2, Manual 77
G pages: Tool 120, Manual 56
MSN pages: Tool 431, Manual 54
MSN links: Tool 10, Manual 79
DanThies
12-08-2004, 11:53 AM
I like the idea, randfish, but the execution needs some work. What I think would improve it:
less emphasis on PageRank as a factor
use title & anchor searches instead of total results
drop the 'months to rank' concept, it's fatally flawed
consider scoring the site that's trying to rank
I think the percentage is a bad way to represent difficulty. Something more like the Richter scale (orders of magnitude), or a logarithmic scale, would make more sense. So maybe you score from 1-10, where 1 = any term that has exactly 1 result; 10= outranking Google, Yahoo, Amazon for their name.
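A rough sketch of what an order-of-magnitude score could look like, in Python. The function name and the `max_value` ceiling are assumptions made for illustration only; nothing here reflects how the actual tool is built.

```python
import math

def difficulty_score(raw_value, max_value, top=10):
    """Map a raw count onto a 1..top scale by orders of magnitude.

    raw_value: any positive count (results, links, searches, ...).
    max_value: hypothetical ceiling that should map to the top score.
    """
    if raw_value < 1 or max_value <= 1:
        return 1.0
    # Position of raw_value between 1 and max_value on a log10 scale.
    ratio = math.log10(raw_value) / math.log10(max_value)
    return round(1 + (top - 1) * min(ratio, 1.0), 1)

if __name__ == "__main__":
    for count in (1, 1_000, 1_000_000, 1_000_000_000):
        print(count, difficulty_score(count, max_value=1_000_000_000))
```

With this shape, every tenfold increase in the input moves the score by the same amount, which is the Richter-style behaviour described above.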
Two other random thoughts that may accidentally sail over a few heads:
1. The # of words doesn't say much, how common the words are might say something - "real estate" is not the same as "nigritude ultramarine." Reading Orion's posts (in this forum) on term frequency, c-indexes and E/F ratios may give you some ideas in this area.
2. Even a very simple vector space search tool can be applied in interesting ways. You don't have to reverse engineer a search engine, but looking at the top 10 pages, and seeing how relevant they are to the search term based on content alone, would give you an idea of how much "off page" factors were affecting search results.
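A very simple version of point 2 might look like the sketch below. It uses raw term counts and cosine similarity, which is only one of several possible vector space weightings; the function names are made up for the example, and fetching the top 10 pages is left out.

```python
import math
import re
from collections import Counter

def term_vector(text):
    """Lowercase the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def content_relevance(query, page_texts):
    """Score each page's text against the query on content alone."""
    q = term_vector(query)
    return [cosine_similarity(q, term_vector(p)) for p in page_texts]
```

If the pages ranking in the top 10 score low on content alone, that would suggest off-page factors such as links are doing most of the work for that term.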
Dan, I know that one score shouldn't 'rule' a user's choices, but a tool like this can certainly provide guidance, just as the Overture & Wordtracker tools do. I certainly cannot measure how pertinent or conversion-prone a keyword phrase is in tool format, but I can extract as much information as possible to give users a good reference point and something to compare against. Again, the idea isn't to produce a single result, it's to have a relative measurement.
I'll look through the resources you listed, but I did read the forum thread at HighRankings and came away thinking you would approve of the tool (guess I need to read more carefully :D )
In any case, your help and criticism are more than welcome. Please do not take my defense of the tool as anything but an attempt to gain greater understanding and provide better value.
randfish
12-08-2004, 12:11 PM
Dan,
Thanks much for your input. I will be working to make the tool better based on your advice. I have been working with Orion to come up with an accurate, computationally affordable way to measure term weight & term vectors (see the end of the thread).
I will report back here when I've completed the modifications.
Mel,
The only figure that regularly scrapes inaccurately for me is Yahoo!'s backlink command (linkdomain:url.com -site:url.com). The results are usually 2-5% off the number I get through a manual search. However - I also get different numbers from Yahoo! on this search from my home and work computer...
I haven't seen any others that don't match the numbers correctly - the tool is scraping, not using API or third-party systems, so these numbers are what the engine reports at the time of the query.
Randfish, if you are scraping the results that way then of course you are not getting all the links, only the external links, which IMO may mean you are discarding valuable links.
If a site with 25,000 pages uses only a small portion of its pages to provide anchor text links to one of its pages, that may be why it is ranking well and is one of the factors you may have to overcome.
AussieWebmaster
12-08-2004, 02:01 PM
What about keyword density?
DanThies
12-08-2004, 02:41 PM
What about keyword density?
I'm not sure how you'd factor that in. If you map out keyword density for the top ten results on just about any search term, the values will be all over the place. When you map it out across a lot of SERPs, it's really a scattergraph, for all the major search engines. If you compare the graphs for results 1-10 with the graphs for results 91-100, they look the same.
I don't know why anyone even looks at keyword density any more - there is no magic number, because search engines don't look at keyword density. Keyword breadth (using variations and modifiers effectively) is far more important.
AussieWebmaster
12-08-2004, 03:08 PM
I'm not sure how you'd factor that in. If you map out keyword density for the top ten results on just about any search term, the values will be all over the place. When you map it out across a lot of SERPs, it's really a scattergraph, for all the major search engines. If you compare the graphs for results 1-10 with the graphs for results 91-100, they look the same.
I don't know why anyone even looks at keyword density any more - there is no magic number, because search engines don't look at keyword density. Keyword breadth (using variations and modifiers effectively) is far more important.
Good point
randfish
12-08-2004, 03:29 PM
I'm going to try to list out all the factors and how they might be measured here and if there is disagreement about whether a factor should/shouldn't be included, we can discuss it. (BTW - you all rock for helping out, thank you)
Analysis of Top 10 Pages/Sites Ranking for KW Phrase (I'll use Google for expediency)
- # of results for search site:url.com @ Google
- # of results for search link:www.url.com @ Google
- # of results for search linkdomain:url.com -site:url.com @ Yahoo!
- # of results for search link:www.url.com/page.html @ Yahoo!
- # of results for search link:www.url.com @ MSN Beta
- PR of page
- PR of TLD (site)
Analysis of PPC & Search Popularity
- # of searches last month according to Overture for kw phrase
- Top 3 bid amounts @ Overture for kw phrase
- # of advertisers for kw phrase @ Overture
Analysis of Direct Competition
- # of results for search allintitle:keyword phrase @ Google
- # of results for search intitle:term1 intitle:term2 inanchor:term1 inanchor:term2, etc. @ Google
- # of results for search "keyword phrase" @ Google - in quotes
Please give me feedback and let me know if there are more pieces to consider.
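For anyone following along, the query strings in the list above could be generated with something like this Python sketch. The operators are copied from the list as written; the function name and dictionary keys are invented for the example, and no claim is made about how each engine handles these operators.

```python
def build_queries(domain, page_path, phrase):
    """Build the search strings from the factor list for one competitor.

    domain:    bare domain, e.g. "url.com"
    page_path: path of the ranking page, e.g. "page.html"
    phrase:    the keyword phrase being evaluated
    """
    terms = phrase.split()
    intitle = " ".join(f"intitle:{t}" for t in terms)
    inanchor = " ".join(f"inanchor:{t}" for t in terms)
    return {
        "google_site_pages": f"site:{domain}",
        "google_links": f"link:www.{domain}",
        "yahoo_external_links": f"linkdomain:{domain} -site:{domain}",
        "yahoo_page_links": f"link:www.{domain}/{page_path}",
        "msn_links": f"link:www.{domain}",
        "google_allintitle": f"allintitle:{phrase}",
        "google_title_anchor": f"{intitle} {inanchor}",
        "google_exact": f'"{phrase}"',
    }

if __name__ == "__main__":
    for name, query in build_queries("url.com", "page.html", "trade stock").items():
        print(f"{name}: {query}")
```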
Other Issues:
Scoring the site that's trying to rank would be a good inclusion, but it might be better served by a separate ranking tool - after all, in my opinion, the tool already takes too long to complete and these factors will make it take even longer. I'd like to do it, but I'm worried about usability of the tool as well.
As I mentioned before, keyword relevance will have to be left up to the user, but since this is more of a tool to measure the difficulty, I don't think it will have a negative impact by not offering this piece.
Also, I noticed that your (Dan's) analysis uses Alexa data for number of unique sites linking. I don't think I'll use Alexa data because of how inaccurate and slow to update it is. For a few dozen searches I ran, the tool showed extremely poor results, even if the only metric used was relative number of links. This does present the problem of not knowing how many 'unique' sites are linking - perhaps someone knows of another solution.
DanThies
12-08-2004, 03:45 PM
Numbers added for clarity in response --Dan
1. Scoring the site that's trying to rank would be a good inclusion, but it might be better served by a separate ranking tool - after all, in my opinion, the tool already takes too long to complete and these factors will make it take even longer. I'd like to do it, but I'm worried about usability of the tool as well.
2. Also, I noticed that your (Dan's) analysis uses Alexa data for number of unique sites linking. I don't think I'll use Alexa data because of how inaccurate and slow to update it is. For a few dozen searches I ran, the tool showed extremely poor results, even if the only metric used was relative number of links. This does present the problem of not knowing how many 'unique' sites are linking - perhaps someone knows of another solution.
1. Scoring the site that's trying to rank would mean that you're looking at 11 sites instead of 10. At worst, a 10% increase in the effort required for your tool, but a very significant help in calculating competition or level of effort required to compete. If you owned Amazon.com, you could target just about any search term *today* with nothing but content and internal links, no?
2. I look at this data as indicating the level of effort required to become "a player" in a given keyword space, not so much for specific search terms, and you already know why we like the Alexa version.
If you want an alternative, you could look at subtracting "saturation" (# of pages indexed) at Yahoo from the domain's "link popularity" on Yahoo, to remove the internal links from the count. You'd have to make sure that you're only counting indexed URLs in your saturation number. This assumes that all internal pages link to the home page, of course, which is a safe assumption for 99% of sites.
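The subtraction described above is simple enough to sketch in Python. The function name is hypothetical and the numbers in the example are invented, not measured; the estimate is only as good as the assumption that each indexed page contributes exactly one internal link to the home page.

```python
def estimate_external_links(link_popularity, indexed_pages):
    """Approximate external links as total links minus indexed pages.

    Assumes every indexed internal page links once to the home page,
    so each one accounts for exactly one internal link in the total.
    """
    return max(link_popularity - indexed_pages, 0)

if __name__ == "__main__":
    # Illustrative figures only.
    print(estimate_external_links(link_popularity=500, indexed_pages=120))
```

The clamp to zero covers sites where the engine reports more indexed pages than links, which would otherwise produce a negative count.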
randfish
12-08-2004, 04:20 PM
Excellent idea about using Yahoo! to remove internal links! Thanks!
Also, I see what you mean by measuring the site in question - it could definitely be an option if all we measure is the same pieces we measure for the top 10 sites. I will add it in.
I will get to work on this immediately and probably have a brand-new version ready for review on Monday or Tuesday.
Thanks so much for your help!
pdstein
12-08-2004, 04:55 PM
randfish, I too appreciate what you're trying to accomplish with this tool. After using WordTracker for the last few months I've come to the conclusion that KEI isn't really worth much. Just because millions of sites have certain keywords on them doesn't mean those sites are really competing for those keywords.
Take for example two different two-word phrases - trade stock and big world. Obviously trade stock is a much more competitive phrase, but it has 18,200,000 matches in Google to 24,100,000 for big world. If you look at quote matches "big world" has 400,000 to about 100,000 for trade stock. So, IMO # of matches should have little if any weight.
Checking the overture bids is a brilliant idea. Sites paying the most for clicks are probably also going to be the ones paying the most for SEO.
One other thing I found is that for non-competitive keywords free homepages can make it into the top 10, and when your tool measures site PR, site size, and BLs it measures those of the host, which can severely distort the results.
The end result is that an extremely competitive phrase like trade stock ended up with a score of 63 while a very non-competitive phrase like big world got a score of 57.
Just my $0.02.
Thanks for your work on this. With some more tweaking, it could be an extremely useful tool.
GBR&D
12-08-2004, 05:27 PM
I couldn't agree more with DanThies in regards to Keyword Density, and while it may be possible to map trends amongst top ranking websites for a given term, it is no reliable indicator of ideal density ranges. The advent of Latent Semantic Indexing, Natural Language Ontology Mapping and other highly sophisticated text language disambiguation technologies, was the end of any effectual use of simplistic SEO strategies like Keyword Density tracking.
AussieWebmaster
12-08-2004, 05:45 PM
I couldn't agree more with DanThies in regards to Keyword Density, and while it may be possible to map trends amongst top ranking websites for a given term, it is no reliable indicator of ideal density ranges. The advent of Latent Semantic Indexing, Natural Language Ontology Mapping and other highly sophisticated text language disambiguation technologies, was the end of any effectual use of simplistic SEO strategies like Keyword Density tracking.
I was being very generic in the response but the elements of actual and similar word presence on the page are a factor in the algorithm and you tend to find (though again not in all of them) the presence of the word and its synonyms in a decent number...
DanThies
12-08-2004, 06:08 PM
I was being very generic in the response but the elements of actual and similar word presence on the page are a factor in the algorithm and you tend to find (though again not in all of them) the presence of the word and its synonyms in a decent number...
Yes, the presence and location of the search terms in the document is a factor in the vector space model, or any other likely method of searching the content of documents. However, I don't believe that you'll find a numeric representation of "keyword density" as a factor in any search engine's algorithm, even if you go back 5 years.
GBR&D
12-08-2004, 07:23 PM
Yes, the presence and location of the search terms in the document is a factor in the vector space model, or any other likely method of searching the content of documents. However, I don't believe that you'll find a numeric representation of "keyword density" as a factor in any search engine's algorithm, even if you go back 5 years.
Again I agree, and would like to state further that even a cursory study of Vector Space Architecture (common to IR systems like modern search indexes) shows quite plainly the futility of common keyword optimization techniques. Efforts would be better directed by paying closer attention to factors like natural language, semantic relations, supportive ontology, etc. Additionally, examination of Term Weighting conventions (even rudimentary formulas such as Tf/Idf) is time well spent in regards to keyword optimization.
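To make the contrast concrete, here is a small Python sketch comparing keyword density with a rudimentary tf*idf weight. It uses the plain textbook form (raw term frequency times the natural log of N/df); real engines use their own variants, and the function names are assumptions for the example.

```python
import math

def keyword_density(term, doc_tokens):
    """Share of the document's tokens that equal the term."""
    if not doc_tokens:
        return 0.0
    return doc_tokens.count(term) / len(doc_tokens)

def tf_idf(term, doc_tokens, corpus):
    """Raw term frequency times log(N / document frequency)."""
    tf = doc_tokens.count(term)
    df = sum(1 for doc in corpus if term in doc)
    if tf == 0 or df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)

if __name__ == "__main__":
    corpus = [
        ["the", "stock", "trade"],
        ["the", "big", "world"],
        ["the", "trade", "show"],
    ]
    doc = corpus[0]
    for term in ("the", "stock"):
        print(term, keyword_density(term, doc), tf_idf(term, doc, corpus))
```

In this toy corpus "the" and "stock" have the same density in the first document, yet "the" gets a weight of zero because it appears in every document. Density looks at one page in isolation; term weighting depends on the whole collection.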
randfish
12-08-2004, 07:51 PM
Dan & Gang,
I wanted to mention that in order to 'normalize' or scale results as per your suggestions, I'm using an equation like
X = Y / Y^0.55
I used this because I found that Log functions were not providing enough distance between numbers. For those who may be new to the thread, the idea is that 100,000 searches last month does not make a keyword phrase 100 times more difficult to optimize for than a phrase that received 1000 searches last month. Using this type of scale, I can make the inputs 'fit the curve' much more accurately.
Let me know if I'm going in the right direction, or if there's a big flaw in using this logic.
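Worth noting that Y / Y^0.55 is the same thing as Y^0.45, so the equation is a plain power-law damping. A quick Python check of the example in the post (the function name is made up for illustration):

```python
import math

def dampen(value, exponent=0.55):
    """X = Y / Y**exponent, which is the same as Y ** (1 - exponent)."""
    return value / value ** exponent

if __name__ == "__main__":
    low, high = dampen(1_000), dampen(100_000)
    print("power-law ratio:", high / low)
    print("log10 ratio:", math.log10(100_000) / math.log10(1_000))
```

Under this curve, 100,000 searches scores about 7.9 times higher than 1,000 searches (100^0.45), against a raw ratio of 100 and a log10 ratio of roughly 1.67, so it sits between the linear and logarithmic extremes.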
pdstein - thanks much for your input. The gang here has been exceptionally helpful in trying to get the tool to give more accurate and honest results. Regarding free homepages in the top 10 - I think it is accurate to measure the power of the sites behind them, because Google does - that's why they're in the top 10. I have been competing against Craigslist postings for the last year, simply based on the power of the craigslist.org site...
Dave Hawley
12-08-2004, 07:56 PM
Yes, thanks all that have chipped into this thread. There is some good info in here.
DanThies
12-08-2004, 08:26 PM
Regarding free homepages in the top 10 - I think it is accurate to measure the power of the sites behind them, because Google does - that's why they're in the top 10. I have been competing against Craigslist postings for the last year, simply based on the power of the craigslist.org site...
Google does not look at Yahoo's link pop for Geocities sites. Google and all the other search engines look at web pages, not domains. You're confusing the toolbar PageRank display with the reality of PageRank, I think. There's a little link pop coming through with Geocities (not much) because there's actually a directory structure, but other free hosting sites aren't set up that way.
Craigslist is structured so that internal pages have some measure of authority, so you can expect to see a lot of those pages in search results. There's a chain of links into the content, across categories, and back to the home page.
I don't know if your math will work or not, because I don't know what kind of numbers you're playing with, but it sounds reasonable. The proof of your tool's utility will be seen in how well it measures the difficulty/competition for real search terms. :D
Dave Hawley
12-08-2004, 09:03 PM
Google and all the other search engines look at web pages, not domains
For the purpose of adding to their database yes, but I would think not only pages in the case of ranking.
orion
12-08-2004, 09:29 PM
Hi there.
This is a great thread.
I'm happy to see randfish and other dedicated developers working hard for improving current tools and developing new ones.
For those interested in term vector theory and tf*IDF schemes, the following threads may help.
Term Vector Theory and Keyword Weights (http://forums.searchenginewatch.com/showthread.php?t=489) thread
Keywords in Urls (http://forums.searchenginewatch.com/showthread.php?p=26514#post26514) thread
These threads show the futility of keyword density and similar concepts. As previous posters have mentioned, keyword density is not a good estimator of term weights and should be avoided. The main reason is that it is contrary to term weight theory.
Orion
AussieWebmaster
12-09-2004, 12:05 AM
I think there is a slight divergence of information here... when I mentioned keyword density etc. what I was referring to were flags for well crafted pages that are part of competitive areas of search.
Apart from trying to break down the elements of what helps pages get better placement in the engines and numbers that may reflect some of the elements of what indicates competitive keywords, there are definitely elements that reflect efforts that may not contribute to Google's actual measurements but show efforts made to cover other engines and to work on providing information for the visitors.
Again I agree, and would like to state further that even a cursory study of Vector Space Architecture (common to IR systems like modern search indexes) shows quite plainly the futility of common keyword optimization techniques. Efforts would be better directed by paying closer attention to factors like natural language, semantic relations, supportive ontology, etc. Additionally, examination of Term Weighting conventions (even rudimentary formulas such as Tf/Idf) is time well spent in regards to keyword optimization.
Reading posts like this makes me understand why the eyes of relatives and friends sometimes glaze over at parties when I start talking about SEO :p . Just kidding. But given the depth of knowledge we have in this industry now, it's amazing to think that none of this stuff even existed 10 years ago. Great thread!
orion
12-09-2004, 10:54 AM
But given the depth of knowledge we have in this industry now, it's amazing to think that none of this stuff even existed 10 years ago. Great thread!
A bit of clarification here. Most of the IR knowledge seos/sems are getting now already existed 10 years ago and before. It so happens that only recently they have been reading and learning about these concepts. You would be surprised at how many W3C conference attendees think of seos/sems myths and speculations.
Examples
Keyword density was a myth created/perpetuated by some with vested interests. Its root can be traced back to readability theories. It does not come from IR. A lot of well-known seos promoted KD tools, not knowing about term vector and term weights.
Link citation-literature citation analogy. Another myth debunked by dedicated researchers. Unfortunately some IR scientists are still trying to defend this fallacy, but it is a matter of time before it goes away.
Hyphenated queries. Unlike other delimiters, hyphens affect the way text is parsed (Check Dr. D. Grossman readings dated back to the mid 90's).
Co-Occurrence theory has always been present in IR/SE algorithms, since the 70's.
etc...
It so happens that now seos/sems are getting smarter and less naive about the many myths and speculations running around.
Bottom line, treat with suspicion any formulae/framework created out of thin air, with no scientific base.
Orion
pdstein
12-09-2004, 11:13 AM
pdstein - thanks much for your input. The gang here has been exceptionally helpful in trying to get the tool to give more accurate and honest results. Regarding free homepages in the top 10 - I think it is accurate to measure the power of the sites behind them, because Google does - that's why they're in the top 10.
I respectfully disagree. Free homepages don't get weight because of the weight of their host. There are lots of reasons that free homepages can end up in the top 10, including because the keyword phrase is non-competitive.
Take two randomly selected words, combine them into a 2-word keyword phrase, and run your tool on them. For example, "everlasting mishmash"
Factor | Weight | Score | Weighted
Number of Words in Phrase | 9% | 85/100 | 7.65
Times Searched Last Month | 12% | 0/100 | 0
# of Results for Search @ Google | 7% | 37/100 | 2.59
# of Results for Search @ Google in "Quotes" | 7% | 1/100 | 0.07
Top Bid @ Overture | 7% | 0/100 | 0
2nd Bid @ Overture | 5% | 0/100 | 0
3rd Bid @ Overture | 4% | 0/100 | 0
Strength of Competitors' Backlinks | 20% | 100/100 | 20
Strength of Competitors' Pages PR | 9% | 19/100 | 1.71
Strength of Competitors' Site's PR | 13% | 55/100 | 7.15
Strength of Competitors' Size | 7% | 86/100 | 6.02
Total Score: 45.19%
It gets a "moderate" difficulty rating because 40% of the score comes from factors derived from the site - site backlinks, site PR, and site size. But it is a completely non-competitive term - not even a single quote match in Google - so chances are anyone could whip up a page with "everlasting mishmash" in the title tag and make the top 10 as soon as the page is indexed.
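The total can be reproduced as a plain weighted sum of the figures quoted above. This Python sketch only re-does the arithmetic from the posted output; the grouping of "site-derived" factors follows the description in this post and is not taken from the tool itself.

```python
# (factor, weight in percent, score out of 100), as quoted above.
ROWS = [
    ("Number of Words in Phrase", 9, 85),
    ("Times Searched Last Month", 12, 0),
    ("# of Results for Search @ Google", 7, 37),
    ('# of Results for Search @ Google in "Quotes"', 7, 1),
    ("Top Bid @ Overture", 7, 0),
    ("2nd Bid @ Overture", 5, 0),
    ("3rd Bid @ Overture", 4, 0),
    ("Strength of Competitors' Backlinks", 20, 100),
    ("Strength of Competitors' Pages PR", 9, 19),
    ("Strength of Competitors' Site's PR", 13, 55),
    ("Strength of Competitors' Size", 7, 86),
]

# Factors measured on the host site rather than on the ranking page.
SITE_DERIVED = {
    "Strength of Competitors' Backlinks",
    "Strength of Competitors' Site's PR",
    "Strength of Competitors' Size",
}

def weighted_total(rows):
    """Sum of weight * score / 100 over all rows."""
    return sum(weight * score / 100 for _, weight, score in rows)

total = weighted_total(ROWS)
site_points = weighted_total([r for r in ROWS if r[0] in SITE_DERIVED])

if __name__ == "__main__":
    print(f"total: {total:.2f}")
    print(f"site-derived points: {site_points:.2f}")
    print(f"site-derived share of total: {site_points / total:.0%}")
```

The three site-derived factors carry 40% of the weight, but for this phrase they supply 33.17 of the 45.19 points, which is why a phrase with no search volume and no bids still lands in the "moderate" range.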
Personally, I would eliminate site PR and site size from the equation entirely.
I'm not sure how to deal with free pages on hosts that have millions of backlinks. Maybe you could compare site PR and page PR and give some exponential weighting to the backlinks. So, if the site has 100,000,000 backlinks and a PR of 8, the homepage would get credit for all the links, but a PR4 page would get credit for 10,000, a PR3 would get 1000, PR2 - 100. Just a thought.
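The numbers in that example work out to dividing the site's backlinks by 10 for every point of PR the page sits below the site. A minimal Python sketch of that idea, with a hypothetical function name and the base-10 step taken purely from the example figures:

```python
def credited_backlinks(site_backlinks, site_pr, page_pr):
    """Credit a page with site backlinks, cut tenfold per PR point of gap."""
    gap = max(site_pr - page_pr, 0)
    return site_backlinks // 10 ** gap

if __name__ == "__main__":
    for page_pr in (8, 4, 3, 2):
        print(page_pr, credited_backlinks(100_000_000, site_pr=8, page_pr=page_pr))
```

A page at the same PR as the site keeps full credit, and a page with a higher PR than its host is treated the same way rather than being given extra links.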
randfish
12-09-2004, 01:01 PM
pdstein - We all certainly recognize the flaws in the current tool, but I appreciate your input.
Rather than doing away with PR entirely (which I think would hurt the accuracy of the tool), it might be better to simply include other factors, like the number of internal links from the site to the page ranking - that way, even if the page was newer and had no PR, it would still be recognized as being linked to by 500 or 5000 pages of a PR8 site...
The search would be at Yahoo! - link:http://www.url.com/page.html site:url.com
I'd say that the overweighting problem for free homepages that rank in the top 10 would be helped considerably by the addition of a few of these factors:
% of TLDs in top 10
# of Internal Links to Ranking Page
# of External Links to Ranking Page (link:http://www.url.com/page.html -site:url.com)
I'll see how it performs with these additions and ask for your input again. Thanks for offering such great advice and critiquing so carefully, I think this will be a great tool when it's finished.
This is what I'm thinking for percentages of the new tool:
15% - Times Searched Last Month
4% - # of Results for Search @ Google in "Quotes"
6% - # of Results for AllinTitle Search @ Google
8% - # of Results for Intitle Inanchor Search @ Google
9% - Top Bid @ Overture:
8% - 2nd Bid @ Overture:
7% - 3rd Bid @ Overture:
7% - Strength of Competitors' Site's Backlinks:
5% - Strength of Competitors' Internal Links to Page:
4% - Strength of Competitors' External Links to Page:
5% - Strength of Competitors' Pages PR:
8% - Strength of Competitors' Site's PR:
5% - Strength of Competitors' Size:
9% - Percent of TLDs in Top 10 Results
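As a sanity check on the proposed percentages, this Python sketch confirms they add up to 100 and shows how much weight still rests on site-level factors. The grouping into "site-level" is an assumption made for the comparison, based on the earlier objection about host-derived factors.

```python
PROPOSED_WEIGHTS = {
    "Times Searched Last Month": 15,
    '# of Results for Search @ Google in "Quotes"': 4,
    "# of Results for AllinTitle Search @ Google": 6,
    "# of Results for Intitle Inanchor Search @ Google": 8,
    "Top Bid @ Overture": 9,
    "2nd Bid @ Overture": 8,
    "3rd Bid @ Overture": 7,
    "Strength of Competitors' Site's Backlinks": 7,
    "Strength of Competitors' Internal Links to Page": 5,
    "Strength of Competitors' External Links to Page": 4,
    "Strength of Competitors' Pages PR": 5,
    "Strength of Competitors' Site's PR": 8,
    "Strength of Competitors' Size": 5,
    "Percent of TLDs in Top 10 Results": 9,
}

# Assumed grouping: factors measured on the host site, not the page.
SITE_LEVEL = (
    "Strength of Competitors' Site's Backlinks",
    "Strength of Competitors' Site's PR",
    "Strength of Competitors' Size",
)

total_weight = sum(PROPOSED_WEIGHTS.values())
site_level_weight = sum(PROPOSED_WEIGHTS[name] for name in SITE_LEVEL)

if __name__ == "__main__":
    print("total weight:", total_weight)
    print("site-level weight:", site_level_weight)
```

Under this grouping the site-level share drops from 40% in the earlier breakdown to 20%.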
Dave Hawley
12-09-2004, 08:00 PM
There are lots of reasons that free homepages can end up in the top 10, including because the keyword phrase is non-competitive
IMO, free home pages are treated no differently than any other type. That is, they are where they are, in the SERPs, due to relevancy, or lack of.
pdstein
12-10-2004, 09:09 AM
IMO, free home pages are treated no differently than any other type. That is, they are where they are, in the SERPs, due to relevancy, or lack of.
I agree. But if a free homepage is in the top 10 it probably means that a search term is non-competitive and not that the site is scoring big points because it's on a big host. If a person doesn't care enough about their site to spend $10/yr on a domain name, chances are they haven't optimized their site for search engines.
DanThies
12-10-2004, 10:57 AM
I agree. But if a free homepage is in the top 10 it probably means that a search term is non-competitive and not that the site is scoring big points because it's on a big host. If a person doesn't care enough about their site to spend $10/yr on a domain name, chances are they haven't optimized their site for search engines.
"Big host points" aren't going to be a factor on any search engine. There are almost 2.9 million URLs from geocities.com indexed by Google - some of them are going to be relevant for some searches. There is no "penalty" for free hosting, or "bonus" for having your site on a big server farm.
Geocities pages do better than your average free hosting because there's a topical directory of Geocities sites on their main site. There are also a lot of Geocities sites that have been online for many years and have good linking relationships. Because this is one of the oldest and most popular free hosting services, Geocities' main site itself has been around for a long time, and is very well linked up across the web.