searched keyword missing in the search result
kunalg
03-11-2005, 08:31 AM
Hi buddies,
Today I noticed an important thing in Google. I searched Google for "stjohn airports", and one of the first 10 results displayed is
http://www.spirit-of-canada.com/canada/ctp_AtoZ.html
Now when I look at the Google cache, it shows the resort word highlighted, and for the word "stjohn" it displays:
"These terms only appear in links pointing to this page: stjohn"
But when I checked the links pointing to this page, I couldn't find the keyword "stjohn" anywhere in them. Can anyone explain why this result is displayed near the top if neither the page nor the anchor text pointing to it contains the word "stjohn"?
Also, the description shown below the result contains the keyword "stjohn". Where does this word (stjohn) in the description come from?
Has anybody noticed the same thing, and can you help me solve this mystery?
PhilC
03-12-2005, 04:12 PM
If Google says the word is in links pointing to the page, then it is, or at least it was when Google spidered them. Google only shows a sample of the links that it knows about, and it may not show the ones that contain that word.
Google is discarding the apostrophe, and counting words with and without it as being the same.
The text in the snippet (description) does appear in the page and in the cached page.
>If Google says the word is in links pointing to the page, then it is, or at least it was when Google spidered them
That isn't the case. For example, if the word was in the page title or noframes content, you still get the same message.
PhilC
03-12-2005, 05:04 PM
Really? I never knew that. Are you sure?
>Are you sure?
As sure as I ever am. You can include url and description in there too. I'm not saying that in these cases the word isn't in the inbounds, just that it isn't *only* in the inbounds, if that makes sense.
<added>It's just a generic message that means the term wasn't found in the body text, if you like, as in http://66.102.9.104/search?q=cache:JoGVAw8oOPIJ:www.ebay.com/+phil_c
PhilC
03-12-2005, 06:06 PM
I never knew that - it's interesting
glengara
03-12-2005, 06:30 PM
I tried that link, substituting ever more outlandish terms, and kept getting the same message. Might it be an eBay thing?
Nacho
03-12-2005, 07:14 PM
In this cached page (http://216.239.63.104/search?q=cache:tkFya8fC8TYJ:www.spirit-of-canada.com/canada/ctp_AtoZ.html+stjohn+airports&hl=en) I do see the word stjohn in a link 1/3 down the page for stjohnsairport.com. Keep in mind that a query in the default FINDALL (also known as the AND mode) means that your queried words "stjohn airports" must appear in any part of the document regardless of proximity and order.
Say the words don't appear in the cached page you're looking at. I also notice that this document was cut off at 100 KB (102,764 bytes). We once discussed this size limit (http://forums.searchenginewatch.com/showthread.php?t=3969) here at the forums, but we never talked about the actual content of the document, which might be the issue for you. An interesting question would be: does Google actually index the entire document regardless of size, but only display the first 100K in the cache? I believe so. If that is the case, then what is likely happening is that the word "stjohn" appears after the 100K mark, which you can't see in the cache, but it was actually picked up in the index for evaluation when the page was processed.
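To make the hypothesis above concrete, here is a minimal Python sketch. It assumes (nothing in this thread confirms it) that the indexer reads the whole document while the cache viewer shows only a fixed-size prefix. The names `CACHE_LIMIT_BYTES`, `cached_view` and `term_visibility` are made up for illustration, and the page content is invented.

```python
from typing import Tuple

# Assumed cut-off for the cached copy; the thread only says "100 KB".
CACHE_LIMIT_BYTES = 100 * 1024


def cached_view(document: bytes, limit: int = CACHE_LIMIT_BYTES) -> bytes:
    """Return the prefix of the document that a cache viewer would display."""
    return document[:limit]


def term_visibility(document: bytes, term: bytes) -> Tuple[bool, bool]:
    """Return (found_in_full_document, found_in_cached_view) for a term."""
    return term in document, term in cached_view(document)


# Invented page: filler up to the cut-off, then the term just past it.
page = b"x" * CACHE_LIMIT_BYTES + b" stjohn airports"
in_index, in_cache = term_visibility(page, b"stjohn")
print(in_index, in_cache)
```

Under these assumptions the term is present in the full document but absent from the truncated view, which is the situation described above.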
kunalg
03-13-2005, 06:09 AM
Hi buddies, it might be right that Google does not count the title and description as on-page factors. That would be why it says that "stjohn" appears in links pointing to this page even though the word is on the page. As for the point that Google indexes a page only up to 100K, that is absolutely true; I have tested it on my PC and can explain it to anyone who needs it. But the term "stjohn" that appears on the page (in stjohnsairport.com) is within the first 100K, and it shows both on the page and in Google's cached text.
So why does Google say that the term appears in links pointing to this page?
Please guide me: am I on the right track or not?
Michael Martinez
03-17-2005, 02:47 AM
Okay, here is the search:
http://www.google.com/search?hl=en&q=stjohn+airports
As I understand it, you want to know why Google says "stjohn" only appears in a link to this site (currently ranking 14th, rather than in the top 10):
http://www.spirit-of-canada.com/canada/ctp_AtoZ.html
Now, you have apparently been asking that question in quite a few forums, at least since March 11. So, I guess you're just cutting and pasting the same query and posting it in forum after forum.
Someone suggested it may be a stemming issue. You replied that two separate searches produce different results.
Here, Nacho pointed out a page with a link in the cache, explaining that the default search mode (FIND ALL) doesn't require the words to be together.
If you use the EXACT FIND search, you don't get that listing:
http://www.google.com/search?q=%2B%22stjohn+airports%22&hl=en&lr=&c2coff=1&safe=off&filter=0
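The difference between the two query modes can be sketched in a few lines of Python. This is only an illustration of "all words anywhere" versus "exact phrase" matching, not Google's implementation; `tokenize`, `matches_all` and `matches_phrase` are hypothetical helper names, and the sample text is invented.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_all(query: str, document: str) -> bool:
    """FIND ALL / AND mode: every query word occurs somewhere, in any order."""
    doc_words = set(tokenize(document))
    return all(word in doc_words for word in tokenize(query))


def matches_phrase(query: str, document: str) -> bool:
    """EXACT FIND: the query words occur next to each other, in order."""
    q, d = tokenize(query), tokenize(document)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


# Invented text where both words occur, but far apart.
sample = "Airports in Canada: see stjohn for flights"
print(matches_all("stjohn airports", sample))     # True
print(matches_phrase("stjohn airports", sample))  # False
```

A page can therefore satisfy the default search while failing the quoted one, which matches what the two result sets show.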
In effect, there are pages which link to the spirit-of-canada page you are questioning, where "stjohn" occurs in links but not necessarily in links to that page.
If you want to know WHY Google makes this kind of chained association, only someone from Google can really explain it to you.
It appears to me that Google is using an equivalence function, identifying "St. John's" as equivalent to "stjohn" or "stjohns", and that by doing so, it broadens the results pool considerably to sites which share a common theme (travel to St. Johns by air).
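As a rough illustration of what such an equivalence function might look like, here is a small Python sketch. It is a guess for discussion purposes only: it strips punctuation and spaces and drops one trailing "s", which is far cruder than anything a real search engine would use. The name `normalize` is hypothetical.

```python
import re


def normalize(term: str) -> str:
    """Collapse a term to a crude canonical form (an assumed rule set)."""
    # Lowercase, then drop apostrophes, periods, spaces and other punctuation.
    collapsed = re.sub(r"[^a-z0-9]", "", term.lower())
    # Naive plural/possessive handling: drop a single trailing "s".
    if len(collapsed) > 1 and collapsed.endswith("s"):
        collapsed = collapsed[:-1]
    return collapsed


for variant in ["St. John's", "St. Johns", "stjohns", "stjohn"]:
    print(variant, "->", normalize(variant))
```

With these rules all four spellings collapse to the same key, so a query for one of them could be matched against pages using any of the others.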
Marcia
03-17-2005, 04:42 AM
I've seen that "only appears in links" when there weren't any links; it was just in the page title. NFFC has it.
What was it that Mel posted the other day about Google fetching anchor text and page titles first?
You can find a reference to that in the "Anatomy" paper (http://www-db.stanford.edu/~backrub/google.html) where you will find this little gem:
...We chose a compromise between these options, keeping two sets of inverted barrels -- one set for hit lists which include title or anchor hits and another set for all hit lists. This way, we check the first set of barrels first and if there are not enough matches within those barrels we check the larger ones...
Which has important implications (assuming that Google is still doing things this way, which IMO is likely), in that competitive words will likely have enough hits in the first set of barrels to satisfy the size of the ranking pool (originally set at 40,000 hits), and thus pages which do not have a competitive term occurring in either anchor text or page title don't stand any chance of ranking at all.
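The two-tier lookup described in the quoted passage can be sketched roughly as follows in Python. This is a toy model based only on that quote: the dictionaries standing in for barrels, the function name `lookup`, the sample data and the tiny `enough` threshold are all made up for illustration (the post above mentions a pool of 40,000).

```python
from typing import Dict, List


def lookup(term: str,
           title_anchor_barrel: Dict[str, List[str]],
           full_barrel: Dict[str, List[str]],
           enough: int) -> List[str]:
    """Check title/anchor hits first; fall back to all hits only if short."""
    hits = list(title_anchor_barrel.get(term, []))
    if len(hits) >= enough:
        # Pool is already full: body-only pages are never considered.
        return hits
    seen = set(hits)
    for doc in full_barrel.get(term, []):
        if doc not in seen:
            hits.append(doc)
            seen.add(doc)
    return hits


# Invented data: "d" and "e" contain the terms in body text only.
title_anchor = {"airports": ["a", "b", "c"], "stjohn": ["a"]}
everything = {"airports": ["a", "b", "c", "d"], "stjohn": ["a", "e"]}

competitive = lookup("airports", title_anchor, everything, enough=2)
rare = lookup("stjohn", title_anchor, everything, enough=2)
print(competitive)  # ['a', 'b', 'c']
print(rare)         # ['a', 'e']
```

In this toy model the competitive term fills the pool from title and anchor hits alone, so page "d" is never looked at, while the rarer term falls through to the larger set and picks up page "e".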