Indexing problems and supplemental index


Mike
03-11-2005, 11:53 AM
Hi all,

One of my coworkers has a site that was redesigned about 5 months ago. The site now has over 300 pages indexed in Google, but all of them show up as supplemental results. Also, the homepage has no cached copy at all.

It's been like this for at least 3 months now. Google visits and spiders the site regularly, so why are the pages stuck in the supplemental index, and why would Google not cache the homepage?

There's no fishy meta tag business going on, and no robots.txt problem. The client doesn't have much in the way of Google rankings.
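
For anyone who wants to double-check the meta tag and robots.txt side of this, here is a minimal Python sketch. It assumes you have already saved the site's robots.txt and the homepage HTML as strings. The function and class names are made up for illustration, and this is only a rough check, not a statement of how Google itself evaluates these files.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt, url):
    """Return True if the given robots.txt text lets Googlebot fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


class MetaRobots(HTMLParser):
    """Collect directives from <meta name="robots"> / <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def meta_directives(html):
    """Return the set of robots directives found in the page's meta tags."""
    parser = MetaRobots()
    parser.feed(html)
    parser.close()
    return parser.directives
```

A "noarchive" directive in the homepage's meta tags would be one ordinary explanation for a missing cache, and "noindex" or a robots.txt disallow would be worth ruling out as well.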

They have a 0 PR (not greyed out, just 0). Google reports 8 backlinks. Yahoo reports 122 backlinks, via link:http://www.site.com -site:www.site.com

Not sure what this client's status was before signing on with my company 6 or 8 months ago.

Anyone encountered this type of thing before?

Michael Martinez
03-11-2005, 03:48 PM
Anyone encountered this type of thing before?

Lots of times, for different reasons. Without being able to look at the site, I cannot offer you any helpful opinions or suggestions. But I can shotgun some stuff at you. It may all land wide of the mark.

Generally speaking, supplemental results mean Google is clustering content. You can look for obscure, relatively unique phrases on various secondary pages and see what comes up first. That might tell you if the main page is overshadowing all the secondary pages.

You can also look to see that the title tags for each page are unique, and that they match on-page text (preferably in H1 header tags, but enlarged font selections work nearly as well).

Maybe they are putting too many keywords into their title tags. That is one of the most common mistakes people make. A title tag should have 1 or 2 short but meaningful phrases.

I normally recommend something like:

"my keyword phrase | my company name"

If you want to get fancy, you can try:

"my keyword phrase | my company name and my keyword phrase"

But if the phrase is more than a few words in length, you don't really want to do that.

If the client has targeted 10 phrases and put them all in one title tag, that could be the problem.

If the client has hidden links, hidden text, very little text, or no text whatsoever, that could be the problem.

If the client is using Javascript or images to embed text on the page, none of that content is indexable.

If the client is using Flash, that content may be indexable, it may not. Flash is still an iffy thing at best.

If the client is using frames, some people have reported problems with using invisible frames (no dimensions).

If the client is using iframes, iframes don't get crawled (at least, I have never seen that happen, and I use iframes on some pages).

If the client has broken HTML code (bad table elements, for example), the indexing program might be choking.

Those are just a few things to look for.
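
The title tag checks above (unique titles, titles that match the H1 text, not too many phrases per title) can be roughed out in a short Python script. This is only a sketch under some assumptions: `pages` is a hypothetical dict mapping each URL to HTML you have already downloaded, phrases in a title are assumed to be separated by "|" or ",", and "match" is taken to mean that the first title phrase appears somewhere in the H1 text. The limit of 2 phrases mirrors the "keyword phrase | company name" pattern and is not a rule published by any search engine.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TitleH1Parser(HTMLParser):
    """Collect the text inside <title> and <h1> elements."""

    def __init__(self):
        super().__init__()
        self._current = None
        self.title = ""
        self.h1 = ""

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1 += data


def audit_titles(pages, max_phrases=2):
    """Return (duplicates, overloaded, mismatched) for a URL -> HTML dict."""
    by_title = defaultdict(list)
    overloaded, mismatched = [], []
    for url, html in pages.items():
        parser = TitleH1Parser()
        parser.feed(html)
        parser.close()
        title = " ".join(parser.title.split()).lower()
        h1 = " ".join(parser.h1.split()).lower()
        by_title[title].append(url)
        # Assumed phrase separators: "|" and ",".
        phrases = [p.strip() for p in re.split(r"[|,]", title) if p.strip()]
        if len(phrases) > max_phrases:
            overloaded.append(url)
        # Assumed meaning of "match": first title phrase occurs in the H1.
        if phrases and phrases[0] not in h1:
            mismatched.append(url)
    duplicates = {t: urls for t, urls in by_title.items() if len(urls) > 1}
    return duplicates, overloaded, mismatched
```

Pages listed under `duplicates` share a title, `overloaded` pages pack in more phrases than the limit, and `mismatched` pages have a lead phrase that never shows up in the H1.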

Marcia
03-11-2005, 04:06 PM
Is it a data-driven, templated site with a lot of pages that are close in content? I've seen some pages go supplemental (or URL-only) for being duplicates or near duplicates.

Mel
03-12-2005, 02:34 AM
You may find this thread (http://forums.searchenginewatch.com/showthread.php?t=3919) on the supplemental index interesting.

PhilC
03-12-2005, 10:38 PM
That URL ain't working, Mel. You sure you got it right?

Mel
03-12-2005, 11:04 PM
There is an extra http:// in the URL but I can't edit the post so here it is again (http://forums.searchenginewatch.com/showthread.php?t=3919)

PhilC
03-12-2005, 11:16 PM
I should have spotted that - thanks :)

Mike
03-14-2005, 02:13 PM
Thanks for the replies everyone,

I did some further checking today, and the dates on all the supplemental index pages are Sep 2004. I was under the impression that Google was visiting the site regularly but not taking it out of the supplemental index.

Speaking of this, these pages all have descriptions and dated cached pages. It's still strange that the only page without this is the homepage.

Regarding the content issue, the pages are not too close. They have the same nav, of course, but the rest of each page is unique for each product.

Title tags and headings are all good. It's really a tough one to solve.

pleeker
03-14-2005, 08:32 PM
I did some further checking today and the dates on all the supplemental index pages are sep 2004.... I was under the impression that google was visiting the site regularly but not taking it out of the supplemental index.

Can you get some new inbounds to the site to encourage G to revisit again? And new inbounds to pages deeper than the home page would be preferred, since those haven't been touched in 6-7 months.

palms
03-14-2005, 09:20 PM
Not that it's any big deal, but I've seen a few exceptions to this statement:

<<Supp pages never rank on competitive terms.>>

I've seen an indented result at the number 2 position for a 9 million result $$$ term search.

It has:
No Title
No Description
No Cache

Very strange.

Bernard
03-15-2005, 11:28 AM
Mike, you really need to dig into the raw server logs and determine if Googlebot has been visiting or not. This happened to me (http://forums.searchenginewatch.com/showthread.php?t=4614) at the beginning of the month and was the fault of a router problem with my web host (just blocking Googlebot).

Mike
03-15-2005, 01:43 PM
thanks everyone, I have some more info:

The client's host has some stat tracking stuff, which is telling us that Googlebot visited the site and downloaded 33 megs of stuff, 2,737 hits total, during the month of March.

All the pages are in the supplemental index, and dated around Sep 27th 2004.

It really doesn't make much sense.

We're going to have the client make a change to an HTML page as a test, and see how long it takes the cached version of the page to update (if at all).

If Google is visiting as often as the host claims, it shouldn't take long. If the cache does change quickly, that doesn't really solve the problem, but at least it tells us Google is spidering the site regularly. Then I would say it looks like an incoming link problem, and that we need to get better link popularity (however, the site already has links: not thousands, granted, but still).

thanks again everyone. I enjoy puzzles like this -- except for when no solution seems to help!


edit: got the OK to post the URL if anyone wants to take a look: www.shopdress.com

Bernard
03-15-2005, 02:31 PM
I would still highly recommend digging into the raw log file to determine what pages GBot is accessing (your supplemental pages? or cached pages?) and if any errors are being returned.
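
As a rough illustration of that kind of log check, here is a minimal Python sketch. It assumes the host writes Apache-style "combined" access logs, and the function name is made up for the example. It matches on the user-agent string only, which can be spoofed, so treat the output as a first pass rather than proof of a real Googlebot visit.

```python
import re
from collections import Counter

# Assumed format: Apache "combined" log lines, for example
# 192.0.2.10 - - [15/Mar/2005:10:00:00 -0500] "GET / HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def googlebot_summary(lines):
    """Count status codes and requested paths for Googlebot requests."""
    statuses = Counter()
    paths = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # not a line in the assumed format
        agent = match.group("agent") or ""
        if "googlebot" not in agent.lower():
            continue
        statuses[match.group("status")] += 1
        paths[match.group("path")] += 1
    return statuses, paths
```

Feeding it the lines of a raw log file (for instance `googlebot_summary(open("access.log"))`, with whatever file name the host uses) gives a count of status codes and a count of which pages were requested. A pile of 404, 403, or 5xx responses, or a homepage that never appears in the paths, would be the thing to chase.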

Marcia
08-07-2005, 11:11 AM
Sorry to bump this so long afterward, but I just came across this thread and took a look at the site for the first time.

Added and corrected:

OK, I see the filenames linked from the homepage being "regular" pages, but from what's showing as cached, the site still isn't being crawled. All those files in the supplemental index are still on the site:

http://www.google.com/search?hl=en&q=site%3Ashopdress.com

It can be seen more clearly at Yahoo:

http://search.yahoo.com/search?ei=UTF-8&p=site%3Ashopdress.com&xargs=0&pstart=1&fr=slv1-&b=81

It still seems like a problem from those pages.