Site still in supplemental results - sitemap to blame?


Jupiter
09-09-2005, 06:18 AM
A little over a month ago we submitted an optimised site to Google sitemaps. The site is built with OScommerce and each database page has automatically generated tag fields; there are proper tags in the static pages as well, and selected keywords are targeted.

The sitemap we submitted was automatically generated by a utility and did not include the static pages. We figured that should not be much of a problem since Google would find those from the links anyway.

One month on, the site is doing very well in MSN, average in Yahoo, and is nowhere to be found in Google. I just checked the site:... in Google and I can see that for the domain http://siteurl.co.uk (or just siteurl.co.uk), Google can see 1,220 pages (the site has a max of 250), but all of them are in the supplemental results. A search for site:http://www.siteurl.co.uk returns just the home page of the site, again in the supplemental results.

I have the following questions to ask:

1. Could the fact that the site works for both siteurl.co.uk and www.siteurl.co.uk have anything to do with this?

2. Could the site be in the sandbox (considering that it is relatively new)?

3. Should we create a sitemap to include the static pages of the site or does that not matter much?

4. Should/could we create another sitemap showing the links as www.siteurl.co.uk instead of the current one, which shows the links as siteurl.co.uk? We thought that didn't matter since the url is essentially the same, but apparently Google sees the two urls as two different sites. Will that change when the site is moved from the supplemental index to the main index? Is that hurting us at the moment?

Any help would be greatly appreciated.

Rob
09-09-2005, 12:07 PM
If it is a new site and/or new domain it likely is the "sandbox."

Therefore what you can do is either wait, or build a few good quality links from directory sites like the Yahoo! or ODP directories. You will also want to get on with some link building - finding relevant sites (i.e. similar market or geographic area) and requesting links from them.

seomike
09-09-2005, 12:18 PM
A supplemental page just means once upon a time there was a link to the page, now there isn't. Supplemental = Google's archive of an old page on your site.

Check your links to make sure they didn't change ;)

Marcia
09-09-2005, 03:21 PM
The site is built with OScommerce and each database page has automatically generated tag fields; there are proper tags in the static pages as well, and selected keywords are targeted.

How much "unique" content is there on each page aside from what's in the global navigation?

Could the fact that the site works for both siteurl.co.uk and www.siteurl.co.uk have anything to do with this?

It shouldn't work for both. Decide on which one to use and use a 301 redirect to that one from the other - sitewide, use only one version, with or without the www.
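
A minimal sketch of the redirect decision described above, written in Python purely for illustration. It assumes the non-www host is the one chosen as canonical (either choice works as long as it is used consistently); the constant names and the helper `canonical_redirect` are hypothetical, and on a real server this rule would normally live in the web server configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the non-www host is the version chosen as canonical.
CANONICAL_HOST = "siteurl.co.uk"
# Hosts that serve the same pages and should redirect to the canonical one.
ALIAS_HOSTS = {"www.siteurl.co.uk"}


def canonical_redirect(url):
    """Return (301, target_url) if the URL is on an alias host, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ALIAS_HOSTS:
        return None  # already canonical (or not our site): serve normally
    # Keep path and query intact so every page maps to its exact twin.
    target = urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, parts.fragment)
    )
    return 301, target


if __name__ == "__main__":
    print(canonical_redirect("http://www.siteurl.co.uk/product_info.php?cPath=33&products_id=254"))
    print(canonical_redirect("http://siteurl.co.uk/"))
```

The point of preserving the path and query string is that each duplicate URL redirects to its own counterpart, not to the home page.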

PhilC
09-09-2005, 03:52 PM
The fact that the same pages come up with and without the "www." won't cause any ill effects as far as penalties are concerned. The 2 versions could be different sites and they have to be treated as different sites by the engines, at least until they can ascertain that they are the same site. Having said that, Google does seem to sometimes recognise that they are the same site for some things but not for others.

You can add the static pages to your Sitemap file. Google downloads Sitemap files at least twice a day.
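
As a rough sketch of what adding the static pages could look like, the Python below builds one Sitemap file from a single list that mixes static and database-driven URLs, all on one hostname. Everything specific here is an assumption: the page names are hypothetical, `build_sitemap` is an illustrative helper, and the namespace shown is the sitemaps.org 0.9 one, which may differ from the schema your Sitemap utility targets.

```python
import xml.etree.ElementTree as ET

# Assumption: sitemaps.org 0.9 namespace; check what your generator expects.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, paths):
    """Return Sitemap XML listing base_url + each path, one <url> per page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for path in paths:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes '&' in query strings as '&amp;' on output.
        ET.SubElement(url, "loc").text = base_url.rstrip("/") + "/" + path.lstrip("/")
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    pages = [
        "about.html",  # hypothetical static page
        "product_info.php?cPath=33&products_id=254",  # database page
    ]
    print(build_sitemap("http://siteurl.co.uk", pages))
```

Using one base URL for every entry keeps the file consistent with whichever hostname the site settles on.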

If a site's pages are Supplemental, they aren't in the sandbox, as far as I know.

Supplemental pages are not what was described. Pages are put into the Supplemental index for various reasons, one of which is that the pages contain nothing of value. They are basically pages that Google doesn't want showing up in the serps, unless they have little else to show for a particular search.

If all of a site's pages are in the Supplemental index you can pretty much wave goodbye to the site as far as Google is concerned. I saw somebody post that his pages (or some of his pages) came out of the Supplemental index following the submission of a Sitemap file that included the URLs, but I've only seen that said once, and it may have been coincidence. Pages don't often come out of the Supplemental index.

Rob
09-09-2005, 05:16 PM
I've actually had pages come out of supplemental.

They went in when they were new pages and came out over time.

AussieWebmaster
09-09-2005, 06:46 PM
How much "unique" content is there on each page aside from what's in the global navigation?

It shouldn't work for both. Decide on which one to use and use a 301 redirect to that one from the other - sitewide, use only one version, with or without the www.

Marcia, Marcia, Marcia... beat me to that reply!!!

Marcia
09-09-2005, 06:50 PM
It really has nothing to do with pages being new.

Pages are put into the Supplemental index for various reasons, one of which is that the pages contain nothing of value. They are basically pages that Google doesn't want showing up in the serps, unless they have little else to show for a particular search.

Exactly, nicely described. I had about 6 pages on a site several years old go either Supplemental or URL_only - and just managed to pull them out which was kind of an interesting experiment - I sort of didn't expect them to recover.

A couple were seasonal pages from last Fall that just didn't get moved when the site was moved, so they were 404's - but still linked from within the site. Another couple were just obscure forgotten pages with sloppy webmastering, one had just a product with a link without site navigation. In short, they were either missing or crap pages. Since it's a small site and it was only a few pages, it was fairly easy to figure out what the problems were.

I uploaded freshened up seasonal pages (two of them) and doctored up the others to add new, fresh content and value, linked from a site map (just an old-fashioned one on-site, not done through Google) - and all is fine with them now, they're getting some targeted traffic and even some sales.

From a lot of what I've seen, pages that can be seen as duplicates or near-duplicates that have no unique value can go Supplemental, and it appears that most never do come out.

As far as www and non-www goes, there's a site in some search that I watch that's had a mixture of links to both from all over the site, even using both versions in their navigation on the same pages - very sloppy webmastering, with dups all over the place. The vast majority of that big site is in the Supplemental Index and not likely to come out - it shouldn't, it's enough to confuse the heck out of bots.

It is definitely not a good idea to have a site accessible both ways and have all or most of a site's pages have duplicate URLs, not when it's so simple to do a 301 and simply be consistent in linking.

AussieWebmaster
09-09-2005, 06:52 PM
The supplemental pages supposedly only appear in the SERPs when other pages are not available... could never quite figure how they did that but I suppose really tight searches need a result...

PhilC
09-09-2005, 08:40 PM
That's my understanding too. When they can't produce a reasonably sized results set, they dip into the Supplemental index to augment what they have.

Doing it is straightforward though. When they receive a search query, they try to get an acceptable results set (about 40,000 results) from the index that contains the words in link text and page Titles. If they can't get a big enough results set from there, they add to what they have from the index that contains all words. If they still can't get a reasonable number of results, presumably they do the same - add to them from the Supplemental index.

That implies a completely separate index of words, document IDs, and so on, but it may not be so. It may be that the words from Supplemental pages are contained in the first 2 indexes, and flagged as Supplemental.
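
The fallback described above can be sketched as a tiered lookup. This is a toy Python model of the idea as stated in this post, not a description of how Google actually implements it: the three dictionaries, the helper names, and the tiny `target` values in the demo are all hypothetical stand-ins.

```python
def lookup(index, terms):
    """Doc IDs listed under every term in a toy {word: [doc_id, ...]} index."""
    postings = [set(index.get(term, ())) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


def tiered_search(terms, tiers, target=40000):
    """Consult each index in order, stopping once `target` results are found."""
    results, seen = [], set()
    for index in tiers:
        for doc_id in lookup(index, terms):
            if doc_id not in seen:  # a page may appear in more than one tier
                seen.add(doc_id)
                results.append(doc_id)
        if len(results) >= target:
            break  # enough results: later tiers are never consulted
    return results


if __name__ == "__main__":
    titles_and_links = {"skirt": [1]}   # words in link text and page titles
    all_words = {"skirt": [1, 2]}       # words anywhere on the page
    supplemental = {"skirt": [3]}       # last-resort tier
    tiers = [titles_and_links, all_words, supplemental]
    print(tiered_search(["skirt"], tiers, target=2))
    print(tiered_search(["skirt"], tiers, target=3))
```

The alternative mentioned above, where Supplemental entries sit inside the first two indexes with a flag, would replace the third tier with a filter on that flag instead of a separate lookup.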

Marcia
09-10-2005, 12:20 AM
Basically, I believe the Supplemental is for pages that aren't considered of enough value to be worth expending crawl resources on with frequent crawls, which is understandable. Pages I'm seeing in the Supplemental now have cache dates dating back to January & February of this year so that's kind of borne out evidentially.

I do have one single page on a new site that's up just under 3 months that I just noticed is Supplemental. Sure, they obviously know about the page and the content is unique, but there isn't one single link to that page yet from another page on the site or anywhere, including the site navigation. I'm surprised it's been indexed at all, but my guess is that it'll get newly crawled and move into the regular index when I update the navigation and include it in the links.

But that's a small site with individually hand rolled pages, not the type that generally runs into duplicate content problems. IMHO the problem with template and a lot of ecom site pages is probably that they're too close to being replicas of each other, with little that's unique enough to warrant frequent crawling.

PhilC
09-10-2005, 09:20 AM
I have a similar page, Marcia. It's a site that I started to make many months ago and forgot about. Its home page isn't linked to from anywhere except from itself (Y! shows that), and nobody knew about it except me. There are no other pages in the site and somehow the home page ranks in the 30s in Yahoo! for a halfway decent search term, and Google also has it in Supplemental. I've no idea how either engine came by it. One can't have got it from the other, and I don't believe that Google uses the Toolbar to acquire URLs, although I may be wrong about that.

The fact that it's an orphan is probably the reason that it's Supplemental.

martinuboo
09-10-2005, 10:31 AM
Google says in their Webmaster FAQs:
Why is my site labeled "Supplemental"?

Supplemental sites are part of Google's auxiliary index. We're able to place fewer restraints on sites that we crawl for this supplemental index than we do on sites that are crawled for our main index. For example, the number of parameters in a URL might exclude a site from being crawled for inclusion in our main index; however, it could still be crawled and added to our supplemental index.

The index in which a site is included is completely automated; there's no way for you to select or change the index in which your site appears. Please be assured that the index in which a site is included does not affect its PageRank.

I have seen pages in the Supplemental Index for various reasons (I think I know what most of them are). Marcia and Phil, I think you have touched on many of the reasons. What I would really like to hear about is success stories of pages resurrected from "Supplemental Hell"!

I also read the post about G's Sitemap rescuing pages, but that is all I have heard on that. I have a site that is well ranked and everything is fine, except for some pages, the only ones that use ?id=xxxxxxx for very different versions (totally unique other than basic structure). All but one of the ?id= pages are in the Supplemental Index.

Google has said they don't like &id=, but I am wondering if ?id= might not be liked either.

I am going to try the G Sitemap and see if they can be rescued. Otherwise, the only fix I have heard of that works is to 404 the existing pages and then create new ones with friendly page names.

Thoughts? Ideas?

Thanks.

martin

PS. Sorry if this is too far off the original topic, but I think it's relevant

PhilC
09-10-2005, 11:24 AM
It doesn't matter if the id= is preceded by a ? or & as long as it is a parameter in the URL. Also, I believe that they are ok with id= as long as it isn't followed by too big a number. It's anything that could be a session ID that they try to avoid.
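
To make that rule of thumb concrete, here is a small Python sketch that flags query parameters which could be mistaken for a session ID. It is only an illustration of the reasoning in this post: the cutoff `MAX_ID_LENGTH` is a made-up number, `risky_params` is a hypothetical helper, and nothing here reflects a threshold Google has published.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical cutoff: values longer than this look like session IDs.
MAX_ID_LENGTH = 5


def risky_params(url):
    """Names of id-like parameters whose values are suspiciously long."""
    flagged = []
    # parse_qsl treats the parameter the same whether it followed '?' or '&'.
    for name, value in parse_qsl(urlsplit(url).query):
        if name.lower().endswith("id") and len(value) > MAX_ID_LENGTH:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    print(risky_params("http://siteurl.co.uk/product_info.php?cPath=33&products_id=254"))
    print(risky_params("http://example.com/page.php?id=1234567"))
```

Under this assumed cutoff a short product ID passes while a seven-character id value is flagged, which matches the pattern described in the two preceding posts.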

Jupiter
09-12-2005, 08:16 AM
How much "unique" content is there on each page aside from what's in the global navigation?

Not very much since this is a commercial site. A brief product description and a product photo, much like every other oscommerce site out there, but since the site is selling clothes and accessories the product descriptions are no more than a line long (e.g. Black skirt with white stripe with side zip opening, length 80cm).

Needless to say that each page has a unique title and description tag - albeit very similar: title=buy keyword at bla bla bla, description=buy keyword at bla bla bla. We also sell bla bla bla - and a set of global keywords that includes the product of each page. All these are automatically generated and place the name of the product in predefined places, but the whole thing makes perfect sense in the end (i.e. buy keyword at....). Could that be the problem?

It shouldn't work for both. Decide on which one to use and use a 301 redirect to that one from the other - sitewide, use only one version, with or without the www.

Will do. Sitewide, only one version is used (without www), consistently and without exceptions.

I have submitted the site to several directories, and it has already been listed in a few. However, I submitted the www url. Will keeping the plain url (without the www) and adding a 301 redirect to the www url render those links useless, or will they still count?

Most pages have a few variables in the url (example from a product page: http://siteurl.co.uk/product_info.php?cPath=33&products_id=254). Could that be the problem? It shouldn't according to other posts in this and other forums.

Thanks for the help