
how to lead spiders to orphan pages?


rdthoms
06-14-2005, 11:36 PM
I'm designing a website that in its simplest form looks like this:

www.company.com/main.php
www.company.com/1/index.html
www.company.com/2/index.html
...
www.company.com/10000/index.html

There are NO links to 1, 2, etc. pages on main.php. I guess that would make them "orphan" pages.

Instead main.php has a form that asks the user for a zipcode and then (via database lookup, etc) directs them to the appropriate 1, 2, etc page(s).

So the problem is that I'm afraid a spider would never find the 1, 2, 3, etc. pages (since it would never submit a form). The 1/index.html page looks like static content, so I'm not worried about what the spider will do when it gets there. I just want to make sure it can find it.

I've been reading on this forum (great place by the way) and as I understand it the only two choices I have are:

1) create www.company.com/sitemap.html (containing links to ALL the static pages) with a link to sitemap.html on main.php.

or

2) create spiderbites.company.com/index.html (containing links to ALL the static pages) and point search engine spiders to this location

Option #1 seems like the logical thing to do but I have two issues with it. First, I don't necessarily want to post a sitemap early on that only shows a few of these 1/2/3 pages (makes the site look too small). Secondly, when the site gets huge (cross your fingers), won't having 10,000 links on the sitemap be too much?

Option #2 somewhat "hides" the sitemap from direct view by humans but seems like it could be considered SE spam.

Which option do you consider the best?
Are there any other options?

Thanks,
Richard

seomike
06-15-2005, 10:30 AM
Obviously a page with 10,000 links will probably not even load. I've experimented with large link structures and found that it works to chain the pages together, so that every page has 3 links: one to the next page, one to the previous page, and one to the home page.

Basically like this if you were on page 2

<< Page 1 | Home | Page 3 >>

Is there a way to group these pages? Sounds like your site is viral (in the sense that it will grow and grow). If this is the case, why not do an archive by month? That way you can do a site map but break it up so that it's not a 10,000 link browser crasher.
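The previous / home / next chain described above could be generated along these lines. This is only a sketch in Python: the function names chain_links and render_nav are made up for illustration, and the URL layout (/N/index.html and /main.php) is assumed from the first post.

```python
from html import escape


def chain_links(page, last_page, base="/"):
    """Return (label, href) pairs for the links shown on one numbered page.

    The first page has no "previous" link and the last page has no "next"
    link, so the chain simply stops at either end.
    """
    links = []
    if page > 1:
        links.append((f"<< Page {page - 1}", f"{base}{page - 1}/index.html"))
    links.append(("Home", f"{base}main.php"))
    if page < last_page:
        links.append((f"Page {page + 1} >>", f"{base}{page + 1}/index.html"))
    return links


def render_nav(page, last_page):
    """Render the chain as one line of HTML anchors separated by ' | '."""
    return " | ".join(
        f'<a href="{escape(href)}">{escape(label)}</a>'
        for label, href in chain_links(page, last_page)
    )
```

Calling chain_links(2, 10) gives the three links from the "Page 2" example: Page 1, Home, and Page 3.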

rdthoms
06-15-2005, 02:41 PM
Thanks for the advice.

Here is the idea behind the website.

Folks pay $x to have their own custom page. I would like those pages to be indexed but I don't want someone to see how many pages exist because at the beginning there may only be 10 or 20 and new subscribers would think the service is too small to consider. Those pages hopefully stay active forever and hopefully 10 becomes 100 becomes 1000 ... I move to the beach and so on. Once there are 1000+ pages it then becomes advantageous for the potential subscriber to see how BIG the site is. Then I can do an intelligent grouping in a sitemap and show it off to people and spiders (while trying to avoid choking the spiders).
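One possible way to do that grouping is to split the full list of pages into fixed-size sitemap pages. The sketch below is Python, and both the function name paginate_sitemap and the figure of 100 links per sitemap page are assumptions for illustration, not a known search engine limit. Grouping by month, as suggested earlier in the thread, would follow the same pattern with a different grouping key.

```python
def paginate_sitemap(page_ids, per_page=100):
    """Split a list of page IDs into consecutive chunks of at most per_page.

    Each chunk would become one sitemap page, and a small index page
    would link to every chunk.
    """
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    return [page_ids[i:i + per_page] for i in range(0, len(page_ids), per_page)]


# Hypothetical example: 10,000 listing pages numbered 1..10000.
chunks = paginate_sitemap(list(range(1, 10001)), per_page=100)
```

With 10,000 pages and 100 links per sitemap page, that works out to 100 sitemap pages plus one index page listing them.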

In the meantime I want the pages to be indexed so search engines will find some page on the site (any page, just to get users to land on the site). I figure the more pages that get indexed, the more chance that people will happen upon the site and then consider subscribing themselves.... the viral idea you talked about.

Your idea of the three short links is good. There's no reason a person would want to see the "next" page because it may be from a totally unrelated user etc., but as long as the spiders follow it, it should work. Then I only have to link a few "sample pages" on the sitemap and the spiders take over and do the crawling.
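Whether a few sample links really are enough can be checked on paper before building anything. The Python sketch below models the site as a link graph (main page, a sitemap with a few sample pages, and the previous / home / next chain on every numbered page) and walks it breadth-first the way a link-following spider might. The names build_links and reachable are hypothetical, and real spiders may stop early or skip links, so this only shows that a path exists.

```python
from collections import deque


def build_links(last_page, sample_pages):
    """Model the site: main -> sitemap -> sample pages, plus the page chain."""
    links = {"main": ["sitemap"], "sitemap": list(sample_pages)}
    for n in range(1, last_page + 1):
        out = ["main"]
        if n > 1:
            out.append(n - 1)   # "previous" link
        if n < last_page:
            out.append(n + 1)   # "next" link
        links[n] = out
    return links


def reachable(links, start="main"):
    """Return every page a link-following crawler can reach from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in links.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen
```

In this model, a single sample page on the sitemap is enough to reach every numbered page, while a sitemap with no sample pages leaves all of them orphaned.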

Thanks for the tip.

Any other ideas out there?

-Richard

Scottie
06-15-2005, 05:49 PM
If you have another site indexed, maybe it would make sense to link to those pages from it and not the primary site?

In other words, a parent company site that already exists and proudly refers to the build-your-own-page site and includes a sitemap to the existing pages there.

seomike
06-15-2005, 08:19 PM
What are these pages used for? rss? dedicated linking?

rdthoms
06-16-2005, 01:32 AM
Scottie,

I think your idea is the same as option #2 in the original post.

SeoMike,

The concept is kinda like a directory listing. Theoretically it could be a one-page site with 10,000 listings buried in the database searchable by say a zipcode. But these pages would never be seen by a spider because it would never submit the search form. I knew that if there were 10,000 listings then it would be best to make it look more like a 10,000 page site (figuring search engines would have a decent chance of finding 1 of the 10,000 pages to get people into the site)

I did not learn about mod_rewrite until I found this forum, so instead I have a directory structure like 1/index.php 2/index.php 3/index.php. Each of these index.php files is exactly the same (just a link to a single source file). The index.php senses its directory (1, 2, 3, etc.) and uses that to query the database and then builds the dynamic page with the specific content.

I definitely like the idea that my users can tell someone to visit company.com/1 to see their webpage. I'm going to investigate mod_rewrite to see if that is a better system, as it should be able to do the same thing.
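Either way, the core step is turning a URL path such as /1 or /1/index.php into a listing number for the database lookup. The sketch below shows that mapping in Python with a regular expression. It is a stand-in for illustration only, not mod_rewrite configuration and not the actual index.php, and the name listing_id_from_path is made up.

```python
import re

# Matches "/1", "/1/", "/1/index.php" and "/1/index.html", capturing the number.
LISTING_PATH = re.compile(r"^/(\d+)(?:/(?:index\.(?:php|html))?)?$")


def listing_id_from_path(path):
    """Return the listing number encoded in the path, or None if there is none."""
    match = LISTING_PATH.match(path)
    return int(match.group(1)) if match else None
```

A path like /main.php has no leading number, so it returns None and can be handled as an ordinary page.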

Although my site has nothing to do with vacation rentals, I got the site idea from sites like www.vrbo.com and www.cyberrentals.com, but there it is a classic link structure. Nice if you have 30,000 pages but looks pretty pathetic at startup when you have < 100, hence the desire to "hide" the sitemap. But there should be decent content on these 100 pages so I would really like to get them into the indexes.

Somewhere else on this forum I'm going to start a thread about how to bootstrap a site like this. Think of the vacation sites I mentioned above. How do you get property owners to list if nobody knows about the site? How do you get people to find the site if there are not many listings? Chicken-and-egg problem!

PhilC
06-26-2005, 06:37 PM
The chicken and egg situation for that type of site can be sorted by adding a stack of some sort of free listings, but that's another topic.

The idea of laying trails for the spiders to follow is excellent - it's how I've had many tens of thousands of pages that are normally behind forms on different sites indexed by Google.

Have you thought of creating an actual topic-based directory within the site? Many people use that kind of directory these days just for their link exchange section. You could even populate it with all sorts of additional stuff so that it doesn't look too empty.