Spider-friendly URL indexing
I develop dynamic pages (in .NET) that do not use query strings in the URL by nature of the way it's coded, and not by use of any URL rewrite apps.
What I'm wondering is whether this poses a problem: even though these URLs are spider-friendly (by not having query strings), would they confuse SEs because the dynamic page would be different every time it's indexed?
From my understanding, when you use a URL rewrite, it fakes a static URL that actually does pass in the query parameters and points to the right page.
But since I don't have any query parameters in my URLs in the first place, if an SE indexed a page from my site (say a product page in a shopping gallery), that URL would always be 'shopping_gallery.aspx', though if you went to that page directly, it would just bring you to the storefront or beginning of the gallery.
Would I get penalized by SEs for this?
If so, does anyone know of some workarounds?
Thanks for your help, I'm finding these forums very useful!
Nick W
07-15-2004, 12:46 PM
Shouldn't think you'd get penalized in any way. But you won't get indexed properly.
It's like you're using a framed site, the SE just sees one page, right? It'll only index one page.
You need to re-think the way your site works. Individual pages are what SEs like ;)
Nick
rogerd
07-15-2004, 02:45 PM
Nick_W is right, JonS - you need unique URLs for your products and product groupings, if any. Otherwise, you'll get one page indexed, probably with the generic store welcome content.
Mikkel deMib Svendsen
07-19-2004, 08:13 AM
It sounds like you use the GET method to access those products - if you do so, remember search engines do not submit forms. Another explanation might be that you are using session cookies. Try to surf your site with cookies off; if it doesn't work, it won't work for spiders either.
You do need to be able to produce URLs that stay the same for every visit. How would users bookmark a page? How would search engines? :)
well, it's not exactly the get method but you're right-- I am essentially posting the form multiple times to get to a certain page. I do have a built in method to pass a single query parameter to get to a page so users could bookmark it, but that's only if they want to save that link (it's not fully built into the general navigation of the site yet).
Guess I'd just have to think of a compromise or a workaround that won't give me a headache ;)
rogerd
07-19-2004, 11:27 AM
Jon, one quick fix (if I understand what you are doing) would be to create a pseudo-navigation structure and/or site map with your single-parameter queries. That's not really optimal, but you really need to give the spiders unique and consistent URLs for products and product groups (if you have them).
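A minimal sketch of the site-map idea above, in Python for illustration only (the site in question runs on .NET). The `example.com` host, the parameter name `product`, and the helper names are assumptions, not anything from the actual site.

```python
from html import escape
from urllib.parse import urlencode


def sitemap_links(base_url, param, product_ids):
    """Return one stable, single-parameter URL per product."""
    # Sorting keeps the list identical from one crawl to the next.
    return [f"{base_url}?{urlencode({param: pid})}" for pid in sorted(product_ids)]


def render_sitemap(links):
    """Render plain <a> links that a spider can follow without submitting a form."""
    items = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(url)}</a></li>'
        for url in links
    )
    return f"<ul>\n{items}\n</ul>"


# Hypothetical product IDs; a real site would read them from its catalogue.
links = sitemap_links("http://example.com/shopping_gallery.aspx", "product", [1672, 58, 901])
print(render_sitemap(links))
```

The point is only that every product gets exactly one URL that is the same on every visit and is reachable through an ordinary link.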
seomike
07-19-2004, 04:37 PM
When engines come to the same page and it's different every time, wouldn't that trigger a cloak penalty? Or put a red flag up for someone to look at?
rogerd
07-19-2004, 04:49 PM
...wouldn't that trigger a cloak penalty? Or put a red flag up for someone to look at?
Some dynamic pages change each time they are spidered, ranging from minor changes like date/time to bigger changes (headlines, blog feeds, last 10 posts, etc.). I wouldn't expect a penalty or a flag for a manual check, I'd just expect the ever-changing page to be poorly ranked.
If one can provide the spider with a list of unique URLs, e.g., onepage.php?product=1234 (or, even better, /onepage/product/1234.htm), the spider shouldn't really perceive that there's a constantly changing single page - it should see each URL as a unique page that is the same (more or less) each time it is checked.
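To make the two URL forms above concrete, here is a rough Python sketch of the mapping between them. The path layout `/onepage/product/<id>.htm` is taken from the example above; the function names and the numeric-ID assumption are illustrative only.

```python
import re

# Assumed URL scheme: /onepage/product/<numeric id>.htm
PRODUCT_PATH = re.compile(r"^/onepage/product/(\d+)\.htm$")


def to_dynamic(path):
    """Translate a static-looking path into the internal query-string URL."""
    match = PRODUCT_PATH.match(path)
    if match is None:
        return None  # not a product URL; leave it to normal handling
    return f"/onepage.php?product={match.group(1)}"


def to_static(product_id):
    """Build the static-looking URL to use in links and site maps."""
    return f"/onepage/product/{int(product_id)}.htm"


print(to_dynamic("/onepage/product/1234.htm"))
print(to_static(1234))
```

Each product ID maps to exactly one path and back again, which is what lets a spider treat each URL as its own page.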
Mikkel deMib Svendsen
07-20-2004, 03:55 AM
When engines come to the same page and it's different every time, wouldn't that trigger a cloak penalty? Or put a red flag up for someone to look at?
The problem seems to be the opposite - not that the content changes, but that the URLs do, so the URL spiders come back to is not there anymore. That is definitely not good, but it will never be detected as cloaking. However, with ever-changing URLs it will be impossible to keep steady indexing and completely impossible to build any kind of valuable linkpop.
Thanks for all the suggestions...
not to totally go in the opposite direction, here's another thing I was curious about: when you DO have query strings in your url, is there a rule-of-thumb on how many parameters are too much, and how long each should be at the most?
I've got a page where the url will often look like this:
StoreDetail.aspx?s=58&p=1&i=1672
From what I've read, some people would say that's too much, some would say I'm safe. It's definitely not an IDEAL URL, but will it get penalized?
This could probably be a whole 'nother thread =)
Mikkel deMib Svendsen
07-20-2004, 11:14 AM
As a good rule of thumb:
1 parameter: We usually do not see any problems with indexing.
2 parameters: With two parameters it's about 50/50.
3 or more parameters: With three or more it's pure luck! :)
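The rule of thumb above can be written as a small check. This Python sketch only counts query parameters and attaches the labels from the post; the function name is made up, and the thresholds are the poster's informal 2004 heuristic, not anything documented by a search engine.

```python
from urllib.parse import parse_qsl, urlsplit


def indexing_outlook(url):
    """Count query parameters and return (count, rule-of-thumb label)."""
    count = len(parse_qsl(urlsplit(url).query, keep_blank_values=True))
    if count == 0:
        label = "no query string"
    elif count == 1:
        label = "usually no problems"
    elif count == 2:
        label = "about 50/50"
    else:
        label = "pure luck"
    return count, label


print(indexing_outlook("http://example.com/StoreDetail.aspx?s=58&p=1&i=1672"))
```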
rogerd
07-20-2004, 12:15 PM
will it get penalized?
There's no penalty for dynamic URLs. However, some people argue that dynamic URLs tend not to perform as well as equivalently linked static pages. They may have lower toolbar PR, and they may (or may not) rank a bit lower. I've seen plenty of well-ranked dynamic pages, though, and it's often hard to compare real-world performance of static vs. dynamic since things are rarely equal. On one site I work with, dynamic pages seem to outperform static pages in Google, though in other SEs the static ones do better. That Google performance doesn't seem typical, though.
The other issue is the number of query parameters (& perhaps URL length). The general thinking on this is that fewer parameters are better, with one being optimal (after zero, of course). Three short ones as you illustrate may be OK, but I certainly wouldn't go longer. I'd also try to present the parameters in the same order each time. Overall, I'd still lean toward rewriting in static format so that I wouldn't have to worry if three parameters was pushing the limit of good indexing/ranking. There are SEs other than Google, and not all seem as good at deciphering dynamic URLs.
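One way to keep the parameter order consistent is to build every link through a single helper that sorts the parameters. A minimal Python sketch follows; the `example.com` host and the alphabetical ordering are assumptions, and any fixed order would do as long as it never varies.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url):
    """Return the URL with its query parameters in a fixed (alphabetical) order."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Two orderings of the same parameters collapse to a single URL.
a = canonical_url("http://example.com/StoreDetail.aspx?s=58&p=1&i=1672")
b = canonical_url("http://example.com/StoreDetail.aspx?i=1672&s=58&p=1")
print(a)
print(a == b)
```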
Nick W
07-20-2004, 01:05 PM
Hmmmm.... my post seems to have run away?........
Let me see if I can recall what I'd said (although I'll now be repeating the good rogerd lol!)...
Right, let's get something straight ;)
You will NOT get penalized for dynamic URLs. The SEs have a whole host of issues they like to terrify us poor webmasters with, but this ain't one of them, so relax ;-)
Ahhh... Now I feel better! Don't you just hate it when you go back to look at a thread you're involved in and find you've mucked up your last post? (or lost it entirely... hehehe)..
Moving On....
Is there any way in IIS/.NET you can include your parameters in the url a different way? In Apache/PHP all I do is write this:
example.com?a=34&b=456&c=789
into this:
example.com/34/456/789
but it could be done like this:
example.com/34.456.789.html
Anyway, check it out, ask around, if it is possible then it'd be a good way to do it, believe me...
Nick
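The translation Nick describes (query parameters on one side, path segments on the other) can be sketched roughly as follows. This is Python for illustration, not the Apache/PHP setup mentioned in the post; the parameter names `a`, `b`, `c` come from the example URL, and the all-numeric check is an added assumption.

```python
from urllib.parse import urlencode

# The position in the path decides which parameter a value belongs to.
PARAM_NAMES = ("a", "b", "c")


def path_to_query(path):
    """Turn '/34/456/789' into '?a=34&b=456&c=789', or None if it does not fit."""
    segments = [s for s in path.strip("/").split("/") if s]
    if len(segments) != len(PARAM_NAMES) or not all(s.isdigit() for s in segments):
        return None
    return "?" + urlencode(list(zip(PARAM_NAMES, segments)))


def query_to_path(params):
    """Turn {'a': 34, 'b': 456, 'c': 789} into '/34/456/789' for use in links."""
    return "/" + "/".join(str(params[name]) for name in PARAM_NAMES)


print(path_to_query("/34/456/789"))
print(query_to_path({"a": 34, "b": 456, "c": 789}))
```

Because the order of the segments is fixed, each combination of values has exactly one URL.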
rogerd
07-22-2004, 12:53 PM
Is there any way in IIS/.NET you can include your parameters in the url a different way?
Nick W, for IIS you'll need a third-party rewrite program - ISAPI Rewrite seems to be the most popular. You can also do a rather kludgy thing with custom error pages and ASP - this could be useful if you can't install a server-level program like ISAPI Rewrite. In a nutshell, you link to a nonexistent static page, like example.com/Param1/Param2.html, and your custom error page interprets that by parsing the URL into the desired parameters and delivering the appropriate dynamic page. The code also has to return the correct server headers so that it's all completely transparent, i.e., visitors and spiders just see the static-looking URL and don't get any error codes. Server-level rewriting is almost certainly easier if you can do it, and it shouldn't play havoc with your logs.
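The logic of the custom-error-page approach can be sketched like this. It is a Python illustration of the idea only, not ASP or IIS code; the two-segment URL pattern follows the example above, and `render_product` is a hypothetical stand-in for the real dynamic page.

```python
import re
from html import escape

# Assumed URL shape from the example: /Param1/Param2.html
STATIC_PATTERN = re.compile(r"^/([^/]+)/([^/.]+)\.html$")


def render_product(param1, param2):
    """Hypothetical stand-in for whatever builds the real dynamic page."""
    return f"<h1>Product {escape(param2)} in section {escape(param1)}</h1>"


def handle_missing_page(requested_path):
    """Run in place of the default 404 page; returns (status code, body)."""
    match = STATIC_PATTERN.match(requested_path)
    if match is None:
        return 404, "Not Found"  # a genuinely missing page stays a 404
    param1, param2 = match.groups()
    # Answer with 200 so visitors and spiders never see an error code.
    return 200, render_product(param1, param2)


print(handle_missing_page("/Param1/Param2.html"))
print(handle_missing_page("/no-such-page"))
```

The status code is the part that is easy to get wrong: if the handler still sends a 404 along with the content, spiders will treat the page as missing.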
Mikkel deMib Svendsen
07-22-2004, 12:55 PM
I very much agree that the "404-trick" is not the best way to go. Actually, it's a long way down on my list of possible indexing solutions :)
Marcia
07-27-2004, 03:45 AM
I'm looking at an .asp site now that's got some isapi filter operating that changes the pages to static looking .htm pages. However, for each page there are multiple URLs being displayed - and indexed. There are many hundreds of pages indexed, well over a thousand, and the vast majority of the site is in the Supplemental Index, not the regular index. It's apparently been hit for duplicate content. Not a cloaking penalty by any means, but a black mark nevertheless.
Even if you travel up the breadcrumb navigation from within the different departments and product sections, you'll get a different URL for the same pages each time. Even the homepage has many, many URLs showing for the same content.
I'm really concerned about this and how to approach it, and wondering if there's a permanent mark against the site. Is there a need to exclude all but the few simple root-level pages from bots, or would proper application of the ISAPI rewrite filter do the job properly?
Mikkel deMib Svendsen
07-27-2004, 04:56 AM
Marcia, it does indeed sound like a bad implementation of the filter. It looks like there is some kind of session- or user-dependent variable (e.g. a session ID) that is being translated by the filter too.
This is a good example of why you have to be very careful when you implement a URL-rewrite filter. If you do not fix other problems on your site, e.g. duplicate pages, before you URL-convert, you will just end up with a static version of all your problems and that won't do any good.
You will have to go back and find out what variable is causing this problem. With the current conversion you will most likely run into indexing problems.
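As a rough way to hunt for the offending variable, one could take the underlying dynamic URLs, strip the suspected session parameter, and see which URLs collapse into one. A Python sketch follows; the parameter names `sid` and `sessionid`, the `example.com` URLs, and the assumption that the variable is visible as a query parameter before rewriting are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical names for the suspected session variable.
SESSION_PARAMS = {"sid", "sessionid"}


def strip_session(url):
    """Return the URL without any suspected session parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def find_duplicates(urls):
    """Group URLs that differ only by session parameters."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_session(url)].append(url)
    return {canon: variants for canon, variants in groups.items() if len(variants) > 1}


crawled = [
    "http://example.com/page.asp?id=5&sid=abc",
    "http://example.com/page.asp?id=5&sid=xyz",
    "http://example.com/page.asp?id=6&sid=abc",
]
print(find_duplicates(crawled))
```

If many indexed URLs fall into the same group, that parameter is a likely cause of the duplicates and should be kept out of the rewritten URLs.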