Limit on Special Characters?
Kristina
06-18-2004, 05:58 PM
Hello there!
Great forum :) First post...
Is there a limit on what the search engine spiders will crawl in dynamic URLs? For instance, will they only crawl up to a certain number of special characters (such as "/")?
I don't recall reading about one but I was asked this question today and wanted to confirm it.
Thanks!
seobook
06-18-2004, 09:48 PM
With dynamic URLs, spiders tend to crawl more if you have greater inbound link popularity.
Most of what I have been told is to limit URLs to three or fewer parameters and to keep each parameter under 10 characters.
Spiders are getting better at indexing content every day, though.
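As a rough illustration of that rule of thumb, here is a minimal Python sketch that checks a URL's query string against it. The function name and constants are hypothetical, and applying the 10-character limit to the parameter name and value separately is an assumption; the advice does not say which is meant. These are guidelines passed along in this thread, not documented search engine limits.

```python
from urllib.parse import urlsplit, parse_qsl

MAX_PARAMS = 3       # "3 or less parameters" (rule of thumb from the thread)
MAX_PARAM_LEN = 10   # "under 10 characters each"; applied to name and value separately (assumption)

def check_dynamic_url(url):
    """Return a list of rule-of-thumb warnings for a URL's query string."""
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    warnings = []
    if len(params) > MAX_PARAMS:
        warnings.append(f"{len(params)} parameters (more than {MAX_PARAMS})")
    for name, value in params:
        # "under 10" means a length of 10 or more breaks the guideline
        if len(name) >= MAX_PARAM_LEN:
            warnings.append(f"parameter name '{name}' is {len(name)} characters")
        if len(value) >= MAX_PARAM_LEN:
            warnings.append(f"value of '{name}' is {len(value)} characters")
    return warnings
```

An empty list means the URL stays within the guideline; anything else is only a hint that the URL may be worth simplifying.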
Kristina
06-18-2004, 09:53 PM
That's what I have found... We are using ASP.NET, and back in Feb. we finally did an in-house rewrite of the script (what is the official word?? lol) where we changed all the question marks (?) into slashes (/).
There was a discussion in which people said they wished the URLs of the dynamically generated pages, each with its own unique content, were shorter.
If the generated URL is a rewrite anyway, I don't think there is any special limit on the length the spiders would read, is there?
Ex:
http://www.MyDomain.com/FolderHere/Default.aspx/PageID=12345
Or:
http://www.MyDomain.com/Page.aspx/Page=/FolderHere/Default.aspx/PageID=12345
So there is no LIMIT on the number of characters a spider will crawl? :) Seems like a silly question to me now, but I just wanted to confirm.
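For readers who want to see what that kind of character swap looks like, here is a minimal Python sketch of the transformation only; it is not the poster's actual ASP.NET code. The function name is hypothetical, and replacing "&" with "/" as well is an assumption, since the post only mentions question marks.

```python
from urllib.parse import urlsplit, urlunsplit

def query_to_path(url):
    """Fold a query string into the path, e.g. page.aspx?a=1 -> page.aspx/a=1."""
    parts = urlsplit(url)
    if not parts.query:
        return url  # nothing to rewrite
    # "?" becomes "/"; treating "&" the same way is an assumption
    path = parts.path.rstrip("/") + "/" + parts.query.replace("&", "/")
    return urlunsplit((parts.scheme, parts.netloc, path, "", parts.fragment))

print(query_to_path("http://www.MyDomain.com/FolderHere/Default.aspx?PageID=12345"))
```

The server still has to map the slash form back to the original parameters before the page script runs; that reverse step is not shown here.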
rustybrick
06-19-2004, 10:52 PM
I do not believe Google or other engines will have problems with long URLs as long as they do not contain many dynamic parameters (?, =, &, % etc.).
pageoneresults
06-20-2004, 01:01 AM
The general consensus is that you should do a full rewrite on your URIs. Don't just replace characters; build the URI paths so they are user friendly. Eliminate variables where possible and yes, shorter paths are always best.
Google was the first on the scene to spider dynamic content. Unfortunately, much of that content is invisible. Even though it has been spidered, it still lacks the necessary elements for high placement. URI strings with IDs are a no-no. Actually, URI strings that are not pure should be avoided. When I say pure, I mean void of all special characters.
www.example.com/category/
www.example.com/category/product/
www.example.com/category/product/details/
No query strings, no IDs, no =, no nothing but pure text URI strings. If you have the ability to do Content Negotiation take it to that level. Eliminating the file extensions is just one more step in the overall URI rewrite process.
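One way to picture a full rewrite like this is a lookup from the clean path to whatever internal page actually serves it. The Python sketch below is only an illustration under stated assumptions: the table, the page IDs, and the internal `/Default.aspx?PageID=` form (borrowed from the example earlier in the thread) are all hypothetical, and a real site would normally do this in the web server or framework routing layer.

```python
# Hypothetical lookup table: clean path segments -> internal page ID
PAGE_IDS = {
    ("category",): 100,
    ("category", "product"): 12345,
    ("category", "product", "details"): 12346,
}

def resolve_clean_path(path):
    """Map a clean, text-only path to an internal dynamic URL, or None if unknown."""
    segments = tuple(s for s in path.strip("/").split("/") if s)
    page_id = PAGE_IDS.get(segments)
    if page_id is None:
        return None  # caller would serve a 404 here
    # The visitor and the spider only ever see the clean path;
    # the ID-based URL stays internal to the server.
    return f"/Default.aspx?PageID={page_id}"

print(resolve_clean_path("/category/product/"))
```

The point of the design is that IDs and query strings never appear in a link, so the public URI stays stable even if the internal script changes.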
Mikkel deMib Svendsen
06-20-2004, 07:45 AM
There is a maximum URL length, but in most cases you won't reach it. For XML feeds I think it's 256 characters, and I do not think Google handles more than 1024.
However, the problem you usually have with dynamic websites, as pointed out by others too, is the number of parameters. In my experience we get perfect indexing with just one parameter. With two parameters over 50% gets indexed, and with three or more it's pure luck.
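To see where a site stands against those observations, one could tally its URLs by parameter count and flag any that exceed the length mentioned. This is a minimal Python sketch with hypothetical names; the 1024 figure is the poster's own estimate, not a documented limit.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl

MAX_URL_LENGTH = 1024  # the poster's estimate for Google, not a documented limit

def summarize_urls(urls):
    """Return (Counter of URLs per parameter count, list of over-length URLs)."""
    by_param_count = Counter()
    too_long = []
    for url in urls:
        n_params = len(parse_qsl(urlsplit(url).query, keep_blank_values=True))
        by_param_count[n_params] += 1
        if len(url) > MAX_URL_LENGTH:
            too_long.append(url)
    return by_param_count, too_long
```

Comparing the tally with which pages actually show up in the index would be one way to test the one, two, and three-parameter pattern on your own site.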
Kristina
06-21-2004, 01:13 AM
Thanks so much for the different answers. I appreciate it!
We were going back and forth about different URLs that are being indexed and receive PR, and how they pass it along. Then the question came up of whether shorter URLs would help pages get indexed... but if the URLs are rewritten and there are no parameters in them, then it doesn't matter.
Also, I've been coming across different directories that show a link... but will pop you to a different page using JavaScript. It's not in any way cloaking... but it may be that they are using a single page that Google found... and then it uses a bit of JavaScript to pop to an iframe so it doesn't end up an orphan page. It's interesting how these single pages are indexed and show PR... yet the actual page that is shown to the end user doesn't show PR, so it doesn't seem search engine friendly to them...
Mikkel deMib Svendsen
06-21-2004, 08:39 PM
I have found it very important to settle on an architecture and then use it for as long as possible. Find out what works best for you and stick to it. If you make changes too often you confuse the engines and lose important inbound links, in my experience.