Google and htaccess


leeparts
12-24-2005, 10:49 AM
Hello All,
I have a question for you all. We recently (yesterday) added rewrite rules to our htaccess file. We had long PHP URLs and wanted to make them search engine friendly. Yahoo and MSN instantly started crawling the new URLs. I have searched my code and the site at least three times over. Google is still crawling the old URLs, and I cannot find a spot on the site where it can follow the old URLs.

Here is an example of an old url:
http://www.leeparts.com/index.php?area=1&itemid=626&rootcatid=27&catid=6

Here is a new url:
http://www.leeparts.com/item-1-27-6-626.html

Now even if Google crawled one of the old urls, the new urls take over with any links on that page. Does anyone know if Google caches the htaccess file until it is done crawling and will use the new htaccess file next time it comes by?
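
For readers following along, the mapping between the two example URLs can be sketched in Python rather than in mod_rewrite syntax. This is only an illustration: the field order (area, rootcatid, catid, itemid) is inferred from the single example pair above, and the function name `friendly_to_internal` is hypothetical, not something from the site's actual htaccess file.

```python
import re
from urllib.parse import urlencode

# Assumed pattern: /item-<area>-<rootcatid>-<catid>-<itemid>.html
# The field order is inferred from the one example pair in the post.
FRIENDLY_ITEM = re.compile(r"^/item-(\d+)-(\d+)-(\d+)-(\d+)\.html$")


def friendly_to_internal(path):
    """Map a friendly item path to the internal index.php URL, or None."""
    match = FRIENDLY_ITEM.match(path)
    if match is None:
        return None
    area, rootcatid, catid, itemid = match.groups()
    # Rebuild the query string in the same order as the old URL.
    query = urlencode([
        ("area", area),
        ("itemid", itemid),
        ("rootcatid", rootcatid),
        ("catid", catid),
    ])
    return "/index.php?" + query


print(friendly_to_internal("/item-1-27-6-626.html"))
# prints /index.php?area=1&itemid=626&rootcatid=27&catid=6
```

Note that a rewrite like this only makes the new URL work. It does nothing to the old URL, which keeps answering as before.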

davelms
12-24-2005, 06:49 PM
What Google asks for is simply a) pages that it knows about already and b) new pages it detects (and a subset of b being old pages it has been told are now elsewhere). With that in mind...

1. htaccess file is server-side only and cannot be requested, so Google cannot ask for it, see it, hence neither can it ever cache it.

2. Your old URL is still returning an HTTP 200 OK. Since it is already in the search engines' indexes and technically still exists, they will continue to request that page/URL unless you do something about it. If you want Google to stop requesting the pages it already knows about, you need to serve a redirect (e.g. 301) from the old URL to the new URL, serve a not found (404) on the old URL, block the old URLs in robots.txt, or use some similar technique. Whether you *should* do any of these is up to your wants and needs; I probably wouldn't (although a 301 might find a place perhaps).

3. Any new URLs will be picked up by search engines as they are detected. It sounds like Google is a little slower than the other two you mentioned, and I don't know its average crawl frequency well enough to advise. However, new pages on my site are crawled and indexed by Google within 2 or 3 days at most.
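
The 301 option from point 2 can be sketched as a small Python function that decides whether a requested URL is an old-style item URL and, if so, where to redirect it. This is a rough illustration under assumptions: the parameter-to-path order is inferred from the one example pair in the first post, and `redirect_for` is a hypothetical name, not part of any server or library.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed order of fields in the new URL: item-<area>-<rootcatid>-<catid>-<itemid>.html
NEW_URL_FIELDS = ("area", "rootcatid", "catid", "itemid")


def redirect_for(url):
    """Return (status, location) for an old item URL, or None if no redirect applies."""
    parts = urlsplit(url)
    if parts.path != "/index.php":
        return None
    params = parse_qs(parts.query)
    try:
        # Take the first value of each required parameter.
        fields = [params[name][0] for name in NEW_URL_FIELDS]
    except KeyError:
        # A required parameter is missing, so this is not an item URL we recognise.
        return None
    if not all(value.isdigit() for value in fields):
        return None
    return 301, "/item-" + "-".join(fields) + ".html"


old = "http://www.leeparts.com/index.php?area=1&itemid=626&rootcatid=27&catid=6"
print(redirect_for(old))
# prints (301, '/item-1-27-6-626.html')
```

Because the mapping is a pattern rather than a per-page lookup, one rule of this shape could in principle cover every item URL, provided all items really follow the same scheme.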

leeparts
12-24-2005, 07:09 PM
Thanks for the reply. I guess I will have to wait it out. Both URLs end up in the same place; it's just that Google has spent the last week crawling my items and none of them are showing in search results. When we redesigned the site, we did not change the domain name. We figured it must be the URLs, because our categories are showing up in Google, just no sub-categories or items. I considered a 301, but the site has over 1400 items and it would take way too long. Again, thanks for the clarification.

seomike
12-25-2005, 03:31 AM
Google won't be able to make sense of your .htaccess file. If you have de-linked your old URLs, they will fall into the supplemental results. Everything is driven by the live links. If all your links now point to the new URLs, those will be cached and in time will probably replace the old ones.

xmuskrat
06-20-2006, 05:32 PM
As of yesterday, all of the leeparts.com pages are back. All 3000.