extended dynamic paths & indexing


newreality
08-06-2006, 11:57 PM
I have some pages which are accessible from different parts of the site.
These are results pages that call up the rowid and related values, e.g.:

mysite.com/details.php?MID=1707&City=Atlanta&state=GA&re=dir1


Here, the 're' URL variable is used to send the user back to the originating page.

The question is: with Google and the other main engines indexing dynamic pages, what happens when I change only the 're' value (per originating page) and everything else remains the same?

Could this be viewed as duplicate results? Would it be indexed multiple times?
Can Google infer the rowid alone and treat it as a discrete value?

Is it advisable to keep the entire path exactly the same wherever it is being sent from?

g1smd
08-09-2006, 11:12 AM
The same content at multiple URLs... that is exactly what duplicate content is.

Make sure that each piece of content has only one canonical URL that can be used to reach it.

If you have to have more than one URL, at least get the script to put a <meta name="robots" content="noindex"> tag on all of the other copies.
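
A minimal sketch of the "one canonical URL" idea, in Python for illustration only (the site in this thread runs PHP). It assumes 're' is the only parameter that does not change what the page shows; the names NON_CONTENT_PARAMS, canonical_url and is_duplicate_copy are made up for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: 're' only records the originating page and never changes the content.
NON_CONTENT_PARAMS = {"re"}


def canonical_url(url: str) -> str:
    """Return the URL with the navigation-only parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in NON_CONTENT_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


def is_duplicate_copy(url: str) -> bool:
    """True when the URL carries a parameter that the canonical form drops."""
    query = urlsplit(url).query
    return any(
        key in NON_CONTENT_PARAMS
        for key, _ in parse_qsl(query, keep_blank_values=True)
    )


long_url = "http://mysite.com/details.php?MID=1707&City=Atlanta&state=GA&re=dir1"
print(canonical_url(long_url))
print(is_duplicate_copy(long_url))
```

Every variant of the example URL, whatever its 're' value, maps to the same canonical string, which is the URL you would link to and want indexed.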

Andrey Markin
08-10-2006, 05:36 AM
>Is it advisable to keep the entire path exactly the same wherever it is being sent from?

Yes! And it would be better if you use some rewriting and convert all your links to static... :rolleyes:
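
To make "convert all your links to static" concrete, here is a rough sketch in Python (illustration only). The static layout /details/<state>/<city>/<id> is an assumption invented for this example, not something anyone in the thread specified, and on a real site the mapping back to the script would normally be done by the web server's rewrite rules.

```python
import re
from urllib.parse import parse_qs, quote, unquote, urlsplit

# Hypothetical static layout: /details/<state>/<city>/<id>
STATIC_PATTERN = re.compile(r"^/details/([^/]+)/([^/]+)/(\d+)$")


def to_static_path(dynamic_url: str) -> str:
    """Build the static-looking path from the content parameters; 're' is ignored."""
    query = parse_qs(urlsplit(dynamic_url).query)
    state = quote(query["state"][0], safe="")
    city = quote(query["City"][0], safe="")
    row_id = quote(query["MID"][0], safe="")
    return f"/details/{state}/{city}/{row_id}"


def to_query(static_path: str):
    """Map a static path back to the parameters the script expects, or None."""
    match = STATIC_PATTERN.match(static_path)
    if match is None:
        return None
    state, city, row_id = match.groups()
    return {"MID": row_id, "City": unquote(city), "state": unquote(state)}


print(to_static_path("http://mysite.com/details.php?MID=1707&City=Atlanta&state=GA&re=dir1"))
print(to_query("/details/GA/Atlanta/1707"))
```

Because 're' never enters the static path, the same record always gets the same address no matter which page linked to it.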

newreality
08-10-2006, 09:48 AM
I already have, at considerable effort.

I'm not a big fan of the robots tag, which can potentially stop a crawl of the rest of the site.

newreality
08-13-2006, 11:53 AM
This is important to me, as I've redesigned (and hopefully to others as well).
What about when I'm sending different URL variables to the same page and some of these results happen to be empty? Those pages would actually all show the same line: 'sorry no results for your (name) search'.

Can pages that differ only by URL variable be duplicate content (beyond page tags)?

JohnW
08-13-2006, 12:08 PM
You could set it up so that any URLs that are not, or can't be, rewritten always show up under a folder that is blocked by robots.txt:

mysite.com/norobots/details.php?MID=1707&City=Atlanta&state=GA&re=dir1

newreality
08-13-2006, 12:34 PM
a) How would robots know when re=dir1 has no php/mysql results?
b) I've written all pages with city and state variables only, so all pages are indexed the same (on 55+ pages).
c) I try not to use robots within meta. I believe it can stop crawlers regardless.

---------------------------------

With pages having different url variables -- can the pages by url variable extension be labelled as 'duplicate' since they have little or no content?

Won't the main search engines kind of 'ignore these' until content arises?
How do they define duplicate content and does this apply in this situation?

JohnW
08-13-2006, 01:03 PM
>can the pages by url variable extension be labelled as 'duplicate' since they have little or no content?

Yes, they will be caught as duplicates.

>Won't the main search engines kind of 'ignore these' until content arises?

Not exactly. And once a page has been booted for duplicate content, it is likely never coming back, even if the content improves. The page will need to get a new URL and be a "new" page after you fix it.

>how would robots know when re=dir1 has no php/mysql results?

What I said was to set your CMS so that all URLs with variables show up in a folder that has a robots.txt disallow. Let only the rewritten URLs be indexed.
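
As an illustration of the robots.txt disallow described above, the sketch below feeds a two-line robots.txt to Python's standard urllib.robotparser and asks which URLs a compliant crawler may fetch. The /norobots/ folder comes from the earlier example URL; the rewritten URL /details/GA/Atlanta/1707 is a made-up placeholder for whatever the rewritten form looks like on your site.

```python
from urllib.robotparser import RobotFileParser

# The rule that would live in mysite.com/robots.txt
ROBOTS_TXT = """\
User-agent: *
Disallow: /norobots/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawlable(url: str) -> bool:
    """True if a crawler that obeys robots.txt is allowed to fetch this URL."""
    return parser.can_fetch("*", url)


# URL with variables, served from the blocked folder
blocked = "http://mysite.com/norobots/details.php?MID=1707&City=Atlanta&state=GA&re=dir1"
# Hypothetical rewritten URL outside the blocked folder
allowed = "http://mysite.com/details/GA/Atlanta/1707"

print(crawlable(blocked))
print(crawlable(allowed))
```

The robots never need to know whether a given 're' value returns results: the rule is based purely on the folder in the path.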

newreality
08-13-2006, 01:11 PM
>what I said was to set your cms so that all urls with variables show up in a folder that has a robots.txt disallow. Let only the rewritten urls be indexed.

"...show up in a folder that has a robots.txt disallow": what is this? Also, what if some of the values in the path have been dynamically echoed?

And what do you mean by a rewritten URL in "Let only the rewritten urls be indexed"?

g1smd
08-13-2006, 03:04 PM
Make sure that only "short" URLs, URLs without the final parameter, are indexed.

Do that by having a <meta name="robots" content="noindex"> tag on all pages accessed with "long" URLs (URLs with the extra parameters).
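
One way the script could decide this per request, sketched in Python for illustration (the site itself is PHP). It assumes the "short" URL carries only MID, City and state, as in the example at the top of the thread; SHORT_URL_PARAMS and robots_meta are names invented here.

```python
from urllib.parse import parse_qs

# Assumption: these are the only parameters a "short" URL may carry.
SHORT_URL_PARAMS = {"MID", "City", "state"}

NOINDEX_TAG = '<meta name="robots" content="noindex">'


def robots_meta(query_string: str) -> str:
    """Return the noindex tag for "long" URLs and an empty string for "short" ones."""
    params = parse_qs(query_string, keep_blank_values=True)
    extra = set(params) - SHORT_URL_PARAMS
    return NOINDEX_TAG if extra else ""


print(robots_meta("MID=1707&City=Atlanta&state=GA&re=dir1"))
print(repr(robots_meta("MID=1707&City=Atlanta&state=GA")))
```

The check runs once in the shared page template, so it does not matter how many categories or states feed into the same script.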

newreality
08-13-2006, 04:09 PM
How can I do that when I have 50+ job categories across 50+ states being accessed on the same search page?

This is a main search page and I don't want to exclude it from the robots index.