Marcia
07-24-2004, 12:42 AM
In discussing site architecture and internal linking structures, the question is often raised of whether it's best to keep all files in the root directory of a site or to build the site as a hierarchy using /subdirectories/.
Questions can arise on whether the physical location affects spidering sequences by search engines and, probably more importantly, whether one structure or the other can ultimately affect search engine rankings, particularly in view of the relative importance of having keywords or keyword phrases in the file path or in individual filenames.
I've never seen this particular paper discussed. It isn't an SEO paper in the strict sense of the word, being about constructing site maps, but I've read it over many times and wondered whether the concepts it presents deserve a further, deeper look.
The major concept is that of differentiating between Physical Domains and Logical Domains.
In this paper, we present a technique for automatically constructing multi-granular and topic-focused site maps using trained rules based on Web page URLs, contents, and link structure. In these site maps, the Web site topology is preserved, and document importance, indicated by semantic relevancy and citation, is used for prioritizing the presentation of pages and directories.
Constructing Multi-Granular and Topic-Focused Web Site Maps (http://www10.org/cdrom/papers/461/)
And the first of the 4 points that summarize the study:
Identifying "logical domains" within a Web site: A logical domain is a group of pages that has a specific semantic relation and a syntactic structure that relates them.
A few of the questions raised could be:
1. How does this affect indented listings with Google, if at all?
2. What about sites that are actually part of a larger site, like Geocities or ISP sites or even Yahoo Stores? Can search engines tell that they're different sites?
3. In view of recent interest in the Hilltop Algorithm (http://www.cs.toronto.edu/%7Egeorgem/hilltop/) and the possible effects of affiliations for scoring, how would hosted sites such as the subdirectory type Yahoo Stores or ISP sites fare, in view of the fact that the actual rightmost unique token is the same for all of them?
4. Do search engines take into consideration for backlinks whether or not a page is the root directory index page, or the index page of a subdirectory?
5. What effect would use of logical domains, if the concept were utilized, have for engines that utilize clustering, as AllTheWeb was doing prior to the Yahoo acquisition?
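On question 3: as I read the Hilltop paper, two hosts are treated as affiliated if they share the first three octets of their IP address or the same rightmost non-generic token in the hostname. Below is a rough Python sketch of the hostname half of that test only. The function names and the tiny GENERIC_TOKENS list are my own assumptions for illustration, not anything taken from the paper or from any search engine.

```python
from urllib.parse import urlsplit

# Assumed, deliberately tiny list of "generic" hostname tokens.
# A real check would need a far more complete suffix list.
GENERIC_TOKENS = {"com", "net", "org", "edu", "gov", "co", "uk", "ca", "mx", "de"}


def rightmost_non_generic_token(url):
    """Return the rightmost hostname token that is not in GENERIC_TOKENS."""
    host = urlsplit(url).hostname or ""
    for token in reversed(host.lower().split(".")):
        if token and token not in GENERIC_TOKENS:
            return token
    return ""


def same_token(url_a, url_b):
    """True if both URLs share a rightmost non-generic hostname token."""
    token_a = rightmost_non_generic_token(url_a)
    return token_a != "" and token_a == rightmost_non_generic_token(url_b)


if __name__ == "__main__":
    # Two hosted "sites" that differ only in their subdirectory.
    print(same_token("http://www.geocities.com/storeA/",
                     "http://www.geocities.com/storeB/index.html"))
```

Note that the check only looks at the hostname, so two Yahoo Stores or ISP accounts living in different subdirectories of one host come out identical; the path never enters into it.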
For those among us who are inclined toward top-down, keyword-oriented site navigation, especially those who have favored the use of subdirectories, it might be worth taking a second look occasionally, particularly when there appear to have been major algo changes.
We have developed a set of rules for identifying logical domain entry pages based on the available Web page metadata, such as title, URL string, anchor text, link structures, and popularity indicated by citations.
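To make the quoted idea a bit more concrete, here is a toy Python sketch of what scoring a candidate entry page from that kind of metadata might look like. Everything in it is hypothetical: the Page fields, the INDEX_NAMES set, and the weights are invented for illustration and are not the paper's trained rules.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit

# Assumed filenames that usually mark a directory's index page.
INDEX_NAMES = {"", "index.html", "index.htm", "default.htm"}


@dataclass
class Page:
    url: str
    title: str
    # Anchor text of links pointing at this page (hypothetical field).
    inbound_anchor_texts: list = field(default_factory=list)
    # Count of links arriving from outside the page's own directory.
    external_citations: int = 0


def entry_page_score(page, topic):
    """Score how much a page looks like the entry page for a topic.

    The weights are made up; they only illustrate combining the
    URL string, title, anchor text, and citation signals.
    """
    path = urlsplit(page.url).path
    filename = path.rsplit("/", 1)[-1].lower()
    topic = topic.lower()

    score = 0
    if filename in INDEX_NAMES:          # URL looks like a directory index
        score += 2
    if topic in path.lower():            # topic appears in the URL string
        score += 1
    if topic in page.title.lower():      # topic appears in the title
        score += 1
    # One point per inbound link whose anchor text mentions the topic.
    score += sum(1 for text in page.inbound_anchor_texts if topic in text.lower())
    # Citations count, but are capped so they cannot swamp the rest.
    score += min(page.external_citations, 5)
    return score


if __name__ == "__main__":
    index_page = Page("http://example.com/widgets/", "Blue Widgets",
                      ["widgets", "home"], 3)
    deep_page = Page("http://example.com/widgets/blue-widget-17.html",
                     "Product 17")
    print(entry_page_score(index_page, "widgets"),
          entry_page_score(deep_page, "widgets"))
```

Under these made-up weights, the index page of a themed subdirectory scores well above a product page buried inside it, and where that subdirectory sits relative to the root plays no part in the score.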
How important is it whether the entry pages for given keyword searches would be in the root or in a subdirectory? Could a tightly themed subdirectory within a site be considered a "logical domain" on its own, being given relevancy by linking structure within the site or inbound deep-links and be of help for achieving higher rankings than a root level page would get?
Has anyone seen any evidence that weight is given to different physical or logical locations within a site's architecture, or that inbound links carry different weight depending on where within the structure of the linking site they're located?