#1
How Do the Search Indexes Work?
A thread was started recently in which a member asks how Google can report "Searching 4,285,199,774 web pages" when, in fact, a search on the word "the" at Google returns 5,800,000,000 results. That is a difference of 1,514,800,226, which seems high.
In addition, the figure 5,800,000,000 seems a bit too round.

My question here is: how do most search indexes work? From my understanding, Google spiders the web, indexes a portion of the pages it finds, and then people run searches against that index. I would imagine that Google has indexes for all sorts of data: an index for linkage, an index for text, an index for PageRank, and others.

Another question I have is: does Google remove duplicate content from its index, or does it just filter it out in the algorithm when a search is being done? I tend to believe that removing duplicate content from the index is more efficient, but when you see how Google's search works, the results appear to filter out duplicate content on the fly (do a site:www.domain.com search and you will find many duplicate pages on some sites).

Let's talk search index technology here.
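To make the duplicate-content question concrete, here is a minimal sketch of a text index that can drop duplicates either at index time or at query time. It is only an illustration of the two options, not a description of how Google actually does it. The names `TinyIndex` and `fingerprint` and the example.com URLs are made up for this post, and the duplicate check is an assumption of the simplest kind: exact matches after lowercasing and collapsing whitespace.

```python
import hashlib
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash of case- and whitespace-normalized text (exact duplicates only)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


class TinyIndex:
    """Toy inverted index: maps each term to the set of URLs containing it."""

    def __init__(self, dedupe_at_index_time: bool = False):
        self.dedupe_at_index_time = dedupe_at_index_time
        self.postings = defaultdict(set)  # term -> URLs containing the term
        self.fingerprints = {}            # URL -> content fingerprint
        self.seen = set()                 # fingerprints already indexed

    def add(self, url: str, text: str) -> bool:
        """Index a page. Returns False if it was rejected as a duplicate."""
        fp = fingerprint(text)
        if self.dedupe_at_index_time and fp in self.seen:
            return False  # option 1: the duplicate never enters the index
        self.seen.add(fp)
        self.fingerprints[url] = fp
        for term in set(text.lower().split()):
            self.postings[term].add(url)
        return True

    def search(self, term: str, filter_duplicates: bool = True) -> list:
        """Return matching URLs, optionally hiding duplicates on the fly."""
        hits = sorted(self.postings.get(term.lower(), set()))
        if not filter_duplicates:
            return hits
        shown, results = set(), []
        for url in hits:
            fp = self.fingerprints[url]
            if fp not in shown:  # option 2: the duplicate is stored but hidden
                shown.add(fp)
                results.append(url)
        return results


# Hypothetical pages; /a and /b differ only in case and spacing.
pages = [
    ("http://www.example.com/a", "Search index technology"),
    ("http://www.example.com/b", "search   index technology"),
    ("http://www.example.com/c", "PageRank and linkage index"),
]

query_time = TinyIndex(dedupe_at_index_time=False)
for url, text in pages:
    query_time.add(url, text)

print(query_time.search("index", filter_duplicates=False))  # all stored matches
print(query_time.search("index"))                           # duplicates hidden
```

With query-time filtering the index still holds every page, so a raw count of matches can be larger than the number of results actually shown. With index-time removal the duplicate is gone for good, which saves space but cannot be undone per query.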
#2
A while ago, Chris Ridings wrote a brief overview:
http://chriseo.com/modules.php?op=mo...rder=0&thold=0
__________________
The SEO Book