#41
Quote:
What this patent does is show that not only has Google considered using historical data on a domain for ranking purposes, but that it was serious enough to include it in a patent that preceded the first observations of sandboxing. The patent proves nothing, of course, but it is very suggestive.
EDITED: Preceded, not proceeded (thanks Marcia)
Last edited by I, Brian : 04-02-2005 at 06:49 PM.

#42
If I were to speculate, I would say much of this is laying the foundation for using Urchin stats to - in some way - enter into the PR/SERP relationship.
This would be my guess for using the term 'bookmark'. Many web stats packages use the term 'bookmark' when referring to a direct hit. They could now also track how long someone spends on a certain page. Or did they just click a link and then hit back...

#43
...continuing on with my thoughts, just thinking out loud here...
bookmark: a request for a page without a referrer URL saying where it came from... Most often I visit my favorite websites directly. I didn't follow any link, so I don't leave a referrer URL. In the past Google could not factor this into their algo. It was simply based on backlinks...and some other 'stuff'. So all those great sites that we all frequent could not be rewarded. But what if now they could? Enter Urchin...
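To make the 'direct hit' idea concrete, here is a rough sketch of how a stats package might separate hits with no referrer from referred ones. This is only an illustration: the record format, the sample data and the function name `direct_hit_share` are all made up for the example, and nothing here reflects how Urchin or Google actually process logs.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical hit records: (requested URL, Referer header, "" when absent).
hits = [
    ("http://example.com/", ""),
    ("http://example.com/", "http://search.example.net/?q=widgets"),
    ("http://example.com/about", ""),
    ("http://example.org/", "http://example.com/links"),
]

def direct_hit_share(records):
    """Return {host: fraction of hits that arrived with no referrer}."""
    total = Counter()
    direct = Counter()
    for url, referrer in records:
        host = urlparse(url).netloc
        total[host] += 1
        if not referrer:  # empty referrer = typed-in URL or bookmark
            direct[host] += 1
    return {host: direct[host] / total[host] for host in total}

shares = direct_hit_share(hits)
```

A missing referrer is only a rough signal, since some browsers and proxies strip the header, so treat the share as an estimate at best.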

#44
Of note, Google made changes on 2/2/05. I've noticed that local businesses that were not well optimized, as we define it, zoomed to top rankings for local search phrases, above huge sites with enormous numbers of backlinks.
It seems to me that Google identifies the number of pages within a site that relate to the search phrase and ranks the site higher based on the percentage of its pages that reflect the search topic. This is not identified in the patent from 12/31/03. Even though this document may well identify much of what Google does, Google continues to adjust its algos.
Dave
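The percentage idea above can be sketched in a few lines. This is just one reading of the observation, not anything taken from the patent or confirmed by Google: the function name `topical_share`, the plain substring match and the sample sites are all assumptions for illustration.

```python
def topical_share(pages, phrase):
    """Fraction of a site's pages whose text mentions the phrase."""
    if not pages:
        return 0.0
    phrase = phrase.lower()
    matching = sum(1 for text in pages if phrase in text.lower())
    return matching / len(pages)

# Hypothetical sites: a small local business and a large general site.
local_site = [
    "Denver plumber for emergency repairs",
    "Our Denver plumber team",
    "Contact a Denver plumber today",
    "About us",
]
big_site = [
    "Home improvement news",
    "Find a Denver plumber",
    "Gardening tips",
    "Kitchen remodels",
    "Tool reviews",
]

local_share = topical_share(local_site, "denver plumber")
big_share = topical_share(big_site, "denver plumber")
```

Under this reading the small site scores higher simply because more of its pages are on topic, whatever its backlink count.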

#45
Quote:
Yes Brian, but the recent patent by a group of Google employees (I would like to stress yet again that this patent is not a Google patent or assigned to Google; perhaps this means nothing or perhaps it's important) uses the age of the domain not only as a negative factor but also as a positive factor, and specifically mentions a brand new domain ranking above others because it had ten links at its inception date. This IMO is not very sandbox-like. I think we have to be careful not to mistakenly ascribe something to the patent just because we need so badly to find explanations for the sandbox. The only connection between the two that I can find is that they both seem to be concerned with the age of a document, be it a page or a domain.

#46
Someone at Google realised that various time factors might possibly be used in some way to improve the search results. So a bunch of them wrote down all the suitable time factors that they could think of - inception dates of domains, pages and links, last modification dates of those things, etc. - and they patented them. That's all that the patent looks like to me. They covered a group of bases just in case they may be useful in the future.
They may have had some ideas about how they might use some of them, and they may have used some of them. What the patent doesn't do is describe the sandbox. I'm not a lawyer, but I don't see that much, if any, of the patent could seriously hold up anyway. It seems to me that they've tried to tie down the use of all historical data to help produce search results.
Quote:
It looks like a silly patent to me, but nowhere near as silly as the one that gave some company a patent over DNA, or the one that BT tried to use to lay claim to all hyperlinks. This patent is much too general. The hyperlink one was more of an "invention" than this one is, and that one didn't hold up in court.

#47
Quote:
Quote:
We have to realize how much attention this publication will get in the SEO community. And, true to form, if releasing this publicly as a patent application (which is the *only* way it could have been publicly published) lets Google program enough fear into the minds of a good number of SEOs, fear of doing anything that could be considered spam or result in a penalty, then they've accomplished a purpose simply by submitting the application. This is the most clever, innovative press release I've ever seen. They're brilliant!

#48
Quote:
What the patent shows is that even before sandboxing was first observed, Google had already been looking at ways of applying such data for ranking purposes. There's some correlation with observation, and that is certainly worth noting.

#49
Quote:
Great post Marcia, and my apologies for just assuming that this was an actual patent just because it was being discussed as one. It is of course just an application, as you say, and will be assigned to Google when it is issued. I am with you in thinking that this is at the very least a great press release, able to create a cloud of doubt in the minds of the webmaster public. Thinking about it that way clears up a lot of the confusion that I feel when trying to understand how such conflicting applications could actually be incorporated into a real-time search engine.
While I do agree that Google likes to do things programmatically, I do not necessarily believe this means algorithmically. It seems to me that Google generally ranks pages algorithmically but handles spam (such as hidden text and related site linking issues) and more computationally difficult matters (like PageRank) on a once-in-a-while basis by running a program over the index and either modifying the index based on the results or storing the results for use in the ranking algo if necessary.

#50
"It seems to me that Google generally ranks pages algorithmically but handles spam (such as hidden text and related site linking issues) and more computationally difficult matters (like PageRank) on a once in a while basis by running a program over the index and either modifying the index based on the results or storing the results for use in the ranking algo if necessary."
Seeing that Google is an information retrieval application, it would be unworkable not to use algorithms. In fact a plain keyword search is an algorithm. No applications work without an algorithm (A step-by-step problem-solving procedure, especially an established, recursive computational procedure for solving a problem in a finite number of steps - The American Heritage® Dictionary).
It would make a lot of sense to check sites and the index constantly because of the size of the index, and because results couldn't be kept so clean without it. There is still mess that turns up in the results, but if you saw what they look like with no filter, you would appreciate how efficient the methods are. Storing things for use in the rankings is just corpus duplication, so why bother?
I fail completely to see how this patent validates "sandbox". The age factor is not at all a novel idea and has already been implemented in academic research. Using it in combination with other things is what the patent is about. You can patent a novel approach that uses existing techniques. Age is never taken into consideration for new sites; it even says so in the patent. Sites are not subject to this if they don't yet fulfill certain conditions.
Remember this is only a small part of the proposed method. Like I said, it's clean, simple and well researched already. Adding this to the existing setup looks good. The index has needed a good deep clean for a little while. Let's not forget other Google applications like Google Scholar, basically a digital library, where age is certainly a factor.
Last edited by xan : 04-03-2005 at 08:27 AM.

#51
A lot of the calculation that goes on behind any search engine today is built into the index - processed at indexing time. Other features are "real time" - executed at run time. It has been that way for many years. It is the only way, with the computers currently available, that the complex calculations can work at an acceptable speed. And, for the most part, that works fine - it produces acceptable results at a high speed.
However, I am sure that search engine engineers would love to have a trillion times the CPU power they have now so they could execute much more at run time. I don't expect we'll get that kind of computer power just around the corner ... but then again, who knows
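The split between indexing-time and run-time work can be shown with a toy example. This is a minimal sketch under stated assumptions, not how any real engine is built: `build_index`, `search` and the word-count "score" are invented stand-ins, with the score standing in for whatever expensive calculation gets precomputed.

```python
from collections import defaultdict

def build_index(docs):
    """Indexing time: tokenize each document once and precompute its score."""
    postings = defaultdict(set)   # word -> ids of documents containing it
    static_score = {}             # doc id -> precomputed score
    for doc_id, text in docs.items():
        words = text.lower().split()
        for word in words:
            postings[word].add(doc_id)
        # Stand-in for an expensive offline calculation.
        static_score[doc_id] = len(set(words))
    return postings, static_score

def search(query, postings, static_score):
    """Run time: a cheap set intersection plus a sort on stored scores."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = set.intersection(*(postings.get(t, set()) for t in terms))
    return sorted(matches, key=lambda d: (-static_score[d], d))

# Hypothetical three-document corpus.
docs = {
    "a": "google patent historical data",
    "b": "google patent",
    "c": "sandbox theory",
}
postings, static_score = build_index(docs)
results = search("google patent", postings, static_score)
```

The heavy lifting happens once in `build_index`; each query only touches the stored postings and scores, which is why it stays fast.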

#52
Regarding the whole press release thing. I am sure many of you are aware of the patents that have been accepted. There are really stupid ones out there. Why not this one?

#53
Oh yes, absolutely. I've seen a few patents over time that were far sillier than this one. FAR more

#54
This isn't silly at all. Inventions can have common elements - when you think about it, how many things have been invented that use electricity, but they're still unique in their purpose and process.
There's nothing silly about this; it's heavy artillery.

#55
Quote:
Quote:
Last edited by projectphp : 04-03-2005 at 11:17 PM. Reason: Clarity

#56
Quote:
Having said that, the majority of what I found in this new patent doesn't look too silly - but definitely not something that falls under current European patent law

#57
Quote:
I'm European, and agree that we do things differently than in the U.S. as far as patents go. This patent isn't silly. It's a patent for using existing ideas and technology in a specific way. It's clean and sensible and, what's more, it can work quite well.

#58
So the application was filed 12/31/2003 and it was published 03/31/2005, and rights to it are assigned to Google when it's granted, which will probably take another year.
IMHO it's being substantiated not only that there is a "sandbox" of sorts, for lack of a better way to express it, but that it's not only the domain itself, it's the history of the inbound links as well, as many have thought all along. And apparently, gone are the days when a placeholder page can be put up and linked to from a couple of a person's own pages to get a head start.

#59
Quote:
It's a gimme that search engines exist, and that algorithms sort the results according to certain factors - nothing new there. It's also a gimme that measuring time is not a new idea. The only new thing in the application is the idea of applying one to the other. That's not an invention in my book, and it's far too general to merit a patent, but then some company got a patent on DNA, so stupidity sometimes prevails. To my way of thinking, it's as silly as patenting the idea of ranking documents according to how well the words on the page match the words in the query. I don't believe it substantiates anything about the sandbox. IMO, all it substantiates is that Google considers that time might be a potentially useful factor for ranking purposes.

#60
There is a lot of stuff going on at the moment in the world of patent law, and there will be some interesting changes it seems.
Most patents are for improvements in existing technology: “Patents are generally intended to cover products or processes that possess or contain new functional or technical aspects; patents are therefore concerned with, for example, how things work, what they do, how they do it, what they are made of or how they are made.” (UK patent office)
I wrote all about the software patent dispute at: Search Science
It's a major issue at the moment for all who produce and work in software, and everyone is at odds about it, including all the large companies. As far as Google goes, well, everyone forgot about Google Scholar, and the methods describe a nice digital library system. It's not heavy-duty stuff; it's clean, simple, nice.