Will search engines ever understand ...?


Mikkel deMib Svendsen
06-08-2004, 07:52 PM
There are, in my opinion, two things that search engines in general do not do (or do not do very well) today and that I think could improve search dramatically: A) Understand the true meaning of the words in the documents they index and the queries users perform, and B) Understand the individual user (personalization).

I am pretty sure search engines will become better at both over time. The question is how that affects our long-term strategies in SEO, for example linking strategies.

My guess is that search engines will become much better at understanding the true meaning of documents and queries. If they do so, one of the logical steps would be to assign much higher link popularity values to links coming from truly related pages. Is this something you are already incorporating in your strategies today? Or are you more focused on (any) high PageRank pages?
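To make the idea of giving more weight to links from related pages concrete, here is a minimal sketch. It assumes a deliberately crude notion of "related" (word overlap measured by cosine similarity), and the function names and sample page texts are hypothetical; it is not how any particular search engine scores links.

```python
import math
from collections import Counter


def term_vector(text):
    """Lowercased bag-of-words counts; a crude stand-in for real language understanding."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (0.0 when either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def weighted_link_value(source_text, target_text, source_popularity):
    """Scale the popularity a link passes on by how related the two pages are."""
    relatedness = cosine_similarity(term_vector(source_text), term_vector(target_text))
    return source_popularity * relatedness


# Hypothetical pages: one on-topic link source and one off-topic but "stronger" source.
target = "vintage rock records from the 1960s and 1970s"
related = "a guide to collecting rock records from the 1970s"
unrelated = "cheap car insurance quotes online"

related_value = weighted_link_value(related, target, source_popularity=3.0)
unrelated_value = weighted_link_value(unrelated, target, source_popularity=8.0)
print(related_value, unrelated_value)
```

Under this kind of scoring, the link from the modest on-topic page counts for more than the link from the stronger but unrelated page, which is the shift in linking strategy the question is asking about.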

Another thing search engines could do with better natural language understanding would be to separate (or filter) commercial and non-commercial pages. Is this a good (long-term) argument for incorporating as much editorial or user-contributed content as possible into any product or corporate site, to “organically” make it blend more naturally with the rest of the web? Could this possibly prevent a commercial website from completely disappearing from a non-commercial index?

Anyway, I really just wanted to discuss, in general, how you think these (and other) new technologies should be dealt with and incorporated into long-term strategies now, or whether it’s simply better to wait and react to actual changes as they happen.

rustybrick
06-08-2004, 08:08 PM
There are, in my opinion, two things that search engines in general do not do (or do not do very well) today and that I think could improve search dramatically: A) Understand the true meaning of the words in the documents they index and the queries users perform, and B) Understand the individual user (personalization).
I think search engines are already focusing on these areas and will be offering more advancements in understanding keywords within documents and in personalization. Since Google is one of the more popular engines, let's talk specifically about their efforts:

(a) Understanding True Meaning of Documents: I think this is where links, authoritative sites and hubs come into play. It's not enough to look at one's keywords and then take a generic look at links. Many believed that after the Florida update, Google instituted something like Hilltop or Dan Thies's Topic Sensitive PageRank. I go into both of these concepts in more detail here (http://www.seroundtable.com/archives/000075.html). But I think they are on track.
(b) Google personalization: http://labs.google.com/personalized/
I tested it out a bit by setting my profile as: Internet, 1960s, 1970s, Rock, New York. Then, hoping to find music on the Internet from a local New York store that focused on rock from the 60s and 70s, I did a search on music. The first result contained the copy "If you've tired of the music scenes in Seattle and New York's East Village, the only place to turn is a Web server in Finland." New York is in the copy, but this site is explicitly about being tired of the NY music scene, plus it covers all genres.
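One way to picture both (a) and (b) is as ordinary PageRank with the random jump biased toward a chosen set of pages, either pages on a topic or pages matching a user's profile. The sketch below shows that general idea only. The tiny link graph and page names are made up for illustration, and nothing here is claimed to be what Google actually runs.

```python
def pagerank(links, teleport, damping=0.85, iterations=50):
    """Power iteration; `teleport` is the probability distribution used for random jumps."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page starts each round with its share of the random-jump probability.
        new_rank = {p: (1.0 - damping) * teleport.get(p, 0.0) for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: redistribute its rank along the teleport distribution.
                for p in pages:
                    new_rank[p] += damping * rank[page] * teleport.get(p, 0.0)
        rank = new_rank
    return rank


# Hypothetical five-page web; every link target also appears as a key.
links = {
    "music_hub": ["rock_store", "jazz_store"],
    "rock_store": ["music_hub"],
    "jazz_store": ["music_hub"],
    "news_site": ["music_hub", "finance_blog"],
    "finance_blog": ["news_site"],
}

# Plain PageRank jumps to any page with equal probability.
uniform = pagerank(links, {p: 1.0 / len(links) for p in links})

# A "rock" bias: random jumps always land on the page known to be about rock.
biased = pagerank(links, {"rock_store": 1.0})

for page in links:
    print(page, round(uniform[page], 3), round(biased[page], 3))
```

With the uniform jump, the rock and jazz stores score the same. With the biased jump, the rock store and the pages it links to gain rank while the unrelated news and finance pages fade, even though the link graph itself is unchanged.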

My guess is that search engines will become much better at understanding the true meaning of documents and queries. If they do so, one of the logical steps would be to assign much higher link popularity values to links coming from truly related pages. Is this something you are already incorporating in your strategies today? Or are you more focused on (any) high PageRank pages?
As I said above, I think the Florida changes or theories encouraged (should replace with a different word) most SEOs and link builders to think this way.

Another thing search engines could do with better natural language understanding would be to separate (or filter) commercial and non-commercial pages. Is this a good (long-term) argument for incorporating as much editorial or user-contributed content as possible into any product or corporate site, to “organically” make it blend more naturally with the rest of the web? Could this possibly prevent a commercial website from completely disappearing from a non-commercial index?
Well, another theory, developed during Florida and then brought back up in late May, was this "commercial filter." It was knocked down, but think about it. If someone searches on search engine optimization, the search engine probably wants to serve up this site first and not a site selling SEO services.

Nice topic to start!

rcjordan
06-08-2004, 09:43 PM
>Is this a good (long-term) argument for incorporating as much editorial or user-contributed content as possible into any product or corporate site, to “organically” make it blend more naturally with the rest of the web?

Personally, I like hedging my (long term) bets even more by building stand-alone "soft" sites that are primarily non-commercial and then sponsoring them with the hard site. But to answer your question, yes, I can't see how this could be anything but helpful, particularly if the algo also picks up on the backlinks that should result from this type of strategy.

>Could this possibly prevent a commercial website from completely disappearing from a non-commercial index?

I can't guarantee that 'softened' commercial sites will qualify for an index using NLP, but I know for certain that they already get listed by editors whose sites (gov, mil, edu) have very stringent guidelines restricting commercial links. It would stand to reason that what's good enough for a human editor should be acceptable for good NLP.

Daria_Goetsch
06-08-2004, 10:48 PM
My guess is that search engines will become much better at understanding the true meaning of documents and queries. If they do so, one of the logical steps would be to assign much higher link popularity values to links coming from truly related pages. Is this something you are already incorporating in your strategies today? Or are you more focused on (any) high PageRank pages?

Yes, my focus typically is on topic-related websites for linking whenever possible. I've had success submitting topic-related articles to a website that is on the same topic, which provides high Google PageRank for the articles I have archived there. The website archiving my articles also comes up very high in the search results for one of the same keyword phrases I am targeting.


Another thing search engines could do with better natural language understanding would be to separate (or filter) commercial and non-commercial pages. Is this a good (long-term) argument for incorporating as much editorial or user-contributed content as possible into any product or corporate site, to “organically” make it blend more naturally with the rest of the web? Could this possibly prevent a commercial website from completely disappearing from a non-commercial index?

I certainly think it would be possible to add enough non-commercial content to your website and continue to show up in the non-commercial results. The non-commercial content is good info for your visitors, builds the size of your website, is good for optimization purposes (more pages = more keywords) and visitors get something for free. Sounds like a win-win situation.

Rob
06-10-2004, 02:31 PM
I agree that understanding how a user searches is probably their greatest drawback.

If a person searches for "search engine marketing", how does an engine know whether this person is merely researching how to do it, looking at prospective SEM companies, or in the mood to hire someone?

This is where search engines need to get better - by understanding more precisely what the person needs and serving the content to match.

Much like geo-targeting in that it's personalized, but the search engine has to first understand the user's habits: do they traditionally research online, or do they purchase? This is where MS might have an edge, because that type of data collection could easily be built into Windows and across applications.

I'm not going to get into the whole privacy issue, but to achieve that degree of personalization will require some form of data collection.

Only then can a search engine provide truly relevant results.

Alavina
06-10-2004, 05:13 PM
A) Understand the true meaning of the words in the documents they index and the queries users perform

I wish, because it would get rid of most of the spam. Will it? I doubt we'll see this soon. However, since spam is a serious problem for the engines, I reckon it's an area where we will see AI sooner than elsewhere.

B) Understand the individual user (personalization).

No, but they probably will attempt to. Personalization will definitely be a thing of the next few years, particularly local search. The difficult thing about true personalization is the many facets there are to people. I might be interested in classical music, but then I do a search for a friend... My children might quickly use my PC... I do a quick search for my work at home...

Sure, there are ways to deal with this, but how much are users willing to give up for slightly more relevant results? After all, I don't have to tell my search engine what I do if I can do a wise query :) .

I'm sure we'll see more AI, but good queries and human intelligence aren't quite done with yet...

K.S. Katz
06-10-2004, 06:13 PM
...by dividing the Web into local subject communities, Teoma is able to find and identify expert resources about a particular subject. These sites feature lists of other authoritative sites and links relating to the search topic.

I believe that Teoma is doing a lot to understand the true meaning of documents with their search technology.

As far as personalization goes, there are probably going to be services in the future that are devoted to cataloging people's interests and serving sites that appeal directly to them. Much like Voice Recognition software, it will probably be rough in the beginning and improve over time.

Alavina
06-13-2004, 04:51 PM
As far as personalization goes, there are probably going to be services in the future that are devoted to cataloging people's interests and serving sites that appeal directly to them. Much like Voice Recognition software, it will probably be rough in the beginning and improve over time.

That's what Amazon is up to, with their a9.com... get even more information on their customers.

AussieWebmaster
06-14-2004, 12:12 AM
They have a good tie-in with Alexa and provide decent added value to the Google results.

pleeker
06-14-2004, 03:02 PM
That's what Amazon is up to, with their a9.com... get even more information on their customers.

I think A9 is closer to the "right" track than what I've seen of Google's personalized search -- which doesn't yet appear to be an Alpha product, much less a Beta.

B) Understand the individual user (personalization).

The problem will be privacy. I'm enjoying trying out A9 and letting it learn more about me because I want to see how good it gets at providing me what I want. I don't mind having my searches remembered and my clicks tracked -- it's part of the system.

But there are too many people who want personalized results, but aren't willing to give up any personal data in the process. For personalized search to work, the SEs have to make clear the benefits of turning over your personal data. And even then there will be many who aren't willing to do it.

bwelford
06-14-2004, 03:38 PM
I think the problem is that at different times any one person may be in different modes with different search requirements. To an extent Teoma tries to suggest ways you might go by the additional resources it offers in parallel with giving you the search results you asked for.

I had an adverse reaction when Google first launched its Personalized Search. I felt what was really needed was an interactive cycling process where I and Google could work together to home in on what I was looking for. I described that in a short piece I wrote called Google, I want to search WITH you! (http://blog.cre8asite.net/bwelford/index.php?id=P63)

pleeker
06-14-2004, 04:32 PM
I felt what was really needed was an interactive cycling process where I and Google could work together to home in on what I was looking for.

The control panel is an interesting idea and as a search geek, I might be inclined to try it out (like I'm doing with A9). I wonder if Joe Surfer would be willing to go to the extra effort of tweaking and refining the options in order to get the desired results. Or do they prefer the simplicity of the tennis analogy -- lobbing shots back and forth with the search engine?