randfish
12-30-2004, 03:06 PM
Web pages that are essentially collections of automatically scraped results are easily identifiable to most veteran web users. However, they are not yet excluded entirely from search results.
The future, no doubt, will bring more advanced methods of automatically identifying and removing web spam from top results at search engines. It's likely, therefore, that those who create the spam will always have to stay one step ahead. Even today I see increasingly advanced "spam" pages that are harder and harder for me to immediately identify as machine-created rather than human-built.
Following this logic, I come to the conclusion that one day a "spam" page and one with valuable, unique content will become virtually impossible to tell apart, for either human or machine. Is this the direction we're heading in? What could hold off this course?