Douwe Osinga's Blog: The Future of Searching

Friday, March 26, 2004

A number of emails from different people got me thinking about the future of search. The Third Era of Searching, opened by Google, is drawing to an end. The new era might prove as disruptive as the other three.

The first generation of search engines (Lycos, Excite, Infoseek) did little more than match pages with keywords. There were not that many pages (remember when Yahoo had a "What's New" page featuring all the new websites of that week?) and getting a list of all pages containing travel AND amsterdam was usually enough search power. If you wanted more than 10 results, you had to pay.
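To make the "travel AND amsterdam" idea concrete, here is a minimal sketch of that kind of keyword matching. It is an illustration only, not how any of those engines actually worked: the function names `build_index` and `search_and` and the example URLs are made up for this post.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search_and(index, *terms):
    """Return every page containing all terms (boolean AND), with no ranking."""
    if not terms:
        return set()
    result = None
    for term in terms:
        postings = index.get(term.lower(), set())
        # Copy on the first term so callers never mutate the index itself.
        result = set(postings) if result is None else result & postings
    return result


# Hypothetical pages, purely for illustration.
pages = {
    "a.example/amsterdam": "travel guide to amsterdam canals",
    "b.example/paris": "travel guide to paris",
    "c.example/news": "amsterdam weather today",
}
index = build_index(pages)
print(sorted(search_and(index, "travel", "amsterdam")))
```

Note that every matching page comes back as an equal: there is no notion of one result being better than another, which is exactly what stopped working once the web grew.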

This didn't last that long. The World Wide Web grew really fast and the search engines of old started to struggle. Spidering the whole web got very slow and returning all pages containing amsterdam AND travel no longer did the trick; too many results were only slightly relevant.

Hotbot and Altavista were the new kids in town. Complex algorithms decided from now on which page scored how well for which search terms. The spiderbots were much faster, reducing the time between submission and appearing in the list. All seemed well again.

But two things happened. Search Engine Optimization became big business. Companies started specializing in reverse engineering the ranking algorithms of search engines. These companies then used their findings to create pages that would score perfectly for certain keywords. The result was that search engines no longer returned the best matches, but the pages that were best optimized. At the same time, portals became the cool thing. No longer was it enough for Altavista to offer the best search engine; they also wanted to offer the latest news, horoscopes, stock quotes and weather information. And lots of colorful & moving advertisements, of course. And all of this on the same page.

Search seemed broken until Google opened its doors. Gone were the graphic-heavy pages with lots of information. Just a search box with two buttons: Google Search and I'm Feeling Lucky. Google also introduced their PageRank algorithm, which took search engine optimization out of the equation (at least for a while). From now on, only the most relevant pages were returned. Google didn't just look at the keywords on the pages themselves, but even more at the links pointing to a page and the keywords surrounding those links. The search engine optimization companies could modify the pages all they wanted, but could hardly influence the links pointing to the pages.
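The "links as votes" idea can be sketched in a few lines. What follows is a simplified textbook-style version of the published PageRank idea, not Google's actual ranking code; it ignores the keywords around the links entirely, and the damping value of 0.85, the iteration count and the tiny example web are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate link-based scores.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly over the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its vote over everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: two pages link to "guide"; nobody links to "spam".
links = {
    "home": ["guide"],
    "blog": ["guide"],
    "guide": ["home"],
    "spam": [],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

The "spam" page can stuff itself with as many keywords as it likes; with no incoming links it stays at the bottom, which is the property that kept the optimizers out for a while.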

But the growing importance of a good Google ranking made it worth it to go the extra mile. People started link-farms, websites that did nothing but link to other websites, thus increasing the PageRank of those sites. People started exchanging links just for the increased Google rank. And building pages became even more of an art - not only did you have to optimize the content, you now also had to pay attention to your links, incoming and outgoing.

Google is putting up a brave fight, but I think the end of the third era is nigh. It is hard to create a link-sphere around your site, but not impossible. In the end market forces will wreck the algorithm.

The next generation search engines will have to immunize themselves completely against search engine optimization. This seems very hard, but I see a few options:

  • Clicks on search engine results. If I see a page of returned results from Google, I can usually tell from the description whether a page is really relevant or just a bit. If it seems relevant, I click the link; otherwise I don't. A search engine could just check which pages are clicked more often for which search terms and push the ranking of those pages for those search terms up.

  • Page popularity. Google takes the links to a page as votes for that page. But that is rather indirect. Why not return the page that has the most visitors and is relevant for Amsterdam Travel (based on the content) when somebody searches for Amsterdam AND Travel?
    Sure, this will create a self-perpetuating situation, because popular pages will get the lion's share of search engine traffic, but pages with a lot of visitors are usually better than the ones with almost no traffic. And it is very hard to fake, at least much harder than links to a page.
    How would a search engine go about this? Well, they would need something on the computer of the user that could measure browsing habits, something like... the Google Toolbar. Makes you wonder.

  • Personal Search. If two people search for something using the same keywords, it is not necessarily the case that the same pages are the most relevant for both of them. Some people would like pages about the great museums of Amsterdam; for others the local tolerance of some substances is more of a reason to go. From clicking and searching, a profile could be built up, returning more specific results (and ads).
    This of course has tremendous privacy implications and might well not be acceptable to a lot of people, but it is an idea.
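The first option, ranking on clicks, is simple enough to sketch. This is only a rough illustration under my own assumptions, not something any search engine is known to do this way: the class name `ClickRanker`, the smoothing priors and the content scores in the example are all invented.

```python
from collections import defaultdict


class ClickRanker:
    """Re-rank results by combining a content score with observed click rates."""

    def __init__(self, prior_clicks=1.0, prior_views=10.0):
        # Counts are kept per (query, url) pair.
        self.views = defaultdict(int)
        self.clicks = defaultdict(int)
        # Priors (assumed values) keep new pages from scoring 0 or 1 outright.
        self.prior_clicks = prior_clicks
        self.prior_views = prior_views

    def record(self, query, shown_urls, clicked_url=None):
        """Log one results page: what was shown and what, if anything, was clicked."""
        for url in shown_urls:
            self.views[(query, url)] += 1
        if clicked_url is not None:
            self.clicks[(query, clicked_url)] += 1

    def click_rate(self, query, url):
        """Smoothed fraction of impressions that led to a click."""
        clicks = self.clicks.get((query, url), 0) + self.prior_clicks
        views = self.views.get((query, url), 0) + self.prior_views
        return clicks / views

    def rerank(self, query, content_scores):
        """Order URLs by content score multiplied by the smoothed click rate."""
        return sorted(
            content_scores,
            key=lambda url: content_scores[url] * self.click_rate(query, url),
            reverse=True,
        )


# Hypothetical example: an over-optimized page scores best on content alone.
query = "amsterdam travel"
content_scores = {"optimized.example": 1.0, "useful.example": 0.8}

ranker = ClickRanker()
before = ranker.rerank(query, content_scores)

# Searchers see both results 20 times and pick the useful page half the time.
for i in range(20):
    clicked = "useful.example" if i % 2 == 0 else None
    ranker.record(query, list(content_scores), clicked)

after = ranker.rerank(query, content_scores)
print(before)
print(after)
```

The same counters could be fed by visitor numbers instead of clicks for the popularity option, or keyed by user as well as query for personal search, with all the privacy questions that come with it.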

So will Google lose their spot as top dog? It is possible. MSFT is out there to conquer the search engine space, while AOL and Yahoo! still control a lot of eyeballs. And then there are the outsiders. Of course a new startup might develop startling new technology and sweep everything away, just like Google did before them. But I wouldn't rule out Amazon either. Amazon bought Alexa. Not only does Alexa have a large database of people's opinions about websites, they also measure actual browsing behavior. It would not be that hard to build a very competitive search engine around Alexa. Amazon has a history of expanding into other things.