There have been three generations of search engines in the Internet's first mass-usage decade. The first was actually Yahoo's directory, with sites handpicked by editors. This was fine as long as the number of websites wasn't very large. As the Web grew, the limitations of the directory approach became apparent. Along came AltaVista, which used a crawler to fetch web pages and ran indexing algorithms on them. This allowed for keyword-based searching. This era lasted until smart webmasters figured out ways to get their pages to show up at the top of the results list by artificially inserting words into their pages.
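To see why keyword-based ranking was so easy to game, here is a minimal sketch in Python. It is not AltaVista's actual algorithm: it assumes a toy index that simply counts how often each word appears on a page and ranks pages by that count. The page names and text are made up for illustration.

```python
from collections import Counter, defaultdict

def build_index(pages):
    """Map each word to a Counter of {page name: number of occurrences}."""
    index = defaultdict(Counter)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word][name] += 1
    return index

def search(index, keyword):
    """Return page names ranked by how often the keyword occurs on them."""
    counts = index.get(keyword.lower(), Counter())
    return [name for name, _ in counts.most_common()]

# Hypothetical pages: one honest, one stuffed with a repeated keyword.
pages = {
    "honest.html": "cheap flights to delhi and mumbai",
    "stuffed.html": "flights flights flights flights buy watches here",
}
index = build_index(pages)
print(search(index, "flights"))
```

Under this naive scoring, the stuffed page outranks the honest one for "flights" simply because it repeats the word more often, which is roughly the loophole webmasters exploited.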
This problem was addressed to a significant extent when Google launched its search engine using PageRank technology, which ranked pages based on incoming links, a measure of authority. This immediately improved the relevance of the results. While there have been some incremental modifications, for the most part PageRank serves as the base on which most of today's leading search engines have been built. From Yahoo to AltaVista to Google, the focus has been on providing the most relevant results in the quickest possible time to information-hungry users.
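The idea of ranking by incoming links can be sketched in a few lines of Python. This is the simplified textbook form of PageRank, not Google's production system: each page repeatedly passes a share of its score to the pages it links to. The damping factor of 0.85, the iteration count, the handling of pages with no outgoing links, and the four-page link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by incoming links.

    links maps each page to the list of pages it links to.
    Returns a dict of {page: score}; the scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its
                # score evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link graph: three pages point at hub.html.
links = {
    "a.html": ["hub.html"],
    "b.html": ["hub.html"],
    "c.html": ["hub.html", "a.html"],
    "hub.html": ["a.html"],
}
ranks = pagerank(links)
print(max(ranks, key=ranks.get))
```

Here hub.html comes out on top because it collects the most incoming links. Stuffing extra words into b.html or c.html would not change their scores at all, which is why link-based ranking blunted the keyword tricks of the earlier era.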
In the five years or so since Google's launch, there have been plenty of new developments in the world and the Web around us. As we think about next-generation search, it is important to understand the changing nature of information and its usage, so that we can build a new model that offers insights into the characteristics of next-generation search engines.
The five most important developments in recent times have been: user-generated content, RSS, mobile phones, broadband and internationalisation. We will look at each of these.
1. User-Generated Content
For much of our history, content has been created by a few for consumption by many. This has been because access to the tools for content creation and the mechanisms of distribution has been limited. The Internet changed the economics of distribution: anyone could use its global reach to disseminate content. But the tools for content creation were still not easy enough for mass-market usage.
That has now started to change. From do-it-yourself publishing via weblogs to image capture via digital cameras and mobile phones, new content is now being created by millions. While the earlier model was that of a few creating for many, it is now many creating for a few. The blog that I create or the photos that you take may reach only a very small set of people, but they are people who are important to us.
The latest meme in user-generated content is Podcasting. The New York Times wrote recently: “[P]odcast [is] a kind of recording that, thanks to a technology barely six months old, anyone can make on a computer and then post to a Web site, where it can be downloaded to an iPod or any MP3 player to be played at the listener’s leisure... Podcasts are a little like reality television, a little like Wayne’s World, and are often likened to TiVo, which allows television watchers to download only the programs they want to watch and to skip advertising, for radio or blogs but spoken... And as bloggers have influenced journalism, podcasters have the potential to transform radio.”
Another interesting bottom-up example of user-generated content is the tagging that sites like Del.icio.us and Flickr are supporting. Users can tag any kind of content and then share it with others. Micro Persuasion wrote: “Tags are a natural complement to search because they empower users to create structures that organize unstructured consumer-generated media.”
Tomorrow: What's Changing (continued)