Writes Ramesh Jain:
I feel that these engines are still following a dead-end path. They are pursuing an information-centric approach. In the information-centric approach, the model is that the user comes to the system with a precise question, and the system has the answer to that question, so it provides the answer. This simple model was good for early databases and even for early search systems. Now, when I type a keyword and the system firehoses me with a list of 5 million items listed 10 to a page, what am I to do except get frustrated? Advanced queries are each a new query and, as every normal user knows, they are not much help. So what is fundamentally wrong with these search approaches? Here are some things that I think are fundamental to search in the rapidly expanding cyberspace, where current search engines are starting to choke and are ultimately likely to become useless, opening up the space for a fresh approach by a graduate student, or a start-up, working somewhere maybe even today.
1. The engines have a simple model of information based on keywords. The use of ontological filtering to reduce the list helps but does not solve the real problem. The real problem is that a set of keywords is not semantically rich enough to express the context in which the search is being done.
2. In most cases people have a multidimensional search in mind. The simplest and most common example is seen in LBS (Location Based Services), where a keyword must be combined with location to provide the context. This then gives a more meaningful answer to the user. Now, LBS is useful not only in the context of mobile phones. The main point is providing dimensionality to search. Dimensionality provides different constraints that further help in expressing context. Current search engines primarily store a link to the sources containing the keyword. This model is too simplistic to represent and match the relations that lead to context. By developing more rigorous models and extracting and storing all essential information, search can be more contextual.
3. The search environment based on the current approach of specifying keywords and getting a list of pointers in return does not scale up. People are interested in solutions to their problems, not in knowing that 5 million pages on the web contain their keywords. The environment should be exploration-based rather than query-based. In an exploration-based environment, I will have the freedom to slice and dice my results in many different ways to explore and find what I am looking for. Also, this exploration must be multidimensional.
4. The current approach of showing a list is too primitive. We never like to look at a table of stock values over time; we look at charts. Two-dimensional data (stock value and time), when reduced to a linear representation, theoretically contains the same information but in practice becomes useless as the number of data items increases. After about 100 items, it is for all practical purposes completely useless. We need to develop powerful visualization mechanisms. This is not a simple point. Mathematically, we know that when two-dimensional information is projected onto a one-dimensional space (extend this to higher dimensions), there is loss of information. The same is true for the human mind; in fact, it is even more true for the human mind.
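To make points 2 and 3 a little more concrete, here is a minimal Python sketch of what "adding a dimension" to a keyword search might look like. It is an illustration only, not anything Jain proposes: the `Doc` record, the sample documents, their coordinates, and the `search` and `facet_counts` helpers are all hypothetical. The idea is simply that a keyword match can be narrowed by a location constraint, and that the remaining hits can be sliced by another attribute instead of being returned as one flat list.

```python
from collections import Counter
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Doc:
    # Hypothetical record: a real engine would store far richer metadata.
    title: str
    text: str
    lat: float
    lon: float
    category: str


def haversine_km(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in kilometres (spherical Earth)."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def search(docs, keyword, near=None, radius_km=None, category=None):
    """Keyword match, optionally constrained by location and by category."""
    hits = [d for d in docs if keyword.lower() in d.text.lower()]
    if near is not None and radius_km is not None:
        hits = [d for d in hits
                if haversine_km(near[0], near[1], d.lat, d.lon) <= radius_km]
    if category is not None:
        hits = [d for d in hits if d.category == category]
    return hits


def facet_counts(hits):
    """Count hits per category so the user can slice the result set."""
    return Counter(d.category for d in hits)


# Made-up sample data, with rough coordinates for illustration.
DOCS = [
    Doc("Corner pizzeria", "wood-fired pizza restaurant", 37.80, -122.41, "restaurant"),
    Doc("Pizza history", "a history of pizza in Naples", 40.85, 14.27, "article"),
    Doc("Dough recipe", "how to make pizza dough at home", 37.87, -122.27, "recipe"),
    Doc("Bridge guide", "visiting the bridge on foot", 37.82, -122.48, "article"),
]

keyword_only = search(DOCS, "pizza")
nearby = search(DOCS, "pizza", near=(37.77, -122.42), radius_km=30)
nearby_restaurants = search(DOCS, "pizza", near=(37.77, -122.42),
                            radius_km=30, category="restaurant")
```

Here `keyword_only` is the familiar flat list, `nearby` applies the location dimension, and `facet_counts(nearby)` gives the per-category breakdown a user could explore. A linear scan like this would not scale to a real index; it only shows the shape of a multidimensional query.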